* Endness metadata "Single Read" changed to "Single End" in data-hub, pipeline updated to handle (#110) ("Single Read" still acceptable for backwards compatibility)
* Strandedness metadata "yes"/"no" changed to boolean "t"/"f" in data-hub, pipeline updated to handle (#70) ("yes"/"no" still acceptable for backwards compatibility)
* Upload empty mRNA_QC entry if data error (#111)
* Add memory limit (75%) per thread for samtools sort (#108) (see the sketch after this list)
* Remove parsing restrictions for submitted stranded/spike/species (#105, #106)
* Pass unidentified ends through instead of overwriting them as unknown
* Move fastqc process before trim to catch fastq errors (#107)
* Only use fastqs that match the *[_.]R[1-2].fastq.gz naming convention (#107) (see the filename-matching sketch after this list)
* Update input bag export config to only fetch fastqs that match the *[_.]R[1-2].fastq.gz naming convention
* Remove check for multiple fastqs in parse metadata (redundant and no longer valid)
* Don't use file.csv from inputBag to parse manual endness; use the value counted by getData instead
* Restrict sampled alignment process to use >32GB nodes on BioHPC (#108)
* Use `nproc - 1` for alignment processes (#108)
* Data-hub column title change from "Sequencing_Type" to "Experiment_Type" (#114)
* Data-hub column title change from "Has_Strand_Specific_Information" to "Strandedness" (#115)
* Merge data error pre-inference execution run upload/finalize to 1 process
* Change uploadOutputBag logic to reuse hatrac file if it already exists (re-uses Output_Bag entry by reassigning Execution_Run RID) (#112)
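
A minimal sketch of the kind of per-thread memory cap described in the samtools sort entry above; the helper, file names, and resource figures are hypothetical, and the pipeline's real limits live in its process configuration:

```python
def samtools_sort_cmd(bam_in, bam_out, threads, total_mem_gb, fraction=0.75):
    """Build a samtools sort command line with a per-thread memory cap.

    Hypothetical helper: only the 75% budget split across threads comes from the
    changelog entry. samtools sort's -@ (threads) and -m (memory per thread)
    options are standard samtools flags.
    """
    mem_per_thread_mb = int(total_mem_gb * 1024 * fraction / threads)
    return (
        f"samtools sort -@ {threads} -m {mem_per_thread_mb}M "
        f"-o {bam_out} {bam_in}"
    )

# Example: 8 threads on a 32 GB allocation -> 3072M per thread
print(samtools_sort_cmd("aligned.bam", "sorted.bam", threads=8, total_mem_gb=32))
```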
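
A minimal sketch of filtering inputs against the `*[_.]R[1-2].fastq.gz` naming convention quoted above; the helper and example file names are illustrative, not the pipeline's actual code:

```python
from fnmatch import fnmatch

# Glob pattern from the changelog entries above
FASTQ_PATTERN = "*[_.]R[1-2].fastq.gz"

def filter_fastqs(filenames):
    """Keep only files that follow the R1/R2 naming convention (hypothetical helper)."""
    return [name for name in filenames if fnmatch(name, FASTQ_PATTERN)]

print(filter_fastqs([
    "sample_R1.fastq.gz",   # kept
    "sample.R2.fastq.gz",   # kept
    "sample_I1.fastq.gz",   # dropped: index read, not R1/R2
    "readme.txt",           # dropped: not a fastq
]))
```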
*Known Bugs*
* Override params (inputBag, fastq, species) aren't checked for integrity
* Authentication files and tokens must be active (active auth client) for the duration of the pipeline run (until long-lived token utilization included)
* Check for outputBag in hatrac doesn't check for any uploaded by chaise

<hr>
**User Facing**
**Background**
* Fix spelling in config file for the process that uploads the error message on failed fastq (#104)
*Known Bugs*
* Override params (inputBag, fastq, species) aren't checked for integrity
* Authentication files and tokens must be active (active auth client) for the duration of the pipeline run (until long-lived token utilization included)
<hr>
**User Facing**
**Background**
* Split non-metadata mismatch error handling process into two: one to handle fastq errors and one for species errors (BUG FIX #101)
* Add known errors to integration CI tests (ambiguous species, truncated fastq, R1/R2 mismatch) (#103)
* Fix pre-execution run failing to upload the execution run RID to the tracking site (#96, #97)
* Change CI replicate count badge to count all execution runs that match the major version
*Known Bugs*
* Override params (inputBag, fastq, species) aren't checked for integrity
* Authentication files and tokens must be active (active auth client) for the duration of the pipeline run (until long-lived token utilization included)
<hr>
# v1.0.0
* Change consistency test to check if within +/- 5% of the standard (see the sketch below)
* Change tool version checker for badges to use latest tag
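
A minimal sketch of the +/- 5% tolerance check described above, assuming the test compares a reported value against a stored standard; names and numbers are illustrative, not the pipeline's actual code:

```python
def within_tolerance(value, standard, tolerance=0.05):
    """Return True if value lies within +/- tolerance (default 5%) of the standard."""
    return abs(value - standard) <= tolerance * abs(standard)

# Illustrative values: 4.8% off passes, 10% off fails
print(within_tolerance(10_480_000, 10_000_000))  # True
print(within_tolerance(11_000_000, 10_000_000))  # False
```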
*Known Bugs*
* Override params (inputBag, fastq, species) aren't checked for integrity
* Authentication files and tokens must be active (active auth client) for the duration of the pipeline run (until long-lived token utilization included)
* Add option to send email on workflow error, with pipeline error message
* Add versions and paper references of the software used to the report
* Update references to use bags
* Update to newer references (GRCh38.p13.v36 and GRCm38.p6.vM25)
* Use production server for data-hub reference call
* Error on unexpected fastq count (>2; paired-end != 2; single-end != 1) (see the sketch after this list)
* Error if paired-end and line count of R1 != R2
* Error if ambiguous species inference
* Remove non-fastq files from the inputBag at the export bag config level
* Remove (comment out) option to pull references from S3
* Make pull references from BioHPC default (including in biohpc.config)
* Start using new gudmaprbk dockerhub (images autobuilt)

* Changed order of steps so that fastqc is done after the trim step
* Only calculate/report tin values on regular chromosomes (from gtf)
* Change inputBag fetch to manifest-then-validate (if validation fails, fetch missing files and revalidate, up to 3 times)
* Retry getData and trimData processes up to once
* Make inputBag export config to create inputBag with only small txt file for CI unit test of getData (and update test)
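
A minimal sketch of the fastq sanity checks listed earlier in this section (error on more than 2 fastqs, a count that contradicts the declared endness, or mismatched R1/R2 line counts); the gzip line counting and error messages are illustrative, not the pipeline's actual code:

```python
import gzip

def count_lines(fastq_gz):
    """Count lines in a gzipped fastq (illustrative helper)."""
    with gzip.open(fastq_gz, "rt") as handle:
        return sum(1 for _ in handle)

def check_fastqs(fastqs, endness):
    """Raise ValueError if the fastq count or R1/R2 line counts contradict the metadata.

    endness is assumed to be "se" (single end) or "pe" (paired end).
    """
    if len(fastqs) > 2:
        raise ValueError("more than 2 fastqs submitted")
    if endness == "se" and len(fastqs) != 1:
        raise ValueError(f"single end but {len(fastqs)} fastq(s) submitted")
    if endness == "pe":
        if len(fastqs) != 2:
            raise ValueError(f"paired end but {len(fastqs)} fastq(s) submitted")
        if count_lines(fastqs[0]) != count_lines(fastqs[1]):
            raise ValueError("R1 and R2 line counts differ")
```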
*Known Bugs*
* Override params (inputBag, fastq, species) aren't checked for integrity
<hr>
# v0.0.3
**User Facing**
* TPM table:
  * Add Ensembl Gene ID
  * Rename columns: *GENCODE_Gene_Symbol*, *Ensembl_GeneID*, *NCBI_GeneID*
* Run table: *Session ID* and *Pipeline Version*
* Reference Table: *Species*, *Genome Reference Consortium Build*, *Genome Reference Consortium Patch*, *GENCODE Annotation Release* (outputs both human and mouse versions)
* Add inputBag override param (`inputBagForce`) [`*.zip`]
  * Uses provided inputBag instead of downloading from data-hub
  * Still requires matching repRID input param
* Add fastq override param (`fastqsForce`) [`R1`,`R2`]
  * Uses provided fastq instead of downloading from data-hub
  * Still requires matching repRID input param and will pull inputBag from data-hub to access submitted metadata for reporting
* Add species override param (`speciesForce`) [`Mus musculus` or `Homo sapiens`]
  * Forces the use of the provided species
**Background**
* Add GeneSymbol/EnsemblID/EntrezID translation files to references
*Known Bugs*
* outputBag does not contain fetch for processed data
* Does not include automatic data upload
* Override params (inputBag, fastq, species) aren't checked for integrity
<hr>

# v0.0.2
* Remove gene details from tpm table
* Add EntrezID translation to tpm table (from version specific reference) (see the sketch below)
* Add GeneSymbol/EnsemblID/EntrezID translation files to references
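
A minimal sketch of the kind of gene-ID translation merge described above, using pandas; the file names and the assumption that both tables share an Ensembl gene ID column are illustrative, not the pipeline's actual code:

```python
import pandas as pd

# Hypothetical inputs: the per-replicate TPM table and the version-specific
# Ensembl -> NCBI (Entrez) gene ID translation file shipped with the reference.
tpm = pd.read_csv("repRID.tpm.csv")
xref = pd.read_csv("geneID.translation.csv")  # columns assumed: Ensembl_GeneID, NCBI_GeneID

# Left-join so every gene in the TPM table keeps its row even without a translation
tpm_annotated = tpm.merge(xref, on="Ensembl_GeneID", how="left")
tpm_annotated.to_csv("repRID.tpm.annotated.csv", index=False)
```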
*Known Bugs*
* outputBag does not contain fetch for processed data
* Does not include automatic data upload
<hr>
# v0.0.1
**INITIAL BETA VERSION**\
Does not include automatic data upload\
This version is for initial upload of test data to GUDMAP/RBK data-hub for internal integration