diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml index a6a8735dda09768e301a41fbabb4c236e18dd2f6..68b4d37b534a3046ac75207bc3c679b62d8c85f6 100644 --- a/.gitlab-ci.yml +++ b/.gitlab-ci.yml @@ -34,7 +34,7 @@ img_cache: stage: singularity script: - mkdir -p ${dir}cache/ - - cat nextflow.config | grep -oP "container = \K.*" | tr -d "'" | sort | uniq | xargs -P 3 -I {} bash -c "singularity pull --dir ${dir} 'docker://'{} || true" + - cat nextflow.config | grep -oP "container = \K.*" | tr -d "'" | sort | uniq | xargs -P 1 -I {} bash -c "singularity pull --dir ${dir} 'docker://'{} || true" - wait - echo images cached diff --git a/CHANGELOG.md b/CHANGELOG.md index b141ec11c50def895aaf1a98076e860fec6e473d..09b6f43621b0b817a04e85514e6d728ef3d8e6f4 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,7 @@ * Modify repository structure to allow for use with XPACK-DNANEXUS * Add override for endness * Add seqtk to references +* Update software versions to latest (containers) **Background** * Add memory limit (75%) per thread for samtools sort (#108) diff --git a/docs/dag.png b/docs/dag.png old mode 100755 new mode 100644 index b9bcdfe73831dc13544df3a33feb008ac9aed269..1e22ca0930d25af3fa6ab134413b4a241d4cbdea Binary files a/docs/dag.png and b/docs/dag.png differ diff --git a/docs/references.md b/docs/references.md index 54b83b5ebe5fe38f0f6d4b38fee4279f9af5898c..3aa5e67f4b5a5bf680fe88e2f4e5d8e2a4b67f62 100644 --- a/docs/references.md +++ b/docs/references.md @@ -4,28 +4,28 @@ * Anaconda (Anaconda Software Distribution, [https://anaconda.com](https://anaconda.com)) 2. **DERIVA**: - * Bugacov, A., Czajkowski, K., Kesselman, C., Kumar, A., Schuler, R. E. and Tangmunarunkit, H. 2017 Experiences with DERIVA: An Asset Management Platform for Accelerating eScience. IEEE 13th International Conference on e-Science (e-Science), Auckland, 2017, pp. 79-88, doi:[10.1109/eScience.2017.20](https://doi.org/10.1109/eScience.2017.20). 
+ * Bugacov, A., Czajkowski, K., Kesselman, C., Kumar, A., Schuler, R. E., & Tangmunarunkit, H. (2017, October). Experiences with DERIVA: An asset management platform for accelerating eScience. In 2017 IEEE 13th International Conference on e-Science (e-Science) (pp. 79-88). IEEE. doi:[10.1109/eScience.2017.20](https://doi.org/10.1109/eScience.2017.20). 3. **BDBag**: - * D'Arcy, M., Chard, K., Foster, I., Kesselman, C., Madduri, R., Saint, N., & Wagner, R.. 2019. Big Data Bags: A Scalable Packaging Format for Science. Zenodo. doi:[10.5281/zenodo.3338725](http://doi.org/10.5281/zenodo.3338725). + * Madduri, R., Chard, K., D’Arcy, M., Jung, S. C., Rodriguez, A., Sulakhe, D., ... & Foster, I. (2019). Reproducible big data science: A case study in continuous FAIRness. PloS one, 14(4), e0213013. doi:[10.1371/journal.pone.0213013](https://doi.org/10.1371/journal.pone.0213013). 4. **trimgalore**: * trimgalore [https://github.com/FelixKrueger/TrimGalore](https://github.com/FelixKrueger/TrimGalore) 5. **hisat2**: - * Kim ,D.,Paggi, J.M., Park, C., Bennett, C., Salzberg, S.L. 2019 Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. Aug;37(8):907-915. doi:[10.1038/s41587-019-0201-4](https://doi.org/10.1038/s41587-019-0201-4). + * Kim, D., Paggi, J. M., Park, C., Bennett, C., & Salzberg, S. L. (2019). Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nature biotechnology, 37(8), 907-915. doi:[10.1038/s41587-019-0201-4](https://doi.org/10.1038/s41587-019-0201-4). 6. **samtools**: - * Li H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, and 1000 Genome Project Data Processing Subgroup. 2009. The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25: 2078-9. doi:[10.1093/bioinformatics/btp352](http://dx.doi.org/10.1093/bioinformatics/btp352) + * Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., ... & Durbin, R. (2009). 
The sequence alignment/map format and SAMtools. Bioinformatics, 25(16), 2078-2079. doi:[10.1093/bioinformatics/btp352](http://dx.doi.org/10.1093/bioinformatics/btp352) 7. **picard**: * “Picard Toolkit.” 2019. Broad Institute, GitHub Repository. [http://broadinstitute.github.io/picard/](http://broadinstitute.github.io/picard/); Broad Institute 8. **featureCounts**: - * Liao, Y., Smyth, G.K., Shi, W. 2014 featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. Apr 1;30(7):923-30. doi:[10.1093/bioinformatics/btt656](https://doi.org/10.1093/bioinformatics/btt656). + * Liao, Y., Smyth, G. K., & Shi, W. (2014). featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics, 30(7), 923-930. doi:[10.1093/bioinformatics/btt656](https://doi.org/10.1093/bioinformatics/btt656). 9. **deeptools**: - * Ramírez, F., D. P. Ryan, B. Grüning, V. Bhardwaj, F. Kilpert, A. S. Richter, S. Heyne, F. Dündar, and T. Manke. 2016. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Research 44: W160-165. doi:[10.1093/nar/gkw257](http://dx.doi.org/10.1093/nar/gkw257) + * Ramírez, F., Ryan, D. P., Grüning, B., Bhardwaj, V., Kilpert, F., Richter, A. S., ... & Manke, T. (2016). deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic acids research, 44(W1), W160-W165. doi:[10.1093/nar/gkw257](http://dx.doi.org/10.1093/nar/gkw257) 10. **Seqtk**: * Seqtk [https://github.com/lh3/seqtk](https://github.com/lh3/seqtk) @@ -37,13 +37,13 @@ * FastQC [https://www.bioinformatics.babraham.ac.uk/projects/fastqc/](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) 13. **SeqWho** - * SeqWho [https://git.biohpc.swmed.edu/s181649/seqwho](https://git.biohpc.swmed.edu/s181649/seqwho) + * Bennett, C., Thornton, M., Park, C., Henry, G., Zhang, Y., Malladi, V. S., & Kim, D. (2021). 
SeqWho: Reliable, rapid determination of sequence file identity using k-mer frequencies. bioRxiv, 2021.2003.2010.434827. doi:[10.1101/2021.03.10.434827](https://doi.org/10.1101/2021.03.10.434827) 14. **RSeQC**: * Wang, L., Wang, S., Li, W. 2012 RSeQC: quality control of RNA-seq experiments. Bioinformatics. Aug 15;28(16):2184-5. doi:[10.1093/bioinformatics/bts356](https://doi.org/10.1093/bioinformatics/bts356). 15. **MultiQC**: - * Ewels P., Magnusson M., Lundin S. and Käller M. 2016. MultiQC: Summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32(19): 3047–3048. doi:[10.1093/bioinformatics/btw354](https://dx.doi.org/10.1093/bioinformatics/btw354) + * Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics, 32(19), 3047-3048. doi:[10.1093/bioinformatics/btw354](https://dx.doi.org/10.1093/bioinformatics/btw354) 16. **Nextflow**: - * Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., and Notredame, C. 2017. Nextflow enables reproducible computational workflows. Nature biotechnology, 35(4), 316. \ No newline at end of file + * Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature biotechnology, 35(4), 316-319. \ No newline at end of file diff --git a/docs/software_references_mqc.yaml b/docs/software_references_mqc.yaml index ac14c454cb40ac795d984c5b6c6a5970c56bd007..825f21fc153d952ebe7b654936f8eb9af76ae302 100644 --- a/docs/software_references_mqc.yaml +++ b/docs/software_references_mqc.yaml @@ -16,58 +16,58 @@ <li><strong>DERIVA</strong>:</li> </ol> <ul> - <li>Bugacov, A., Czajkowski, K., Kesselman, C., Kumar, A., Schuler, R. E. and Tangmunarunkit, H. 2017 Experiences with DERIVA: An Asset Management Platform for Accelerating eScience. 
IEEE 13th International Conference on e-Science (e-Science), Auckland, 2017, pp. 79-88, doi:<a href="https://doi.org/10.1109/eScience.2017.20">10.1109/eScience.2017.20</a>.</li> + <li>Bugacov, A., Czajkowski, K., Kesselman, C., Kumar, A., Schuler, R. E., & Tangmunarunkit, H. (2017, October). Experiences with DERIVA: An asset management platform for accelerating eScience. In 2017 IEEE 13th International Conference on e-Science (e-Science) (pp. 79-88). IEEE. doi:<a href="https://doi.org/10.1109/eScience.2017.20">10.1109/eScience.2017.20</a>.</li> </ul> <ol start="3" style="list-style-type: decimal"> <li><strong>BDBag</strong>:<br /> </li> </ol> <ul> - <li>D'Arcy, M., Chard, K., Foster, I., Kesselman, C., Madduri, R., Saint, N., & Wagner, R.. 2019. Big Data Bags: A Scalable Packaging Format for Science. Zenodo. doi:<a href="http://doi.org/10.5281/zenodo.3338725">10.5281/zenodo.3338725</a>.</li> + <li>Madduri, R., Chard, K., D’Arcy, M., Jung, S. C., Rodriguez, A., Sulakhe, D., ... & Foster, I. (2019). Reproducible big data science: A case study in continuous FAIRness. PloS one, 14(4), e0213013. doi:<a href="https://doi.org/10.1371/journal.pone.0213013">10.1371/journal.pone.0213013</a>.</li> </ul> - <ol start="5" style="list-style-type: decimal"> + <ol start="4" style="list-style-type: decimal"> <li><strong>trimgalore</strong>:</li> </ol> <ul> <li>trimgalore <a href="https://github.com/FelixKrueger/TrimGalore" class="uri">https://github.com/FelixKrueger/TrimGalore</a></li> </ul> - <ol start="6" style="list-style-type: decimal"> + <ol start="5" style="list-style-type: decimal"> <li><strong>hisat2</strong>:</li> </ol> <ul> - <li>Kim ,D.,Paggi, J.M., Park, C., Bennett, C., Salzberg, S.L. 2019 Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. Aug;37(8):907-915. doi:<a href="https://doi.org/10.1038/s41587-019-0201-4">10.1038/s41587-019-0201-4</a>.</li> + <li>Kim, D., Paggi, J. M., Park, C., Bennett, C., & Salzberg, S. L. (2019). 
Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nature biotechnology, 37(8), 907-915. doi:<a href="https://doi.org/10.1038/s41587-019-0201-4">10.1038/s41587-019-0201-4</a>.</li> </ul> - <ol start="7" style="list-style-type: decimal"> + <ol start="6" style="list-style-type: decimal"> <li><strong>samtools</strong>:</li> </ol> <ul> - <li>Li H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, and 1000 Genome Project Data Processing Subgroup. 2009. The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25: 2078-9. doi:<a href="http://dx.doi.org/10.1093/bioinformatics/btp352">10.1093/bioinformatics/btp352</a></li> + <li>Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., ... & Durbin, R. (2009). The sequence alignment/map format and SAMtools. Bioinformatics, 25(16), 2078-2079. doi:<a href="http://dx.doi.org/10.1093/bioinformatics/btp352">10.1093/bioinformatics/btp352</a></li> </ul> - <ol start="8" style="list-style-type: decimal"> + <ol start="7" style="list-style-type: decimal"> <li><strong>picard</strong>:</li> </ol> <ul> <li>“Picard Toolkit.” 2019. Broad Institute, GitHub Repository. <a href="http://broadinstitute.github.io/picard/" class="uri">http://broadinstitute.github.io/picard/</a>; Broad Institute</li> </ul> - <ol start="9" style="list-style-type: decimal"> + <ol start="8" style="list-style-type: decimal"> <li><strong>featureCounts</strong>:</li> </ol> <ul> - <li>Liao, Y., Smyth, G.K., Shi, W. 2014 featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. Apr 1;30(7):923-30. doi:<a href="https://doi.org/10.1093/bioinformatics/btt656">10.1093/bioinformatics/btt656</a>.</li> + <li>Liao, Y., Smyth, G. K., & Shi, W. (2014). featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics, 30(7), 923-930. 
doi:<a href="https://doi.org/10.1093/bioinformatics/btt656">10.1093/bioinformatics/btt656</a>.</li> </ul> - <ol start="11" style="list-style-type: decimal"> + <ol start="9" style="list-style-type: decimal"> <li><strong>deeptools</strong>:</li> </ol> <ul> - <li>Ramírez, F., D. P. Ryan, B. Grüning, V. Bhardwaj, F. Kilpert, A. S. Richter, S. Heyne, F. Dündar, and T. Manke. 2016. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Research 44: W160-165. doi:<a href="http://dx.doi.org/10.1093/nar/gkw257">10.1093/nar/gkw257</a></li> + <li>Ramírez, F., Ryan, D. P., Grüning, B., Bhardwaj, V., Kilpert, F., Richter, A. S., ... & Manke, T. (2016). deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic acids research, 44(W1), W160-W165. doi:<a href="http://dx.doi.org/10.1093/nar/gkw257">10.1093/nar/gkw257</a></li> </ul> - <ol start="11" style="list-style-type: decimal"> + <ol start="10" style="list-style-type: decimal"> <li><strong>Seqtk</strong>:</li> </ol> <ul> <li>Seqtk <a href="https://github.com/lh3/seqtk" class="uri">https://github.com/lh3/seqtk</a></li> </ul> - <ol start="10" style="list-style-type: decimal"> + <ol start="11" style="list-style-type: decimal"> <li><strong>R</strong>:</li> </ol> <ul> @@ -79,27 +79,27 @@ <ul> <li>FastQC <a href="https://www.bioinformatics.babraham.ac.uk/projects/fastqc/" class="uri">https://www.bioinformatics.babraham.ac.uk/projects/fastqc/</a></li> </ul> - <ol start="12" style="list-style-type: decimal"> + <ol start="13" style="list-style-type: decimal"> <li><strong>SeqWho</strong></li> </ol> <ul> - <li>SeqWho <a href="https://git.biohpc.swmed.edu/s181649/seqwho" class="uri">https://git.biohpc.swmed.edu/s181649/seqwho</a></li> + <li>Bennett, C., Thornton, M., Park, C., Henry, G., Zhang, Y., Malladi, V. S., & Kim, D. (2021). SeqWho: Reliable, rapid determination of sequence file identity using k-mer frequencies. bioRxiv, 2021.2003.2010.434827. 
doi:<a href="https://doi.org/10.1101/2021.03.10.434827">10.1101/2021.03.10.434827</a></li> </ul> - <ol start="4" style="list-style-type: decimal"> + <ol start="14" style="list-style-type: decimal"> <li><strong>RSeQC</strong>:</li> </ol> <ul> <li>Wang, L., Wang, S., Li, W. 2012 RSeQC: quality control of RNA-seq experiments. Bioinformatics. Aug 15;28(16):2184-5. doi:<a href="https://doi.org/10.1093/bioinformatics/bts356">10.1093/bioinformatics/bts356</a>.</li> </ul> - <ol start="13" style="list-style-type: decimal"> + <ol start="15" style="list-style-type: decimal"> <li><strong>MultiQC</strong>:</li> </ol> <ul> - <li>Ewels P., Magnusson M., Lundin S. and Käller M. 2016. MultiQC: Summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32(19): 3047–3048. doi:<a href="https://dx.doi.org/10.1093/bioinformatics/btw354">10.1093/bioinformatics/btw354</a></li> + <li>Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics, 32(19), 3047-3048. doi:<a href="https://dx.doi.org/10.1093/bioinformatics/btw354">10.1093/bioinformatics/btw354</a></li> </ul> - <ol start="14" style="list-style-type: decimal"> + <ol start="16" style="list-style-type: decimal"> <li><strong>Nextflow</strong>:</li> </ol> <ul> - <li>Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., and Notredame, C. 2017. Nextflow enables reproducible computational workflows. Nature biotechnology, 35(4), 316.</li> + <li>Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. 
Nature biotechnology, 35(4), 316-319.</li> </ul> diff --git a/docs/software_versions_mqc.yaml b/docs/software_versions_mqc.yaml index 61c00f4b803d5da66cd19b930a1a15f74fa521ea..06585acbc9964a76d4ba1dff62a4d4464b73bd8e 100644 --- a/docs/software_versions_mqc.yaml +++ b/docs/software_versions_mqc.yaml @@ -7,20 +7,20 @@ data: | <dl class="dl-horizontal"> - <dt>Python</dt><dd>v3.8.3</dd> - <dt>DERIVA</dt><dd>v1.4.3</dd> - <dt>BDBag</dt><dd>v1.5.6</dd> - <dt>Trim Galore!</dt><dd>v0.6.4_dev</dd> + <dt>Python</dt><dd>v3.8.5</dd> + <dt>DERIVA</dt><dd>v1.4.5</dd> + <dt>BDBag</dt><dd>v1.6.0</dd> + <dt>Trim Galore!</dt><dd>v0.6.6</dd> <dt>HISAT2</dt><dd>v2.2.1</dd> <dt>Samtools</dt><dd>v1.11</dd> - <dt>picard (MarkDuplicates)</dt><dd>v2.23.9</dd> + <dt>picard (MarkDuplicates)</dt><dd>v2.25.0</dd> <dt>featureCounts</dt><dd>v2.0.1</dd> <dt>deepTools</dt><dd>v3.5.0</dd> <dt>Seqtk</dt><dd>v1.3-r106</dd> - <dt>R</dt><dd>v4.0.3</dd> + <dt>R</dt><dd>v4.0.4</dd> <dt>FastQC</dt><dd>v0.11.9</dd> <dt>SeqWho</dt><dd>vBeta-1.0.0</dd> <dt>RSeQC</dt><dd>v4.0.0</dd> - <dt>MultiQC</dt><dd>v1.9</dd> + <dt>MultiQC</dt><dd>v1.10</dd> <dt>Pipeline Version</dt><dd>v2.0.0rc01</dd> </dl> diff --git a/nextflow.config b/nextflow.config index f59068528cbbebf0e58855b6ad385e011a1eca0e..2d00ca96f843b3f8256d7fb3ea084ab08833cdaa 100644 --- a/nextflow.config +++ b/nextflow.config @@ -26,91 +26,91 @@ profiles { process { withName:trackStart { - container = 'gudmaprbk/gudmap-rbk_base:1.0.0' + container = 'gudmaprbk/gudmap-rbk_base:1.0.1' } withName:getBag { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:getData { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:parseMetadata { - container = 'gudmaprbk/python3:1.0.0' + container = 'gudmaprbk/python3:1.0.1' } withName:getRefERCC { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:getRef { - container = 
'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:fastqc { - container = 'gudmaprbk/fastqc0.11.9:1.0.0' + container = 'gudmaprbk/fastqc0.11.9:1.0.1' } withName:seqwho { - container = 'gudmaprbk/seqwho0.0.1:1.0.0' + container = 'gudmaprbk/seqwho1.0.0:1.0.0' } withName:trimData { - container = 'gudmaprbk/trimgalore0.6.5:1.0.0' + container = 'gudmaprbk/trimgalore0.6.6:1.0.0' } withName:downsampleData { - container = 'gudmaprbk/seqtk1.3:1.0.0' + container = 'gudmaprbk/seqtk1.3:1.0.1' } withName:alignSampleDataERCC { - container = 'gudmaprbk/hisat2.2.1:1.0.0' + container = 'gudmaprbk/hisat2.2.1:1.0.1' } withName:alignSampleData { - container = 'gudmaprbk/hisat2.2.1:1.0.0' + container = 'gudmaprbk/hisat2.2.1:1.0.1' } withName:inferMetadata { - container = 'gudmaprbk/rseqc4.0.0:1.0.0' + container = 'gudmaprbk/rseqc4.0.0:1.0.1' } withName:checkMetadata { - container = 'gudmaprbk/gudmap-rbk_base:1.0.0' + container = 'gudmaprbk/gudmap-rbk_base:1.0.1' } withName:alignData { - container = 'gudmaprbk/hisat2.2.1:1.0.0' + container = 'gudmaprbk/hisat2.2.1:1.0.1' } withName:dedupData { - container = 'gudmaprbk/picard2.23.9:1.0.0' + container = 'gudmaprbk/picard2.25.0:1.0.0' } withName:countData { - container = 'gudmaprbk/subread2.0.1:1.0.0' + container = 'gudmaprbk/subread2.0.1:1.0.1' } withName:makeBigWig { - container = 'gudmaprbk/deeptools3.5.0:1.0.0' + container = 'gudmaprbk/deeptools3.5.0:1.0.1' } withName:dataQC { - container = 'gudmaprbk/rseqc4.0.0:1.0.0' + container = 'gudmaprbk/rseqc4.0.0:1.0.1' } withName:aggrQC { - container = 'gudmaprbk/multiqc1.9:1.0.0' + container = 'gudmaprbk/multiqc1.10:1.0.0' } withName:uploadInputBag { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:uploadExecutionRun { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:uploadQC { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } 
withName:uploadProcessedFile { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:uploadOutputBag { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:finalizeExecutionRun { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:failPreExecutionRun { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:failExecutionRun { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } withName:uploadQC_fail { - container = 'gudmaprbk/deriva1.4:1.0.0' + container = 'gudmaprbk/deriva1.4:1.0.1' } } @@ -140,6 +140,6 @@ manifest { homePage = 'https://git.biohpc.swmed.edu/gudmap_rbk/rna-seq' description = 'This pipeline was created to be a standard mRNA-sequencing analysis pipeline which integrates with the GUDMAP and RBK consortium data-hub.' mainScript = 'rna-seq.nf' - version = 'v2.0.0rc01' + version = 'v2.0.0rc02' nextflowVersion = '>=19.09.0' } diff --git a/rna-seq.nf b/rna-seq.nf index 6aa3926666585353b40688f976bdd49431b375b0..6f77afd99d636e773e26e1abe9eb9c1012c461d4 100644 --- a/rna-seq.nf +++ b/rna-seq.nf @@ -723,13 +723,13 @@ process seqwho { echo -e "LOG: seqwho ran" >> ${repRID}.seqwho.log # parse inference from R1 - speciesR1=\$(cat SeqWho_call.tsv | grep ${fastq[0]} | cut -f17 -d\$'\t' | cut -f2 -d":" | tr -d " ") - seqtypeR1=\$(cat SeqWho_call.tsv | grep ${fastq[0]} | cut -f18 -d\$'\t' | cut -f2 -d":" | tr -d " ") - confidenceR1=\$(cat SeqWho_call.tsv | grep ${fastq[0]} | cut -f16 -d\$'\t' | cut -f2 -d":" | tr -d " ") + speciesR1=\$(cat SeqWho_call.tsv | grep ${fastq[0]} | cut -f18 -d\$'\t' | cut -f2 -d":" | tr -d " ") + seqtypeR1=\$(cat SeqWho_call.tsv | grep ${fastq[0]} | cut -f19 -d\$'\t' | cut -f2 -d":" | tr -d " ") + confidenceR1=\$(cat SeqWho_call.tsv | grep ${fastq[0]} | cut -f17 -d\$'\t' | cut -f2 -d":" | tr -d " ") if [ "\${confidenceR1}" == "low" ] then - 
speciesConfidenceR1=\$(cat SeqWho_call.tsv | grep ${fastq[0]} | cut -f16 -d\$'\t' | cut -f3 -d":" | tr -d " ") - seqtypeConfidenceR1=\$(cat SeqWho_call.tsv | grep ${fastq[0]} | cut -f16 -d\$'\t' | cut -f4 -d":" | tr -d " ") + speciesConfidenceR1=\$(cat SeqWho_call.tsv | grep ${fastq[0]} | cut -f17 -d\$'\t' | cut -f3 -d":" | tr -d " ") + seqtypeConfidenceR1=\$(cat SeqWho_call.tsv | grep ${fastq[0]} | cut -f17 -d\$'\t' | cut -f4 -d":" | tr -d " ") else speciesConfidenceR1="1" seqtypeConfidenceR1="1" @@ -739,13 +739,13 @@ process seqwho { # parse inference from R2 if [ "${ends}" == "pe" ] then - speciesR2=\$(cat SeqWho_call.tsv | grep ${fastq[1]} | cut -f17 -d\$'\t' | cut -f2 -d":" | tr -d " ") - seqtypeR2=\$(cat SeqWho_call.tsv | grep ${fastq[1]} | cut -f18 -d\$'\t' | cut -f2 -d":" | tr -d " ") - confidenceR2=\$(cat SeqWho_call.tsv | grep ${fastq[1]} | cut -f16 -d\$'\t' | cut -f2 -d":" | tr -d " ") + speciesR2=\$(cat SeqWho_call.tsv | grep ${fastq[1]} | cut -f18 -d\$'\t' | cut -f2 -d":" | tr -d " ") + seqtypeR2=\$(cat SeqWho_call.tsv | grep ${fastq[1]} | cut -f19 -d\$'\t' | cut -f2 -d":" | tr -d " ") + confidenceR2=\$(cat SeqWho_call.tsv | grep ${fastq[1]} | cut -f17 -d\$'\t' | cut -f2 -d":" | tr -d " ") if [ "\${confidenceR2}" == "low" ] then - speciesConfidenceR2=\$(cat SeqWho_call.tsv | grep ${fastq[1]} | cut -f16 -d\$'\t' | cut -f3 -d":" | tr -d " ") - seqtypeConfidenceR2=\$(cat SeqWho_call.tsv | grep ${fastq[1]} | cut -f16 -d\$'\t' | cut -f4 -d":" | tr -d " ") + speciesConfidenceR2=\$(cat SeqWho_call.tsv | grep ${fastq[1]} | cut -f17 -d\$'\t' | cut -f3 -d":" | tr -d " ") + seqtypeConfidenceR2=\$(cat SeqWho_call.tsv | grep ${fastq[1]} | cut -f17 -d\$'\t' | cut -f4 -d":" | tr -d " ") else speciesConfidenceR2="1" seqtypeConfidenceR2="1" @@ -860,9 +860,9 @@ process seqwho { gzip sampled.1.seed300.fastq & wait seqwho.py -f sampled.1.seed*.fastq.gz -x SeqWho.ix - seqtypeR1_1=\$(cat SeqWho_call.tsv | grep sampled.1.seed100.fastq.gz | cut -f18 -d\$'\t' | cut -f2 -d":" | 
tr -d " ") - seqtypeR1_2=\$(cat SeqWho_call.tsv | grep sampled.1.seed200.fastq.gz | cut -f18 -d\$'\t' | cut -f2 -d":" | tr -d " ") - seqtypeR1_3=\$(cat SeqWho_call.tsv | grep sampled.1.seed300.fastq.gz | cut -f18 -d\$'\t' | cut -f2 -d":" | tr -d " ") + seqtypeR1_1=\$(cat SeqWho_call.tsv | grep sampled.1.seed100.fastq.gz | cut -f19 -d\$'\t' | cut -f2 -d":" | tr -d " ") + seqtypeR1_2=\$(cat SeqWho_call.tsv | grep sampled.1.seed200.fastq.gz | cut -f19 -d\$'\t' | cut -f2 -d":" | tr -d " ") + seqtypeR1_3=\$(cat SeqWho_call.tsv | grep sampled.1.seed300.fastq.gz | cut -f19 -d\$'\t' | cut -f2 -d":" | tr -d " ") cp SeqWho_call.tsv SeqWho_call_sampledR1.tsv if [ "\${seqtypeR1_1}" == "\${seqtypeR1}" ] && [ "\${seqtypeR1_2}" == "\${seqtypeR1}" ] && [ "\${seqtypeR1_3}" == "\${seqtypeR1}" ] then @@ -881,9 +881,9 @@ process seqwho { gzip sampled.2.seed300.fastq & wait seqwho.py -f sampled.2.seed*.fastq.gz -x SeqWho.ix - seqtypeR2_1=\$(cat SeqWho_call.tsv | grep sampled.2.seed100.fastq.gz | cut -f18 -d\$'\t' | cut -f2 -d":" | tr -d " ") - seqtypeR2_2=\$(cat SeqWho_call.tsv | grep sampled.2.seed200.fastq.gz | cut -f18 -d\$'\t' | cut -f2 -d":" | tr -d " ") - seqtypeR2_3=\$(cat SeqWho_call.tsv | grep sampled.2.seed300.fastq.gz | cut -f18 -d\$'\t' | cut -f2 -d":" | tr -d " ") + seqtypeR2_1=\$(cat SeqWho_call.tsv | grep sampled.2.seed100.fastq.gz | cut -f19 -d\$'\t' | cut -f2 -d":" | tr -d " ") + seqtypeR2_2=\$(cat SeqWho_call.tsv | grep sampled.2.seed200.fastq.gz | cut -f19 -d\$'\t' | cut -f2 -d":" | tr -d " ") + seqtypeR2_3=\$(cat SeqWho_call.tsv | grep sampled.2.seed300.fastq.gz | cut -f19 -d\$'\t' | cut -f2 -d":" | tr -d " ") cp SeqWho_call.tsv SeqWho_call_sampledR2.tsv if [ "\${seqtypeR2_1}" == "\${seqtypeR1}" ] && [ "\${seqtypeR2_2}" == "\${seqtypeR1}" ] && [ "\${seqtypeR2_3}" == "\${seqtypeR1}" ] then diff --git a/workflow/scripts/get_updated_badge_info.sh b/workflow/scripts/get_updated_badge_info.sh index 
f4a30ec5e0ec7dd84146e4c701fec2876be802e0..3a5df46c52a6e1fe0cbd41946cdea09c67d1e08e 100644 --- a/workflow/scripts/get_updated_badge_info.sh +++ b/workflow/scripts/get_updated_badge_info.sh @@ -2,29 +2,29 @@ echo "collecting stats for badges" latest_release_tag=$(git tag --sort=-committerdate -l *.*.* | head -1) -current_pipeline_version=$(git show ${latest_release_tag}:nextflow.config | grep -o version.* | grep -oP "(?<=').*(?=')") -current_nextflow_version=$(git show ${latest_release_tag}:nextflow.config | grep -o nextflowVersion.* | grep -oP "(?<=').*(?=')") -master_pipeline_version=$(git show origin/master:nextflow.config | grep -o version.* | grep -oP "(?<=').*(?=')") -master_nextflow_version=$(git show origin/master:nextflow.config | grep -o nextflowVersion.* | grep -oP "(?<=').*(?=')") +current_pipeline_version=$(git show ${latest_release_tag}:nextflow.config | grep -o version.* | grep -oP "(?<=').*(?=')" | tr "-" _) +current_nextflow_version=$(git show ${latest_release_tag}:nextflow.config | grep -o nextflowVersion.* | grep -oP "(?<=').*(?=')" | tr "-" _) +master_pipeline_version=$(git show origin/master:nextflow.config | grep -o version.* | grep -oP "(?<=').*(?=')" | tr "-" _) +master_nextflow_version=$(git show origin/master:nextflow.config | grep -o nextflowVersion.* | grep -oP "(?<=').*(?=')" | tr "-" _) develop_pipeline_version=$(git show origin/develop:nextflow.config | grep -o version.* | grep -oP "(?<=').*(?=')") -develop_nextflow_version=$(git show origin/develop:nextflow.config | grep -o nextflowVersion.* | grep -oP "(?<=').*(?=')") +develop_nextflow_version=$(git show origin/develop:nextflow.config | grep -o nextflowVersion.* | grep -oP "(?<=').*(?=')" | tr "-" _) echo "collecting tool version for badges" -python_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o Python.* | grep -oP "(?<=d>).*(?=\<)") -deriva_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o DERIVA.* | grep -oP 
"(?<=d>).*(?=\<)") -bdbag_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o BDBag.* | grep -oP "(?<=d>).*(?=\<)") -trimgalore_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o 'Trim Galore!'.* | grep -oP "(?<=d>).*(?=\<)") -hisat2_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o HISAT2.* | grep -oP "(?<=d>).*(?=\<)") -samtools_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o Samtools.* | grep -oP "(?<=d>).*(?=\<)") -picard_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o 'picard (MarkDuplicates)'.* | grep -oP "(?<=d>).*(?=\<)") -featurecounts_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o featureCounts.* | grep -oP "(?<=d>).*(?=\<)") -deeptools_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o deepTools.* | grep -oP "(?<=d>).*(?=\<)") -seqtk_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o Seqtk.* | grep -oP "(?<=d>).*(?=\<)") -r_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o '>R<'.* | grep -oP "(?<=d>).*(?=\<)") -fastqc_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o FastQC.* | grep -oP "(?<=d>).*(?=\<)") -seqwho_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o SeqWho.* | grep -oP "(?<=d>).*(?=\<)") -rseqc_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o RSeQC.* | grep -oP "(?<=d>).*(?=\<)") -multiqc_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o MultiQC.* | grep -oP "(?<=d>).*(?=\<)") +python_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o Python.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +deriva_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o DERIVA.* | 
grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +bdbag_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o BDBag.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +trimgalore_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o 'Trim Galore!'.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +hisat2_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o HISAT2.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +samtools_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o Samtools.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +picard_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o 'picard (MarkDuplicates)'.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +featurecounts_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o featureCounts.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +deeptools_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o deepTools.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +seqtk_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o Seqtk.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +r_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o '>R<'.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +fastqc_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o FastQC.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +seqwho_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o SeqWho.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +rseqc_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o RSeQC.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) +multiqc_version=$(git show ${latest_release_tag}:docs/software_versions_mqc.yaml | grep -o MultiQC.* | grep -oP "(?<=d>).*(?=\<)" | tr "-" _) echo "collecting badges" mkdir -p ./badges/tools
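Review note on the `get_updated_badge_info.sh` hunk: each changed line now pipes the extracted version string through `tr "-" _`, so a version such as `v1.3-r106` becomes `v1.3_r106`. The diff does not state why. The sketch below illustrates the extract-then-sanitize pattern on sample `<dt>`/`<dd>` rows copied from `docs/software_versions_mqc.yaml` in this diff. The function name `extract_badge_version` is hypothetical, and it swaps the script's `grep -oP` lookarounds for a POSIX `sed` expression so that it runs without PCRE support. It is an illustration under those assumptions, not the repository's code.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): read MultiQC-style
# <dt>/<dd> HTML on stdin, print the version of one tool, and replace
# dashes with underscores as the updated script does with `tr "-" _`.
extract_badge_version() {
  local tool="$1"
  # grep -F matches the exact <dt> entry, so "R" does not match "RSeQC".
  # sed keeps only the text between <dd> and </dd> (assumes one pair per line).
  grep -F "<dt>${tool}</dt>" \
    | sed -n 's/.*<dd>\(.*\)<\/dd>.*/\1/p' \
    | head -n 1 \
    | tr "-" "_"
}

# Sample rows taken from the versions shown in this diff.
sample='<dt>Seqtk</dt><dd>v1.3-r106</dd>
<dt>R</dt><dd>v4.0.4</dd>
<dt>RSeQC</dt><dd>v4.0.0</dd>
<dt>MultiQC</dt><dd>v1.10</dd>'

printf '%s\n' "$sample" | extract_badge_version "Seqtk"   # prints v1.3_r106
printf '%s\n' "$sample" | extract_badge_version "R"       # prints v4.0.4
```

One thing to check in the hunk itself: `develop_pipeline_version` is the only variable left without the `tr "-" _` step, which may or may not be intentional.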