### References

1. **python**:
  * Anaconda (Anaconda Software Distribution, [https://anaconda.com](https://anaconda.com))

2. **DERIVA**:
  * Bugacov, A., Czajkowski, K., Kesselman, C., Kumar, A., Schuler, R. E., & Tangmunarunkit, H. (2017, October). Experiences with DERIVA: An asset management platform for accelerating eScience. In 2017 IEEE 13th International Conference on e-Science (e-Science) (pp. 79-88). IEEE. doi:[10.1109/eScience.2017.20](https://doi.org/10.1109/eScience.2017.20).

3. **BDBag**:
  * Madduri, R., Chard, K., D’Arcy, M., Jung, S. C., Rodriguez, A., Sulakhe, D., ... & Foster, I. (2019). Reproducible big data science: A case study in continuous FAIRness. PloS one, 14(4), e0213013. doi:[10.1371/journal.pone.0213013](https://doi.org/10.1371/journal.pone.0213013).

4. **trimgalore**:
  * trimgalore [https://github.com/FelixKrueger/TrimGalore](https://github.com/FelixKrueger/TrimGalore)

5. **hisat2**:
  * Kim, D., Paggi, J. M., Park, C., Bennett, C., & Salzberg, S. L. (2019). Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nature biotechnology, 37(8), 907-915. doi:[10.1038/s41587-019-0201-4](https://doi.org/10.1038/s41587-019-0201-4).

6. **samtools**:
  * Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., ... & Durbin, R. (2009). The sequence alignment/map format and SAMtools. Bioinformatics, 25(16), 2078-2079. doi:[10.1093/bioinformatics/btp352](https://doi.org/10.1093/bioinformatics/btp352).

7. **picard**:
  * “Picard Toolkit.” 2019. Broad Institute, GitHub Repository. [http://broadinstitute.github.io/picard/](http://broadinstitute.github.io/picard/)

8. **featureCounts**:
  * Liao, Y., Smyth, G. K., & Shi, W. (2014). featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics, 30(7), 923-930. doi:[10.1093/bioinformatics/btt656](https://doi.org/10.1093/bioinformatics/btt656).

9. **deeptools**:
  * Ramírez, F., Ryan, D. P., Grüning, B., Bhardwaj, V., Kilpert, F., Richter, A. S., ... & Manke, T. (2016). deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic acids research, 44(W1), W160-W165. doi:[10.1093/nar/gkw257](https://doi.org/10.1093/nar/gkw257).

10. **Seqtk**:
  * Seqtk [https://github.com/lh3/seqtk](https://github.com/lh3/seqtk)

11. **R**:
  * R Core Team (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL: [http://www.R-project.org/](http://www.R-project.org/).

12. **FastQC**:
  * FastQC [https://www.bioinformatics.babraham.ac.uk/projects/fastqc/](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)

13. **SeqWho**:
  * Bennett, C., Thornton, M., Park, C., Henry, G., Zhang, Y., Malladi, V. S., & Kim, D. (2021). SeqWho: Reliable, rapid determination of sequence file identity using k-mer frequencies. bioRxiv, 2021.03.10.434827. doi:[10.1101/2021.03.10.434827](https://doi.org/10.1101/2021.03.10.434827).

14. **RSeQC**:
  * Wang, L., Wang, S., & Li, W. (2012). RSeQC: quality control of RNA-seq experiments. Bioinformatics, 28(16), 2184-2185. doi:[10.1093/bioinformatics/bts356](https://doi.org/10.1093/bioinformatics/bts356).

15. **MultiQC**:
  * Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics, 32(19), 3047-3048. doi:[10.1093/bioinformatics/btw354](https://doi.org/10.1093/bioinformatics/btw354).

16. **Nextflow**:
  * Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature biotechnology, 35(4), 316-319. doi:[10.1038/nbt.3820](https://doi.org/10.1038/nbt.3820).