RNA-Seq Analytic Pipeline for GUDMAP/RBK

Introduction

This pipeline was created to be a standard mRNA-sequencing analysis pipeline which integrates with the GUDMAP and RBK consortium data-hub.

[Figure: pipeline flowchart]

Cloud Compatibility:

This pipeline is also capable of being run on AWS. To do so:

  • Build an AWS Batch queue and environment, either manually or with aws-cloudformation
  • Edit one of the aws configs in workflow/config/
    • Replace workDir with the S3 bucket generated
    • Change region if different
    • Change queue to the aws batch queue generated
  • The user must have awscli configured with appropriate authentication (with aws configure and access keys) in the environment in which nextflow will be run
  • Add -profile with the name of the aws config that was customized
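
As a rough sketch of the last step, the helper below only prints the command line for an AWS profile, so it can be reviewed before anything is launched. The profile names are the ones listed under "Available parameters", and the paths and RID are copied from the full example in this README. The function name aws_run_cmd is hypothetical and not part of the pipeline.

```shell
# Dry run: print the nextflow command for an AWS profile without launching it.
# aws_run_cmd is a hypothetical helper, not something shipped with the pipeline.
aws_run_cmd() {
  profile="$1"
  case "$profile" in
    aws_ondemand|aws_spot) ;;   # the two AWS profiles named in this README
    *) echo "unknown AWS profile: $profile" >&2; return 1 ;;
  esac
  printf 'nextflow run workflow/rna-seq.nf --deriva %s --bdbag %s --repRID %s -profile %s\n' \
    "./data/credential.json" "./data/cookies.txt" "Q-Y5JA" "$profile"
}

aws_run_cmd aws_spot
```

Copy the printed line into the shell once the workDir, region and queue in the chosen config have been checked.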

To Run:

  • Available parameters:
    • --deriva active credential.json file from deriva-auth
    • --bdbag active cookies.txt file from deriva-auth
    • --repRID mRNA-seq replicate RID
    • --refMoVersion mouse reference version (optional)
    • --refHuVersion human reference version (optional)
    • -profile config profile to use (optional):
      • standard = local processes on BioHPC (default)
      • biohpc = BioHPC cluster
      • aws_ondemand = AWS Batch on-demand instance requests
      • aws_spot = AWS Batch spot instance requests
  • NOTES:
    • once deriva-auth is run and authenticated, the two files above are saved in ~/.deriva/ (see the official deriva documentation for the lifetime of the credentials)
    • reference version consists of the Genome Reference Consortium version, patch release, and GENCODE annotation release number (leaving the parameters blank uses the default version tied to the pipeline version)
      • current mouse 38.p6.vM22 = GRCm38.p6 with GENCODE annotation release M22
      • current human 38.p12.v31 = GRCh38.p12 with GENCODE annotation release 31
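
To illustrate how a reference version string breaks down, here is a minimal sketch that splits it into the three parts described in the notes above. The function name parse_ref_version is hypothetical. The assumption that the patch field starts with "p" and the annotation field starts with "v" is inferred only from the examples in this README.

```shell
# Split a reference version string (e.g. 38.p6.vM22) into its three fields.
# parse_ref_version is a hypothetical helper; the <build>.p<patch>.v<release>
# layout is assumed from the README examples, not taken from the pipeline code.
parse_ref_version() {
  version="$1"
  case "$version" in
    ?*.p?*.v?*) ;;
    *) echo "expected <build>.p<patch>.v<release>, got: $version" >&2; return 1 ;;
  esac
  build="${version%%.*}"    # text before the first dot
  rest="${version#*.}"      # everything after the first dot
  patch="${rest%%.*}"       # second field, e.g. p6
  annot="${rest#*.}"        # third field, e.g. vM22
  case "$annot" in
    *.*) echo "too many fields in: $version" >&2; return 1 ;;
  esac
  printf 'build=%s patch=%s gencode=%s\n' "$build" "$patch" "${annot#v}"
}

parse_ref_version 38.p6.vM22   # prints: build=38 patch=p6 gencode=M22
```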

FULL EXAMPLE:

nextflow run workflow/rna-seq.nf --deriva ./data/credential.json --bdbag ./data/cookies.txt --repRID Q-Y5JA
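
Because the deriva credentials expire, it can help to confirm that both files exist before launching. The sketch below reads them directly from ~/.deriva/, where the notes above say deriva-auth saves them. The helper name check_deriva_files and the DERIVA_DIR variable are hypothetical conveniences. The check only tests that each file is present and non-empty; it does not verify that the credentials are still valid.

```shell
# Confirm both deriva-auth files are present and non-empty before launching.
# check_deriva_files and DERIVA_DIR are hypothetical names used for this sketch.
check_deriva_files() {
  for f in "$@"; do
    [ -s "$f" ] || { echo "missing or empty: $f (re-run deriva-auth?)" >&2; return 1; }
  done
}

DERIVA_DIR="${DERIVA_DIR:-$HOME/.deriva}"

if check_deriva_files "$DERIVA_DIR/credential.json" "$DERIVA_DIR/cookies.txt"; then
  nextflow run workflow/rna-seq.nf \
    --deriva "$DERIVA_DIR/credential.json" \
    --bdbag "$DERIVA_DIR/cookies.txt" \
    --repRID Q-Y5JA
fi
```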

CHANGELOG


Credits

This workflow was developed by the Bioinformatics Core Facility (BICF), Department of Bioinformatics

PI

Venkat S. Malladi
Faculty Associate & Director
Bioinformatics Core Facility
UT Southwestern Medical Center
ORCID iD: orcid.org/0000-0002-0144-0564
venkat.malladi@utsouthwestern.edu

Developers

Gervaise H. Henry
Computational Biologist
Department of Urology
UT Southwestern Medical Center
ORCID iD: orcid.org/0000-0001-7772-9578
gervaise.henry@utsouthwestern.edu

Jonathan Gesell
Computational Biologist
Bioinformatics Core Facility
UT Southwestern Medical Center
ORCID iD: orcid.org/0000-0001-5902-3299
johnathan.gesell@utsouthwestern.edu

Jeremy A. Mathews
Computational Intern
Bioinformatics Core Facility
UT Southwestern Medical Center
ORCID iD: orcid.org/0000-0002-2931-1430
jeremy.mathews@utsouthwestern.edu

Please cite in publications: Pipeline was developed by BICF from funding provided by Cancer Prevention and Research Institute of Texas (RP150596).



Pipeline Directed Acyclic Graph

[Figure: pipeline DAG]