Commit e6516dc5 authored by Brandi Cantarel

Update README.md -- design files misrepresented

parent 6417cd7b
Pipeline #10129 failed with stage
in 25 minutes and 1 second
# RNASeq Analysis Workflow
This workflow can be run with the whole genome or with a specific list of genes of interest.
## Initiate Nextflow Workflows
### Required Tools
This pipeline uses [Nextflow](https://www.nextflow.io/docs/latest/index.html), a bioinformatics workflow tool, and [Singularity](https://sylabs.io/docs/), a containerization tool.
Make sure both tools are installed before running this pipeline. If running on an HPC cluster, load the required modules.
```
module load nextflow/20.01.0 singularity/3.5.3
```
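Before launching, it can help to confirm that both tools are actually on the `PATH`. The following is a minimal sketch of such a check; `missing_tools` is a hypothetical helper written for this illustration and is not part of the pipeline.

```shell
# Hypothetical helper (not part of the pipeline): print the name of every
# requested tool that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the two tools this README names as requirements.
missing=$(missing_tools nextflow singularity)
if [ -n "$missing" ]; then
    echo "Missing tools (load the modules or install them first):" >&2
    echo "$missing" >&2
fi
```

If the check reports a missing tool on a cluster, loading the modules shown above is the likely fix.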
### RNA Design File
The design file must be named design.txt and be in tab-separated format for the workflows. All RNA workflows can be run using the same design file format. You can run in single-end mode by leaving the cells in the FqR2 column blank.

Design table before this commit:

| SampleID | CaseID | FqR1 | FqR2 |
|---|---|---|---|
| Sample1 | Fam1 | Sample1.R1.fastq.gz | Sample1.R2.fastq.gz |
| Sample2 | Fam1 | Sample2.R1.fastq.gz | Sample2.R2.fastq.gz |
| Sample3 | Fam2 | Sample3.R1.fastq.gz | Sample3.R2.fastq.gz |
| Sample4 | Fam2 | Sample4.R1.fastq.gz | Sample4.R2.fastq.gz |

Design table after this commit:

| SampleID | SampleGroup | SubjectID | SampleName | FqR1 | FqR2 |
|---|---|---|---|---|---|
| SRR1551069 | monocytes | 53 | 53_Monocytes | SRR1551069_1.fastq.gz | SRR1551069_2.fastq.gz |
| SRR1551068 | neutrophils | 53 | 53_Neutrophils | SRR1551068_1.fastq.gz | SRR1551068_2.fastq.gz |
| SRR1551055 | monocytes | 21 | 21_Monocytes | SRR1551055_1.fastq.gz | SRR1551055_2.fastq.gz |
| SRR1551054 | neutrophils | 21 | 21_Neutrophils | SRR1551054_1.fastq.gz | SRR1551054_2.fastq.gz |
| SRR1551048 | monocytes | 20 | 20_Monocytes | SRR1551048_1.fastq.gz | SRR1551048_2.fastq.gz |
| SRR1551047 | neutrophils | 20 | 20_Neutrophils | SRR1551047_1.fastq.gz | SRR1551047_2.fastq.gz |
| SRR1550987 | monocytes | 44 | 44_Monocytes | SRR1550987_1.fastq.gz | SRR1550987_2.fastq.gz |
| SRR1550986 | neutrophils | 44 | 44_Neutrophils | SRR1550986_1.fastq.gz | SRR1550986_2.fastq.gz |

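Because the design file must be tab separated, one way to avoid editor-inserted spaces is to write it with `printf`. This is a sketch only: the temporary directory is a stand-in for the real fastq directory, and the column-count check is an illustration, not something the pipeline is known to run.

```shell
# Stand-in location for this illustration; in practice the design file
# lives alongside the fastq files.
workdir=$(mktemp -d)
design="$workdir/design.txt"

# printf '\t' guarantees real tab characters between the six columns.
printf 'SampleID\tSampleGroup\tSubjectID\tSampleName\tFqR1\tFqR2\n' > "$design"
printf 'SRR1551069\tmonocytes\t53\t53_Monocytes\tSRR1551069_1.fastq.gz\tSRR1551069_2.fastq.gz\n' >> "$design"
printf 'SRR1551068\tneutrophils\t53\t53_Neutrophils\tSRR1551068_1.fastq.gz\tSRR1551068_2.fastq.gz\n' >> "$design"

# Count rows whose number of tab-separated fields differs from the header's.
bad_rows=$(awk -F'\t' 'NR == 1 { n = NF } NF != n { c++ } END { print c + 0 }' "$design")
echo "rows with an unexpected column count: $bad_rows"
```

A non-zero count usually points to a stray space or a missing tab in one of the rows.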
### RNA Parameters
* **--input**
    * directory containing the design file and fastq files
    * default is set to *'${basedir}/fastq'*
    * eg: **--input '/project/shared/bicf_workflow_ref/workflow_testdata/rnaseq/fastq'**
* **--output**
    * directory for the analysis output
    * default is set to *'${basedir}/analysis'*
    * eg: **--output '${basedir}/output'**
* **--genome**
    * directory containing all reference files for the various tools. This includes the genome.fa, gencode.gtf, genenames.txt, etc.
    * default is set for use on UTSW BioHPC.
    * eg: **--genome '/project/shared/bicf_workflow_ref/human/grch38_cloud/rnaref'**
* **--stranded**
    * option for the -s flag in featureCounts, used in gene abundance calculations
    * default is set to *'0'*
    * eg: **--stranded '0'**
* **--pairs**
    * select either 'pe' (paired-end) or 'se' (single-end) based on read inputs. Select 'pe' when both R1 and R2 are present. If only R1 is present, select 'se'.
    * default is set to *'pe'*
    * eg: **--pairs 'pe'**
* **--align**
    * select the algorithm/tool for alignment from 'hisat' or 'star'
    * default is set to *'hisat'*
    * eg: **--align 'hisat'**
* **--markdups**
    * select either picard (Mark Duplicates) or null (do not Mark Duplicates)
    * default is set to *'picard'*
    * eg: **--markdups 'picard'**

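Since `--pairs` and `--align` each accept only two values, a wrapper script could reject typos before a run is submitted. The sketch below assumes such a wrapper; `is_one_of` is a hypothetical helper, and the values assigned are simply the defaults listed above.

```shell
# Hypothetical helper: succeed if the first argument equals any of the rest.
is_one_of() {
    value=$1
    shift
    for allowed in "$@"; do
        [ "$value" = "$allowed" ] && return 0
    done
    return 1
}

# Defaults taken from the parameter list; edit these for a real run.
pairs='pe'
align='hisat'
stranded='0'

if is_one_of "$pairs" pe se && is_one_of "$align" hisat star; then
    # Option string that could be appended to the nextflow command line.
    opts="--pairs $pairs --align $align --stranded $stranded"
    echo "$opts"
else
    echo "unsupported value for --pairs or --align" >&2
fi
```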
### RNA Run Workflow Testing

Human PE

```
module load nextflow/20.01.0 singularity/3.5.3
base=$repoClonedDirectory
datadir='/project/shared/bicf_workflow_ref/workflow_testdata/rnaseq'
nextflow -C ${base}/nextflow.config run ${base}/workflow/main.nf --design ${datadir}/design.rnaseq.txt --input ${datadir} --output analysis
```

Mouse SE

```
module load nextflow/20.01.0 singularity/3.5.3
base=$repoClonedDirectory
datadir='/project/shared/bicf_workflow_ref/workflow_testdata/rnaseq'
nextflow -C ${base}/nextflow.config run -with-dag flowchart.png -with-timeline mouse_timeline.html -with-report mouse_report.html ${base}/workflow/main.nf --design ${datadir}/mouse_se.design.txt --input ${datadir} --pairs se --output analysis
```
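
The two test runs differ mainly in `--pairs`, which follows from whether the design file has entries in the FqR2 column. As a rough sketch, the value could be derived from the design file itself. `detect_pairs` is a hypothetical helper, not part of the pipeline, and it makes a simplifying assumption: any non-empty FqR2 cell means paired-end.

```shell
# Hypothetical helper: print 'pe' if any data row has a non-empty FqR2 cell,
# otherwise 'se'. The FqR2 column is located by its header name.
detect_pairs() {
    awk -F'\t' '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "FqR2") { col = i }
            }
            next
        }
        col && $col != "" { paired = 1 }
        END { print (paired ? "pe" : "se") }
    ' "$1"
}

# Two tiny design files in a temporary directory, for illustration only.
workdir=$(mktemp -d)
header='SampleID\tSampleGroup\tSubjectID\tSampleName\tFqR1\tFqR2\n'
printf "$header" > "$workdir/pe.design.txt"
printf 'S1\tgroupA\t1\tS1_A\tS1_1.fastq.gz\tS1_2.fastq.gz\n' >> "$workdir/pe.design.txt"
printf "$header" > "$workdir/se.design.txt"
printf 'S1\tgroupA\t1\tS1_A\tS1_1.fastq.gz\t\n' >> "$workdir/se.design.txt"

echo "--pairs $(detect_pairs "$workdir/pe.design.txt")"
echo "--pairs $(detect_pairs "$workdir/se.design.txt")"
```

A design file that mixes paired and single-end rows would be reported as 'pe' by this sketch, so it is better treated as a sanity check than as a replacement for setting `--pairs` explicitly.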