diff --git a/README.md b/README.md
index 9ddb0e6d7c6576420bc9dde5b41ad311337f30a9..56f8013b9f27f59938638d738b055a0b11731950 100755
--- a/README.md
+++ b/README.md
@@ -10,4 +10,85 @@
 The pipeline uses Nextflow, a bioinformatics workflow tool.
 This pipeline is primarily used with a SLURM cluster on the BioHPC Cluster.
 However, the pipeline should be able to run on any system that Nextflow supports.
-Additionally, the pipeline is designed to work with Astrocyte Workflow System using a simple web interface.
\ No newline at end of file
+Additionally, the pipeline is designed to work with Astrocyte Workflow System using a simple web interface.
+
+To Run:
+-------
+
+* Available parameters:
+  * **--fastq**
+    * path to the fastq location
+    * only R1 and R2 are required, but I2 can be included
+    * eg: **--fastq '/project/shared/bicf_workflow_ref/workflow_testdata/cellranger/cellranger_count/v3s2r100k/\*.fastq.gz'**
+  * **--designFile**
+    * path to the design file (csv format)
+    * column 1 = "Sample"
+    * column 2 = "fastq_R1"
+    * column 3 = "fastq_R2"
+    * "Sample" can be repeated if there are multiple fastq R1/R2 pairs for a sample
+    * eg: **--designFile '/project/shared/bicf_workflow_ref/workflow_testdata/cellranger/cellranger_count/v3s2r100k/design.csv'**
+  * **--genome**
+    * reference genome
+    * requires workflow/conf/biohpc.config to work
+    * names of the available 10x Genomics premade reference genomes:
+      * *'GRCh38-3.0.0'* = Human GRCh38 release 93
+      * *'GRCh38-1.2.0'* = Human GRCh38 release 84
+      * *'hg19-3.0.0'* = Human GRCh37 (hg19) release 87
+      * *'hg19-1.2.0'* = Human GRCh37 (hg19) release 84
+      * *'mm10-3.0.0'* = Mouse GRCm38 (mm10) release 93
+      * *'mm10-1.2.0'* = Mouse GRCm38 (mm10) release 84
+      * *'hg19_and_mm10-3.0.0'* = Human GRCh37 (hg19) + Mouse GRCm38 (mm10) release 93
+      * *'hg19_and_mm10-1.2.0'* = Human GRCh37 (hg19) + Mouse GRCm38 (mm10) release 84
+      * *'ercc92-1.2.0'* = ERCC.92 Spike-In
+    * if --genome is used then --genomeLocationFull is not necessary
+    * eg: **--genome 'GRCh38-3.0.0'**
+  * **--genomeLocationFull**
+    * path to a custom genome
+    * if --genomeLocationFull is used, --genome is not necessary and is overridden
+    * eg: **--genomeLocationFull '/project/apps_database/cellranger/refdata-cellranger-GRCh38-3.0.0'**
+  * **--expectCells**
+    * expected number of cells to be detected
+    * guides cellranger in its cutoff for background/low quality cells
+    * as a guide it doesn't have to be exact
+    * 0-10000
+    * if --expectCells is used then --forceCells is not necessary
+    * only used if --forceCells is not entered or is set to 0
+    * eg: **--expectCells 10000**
+  * **--forceCells**
+    * forces filtering to the top number of cells given by this parameter
+    * 0-10000
+    * if --forceCells is used then --expectCells is not necessary and is overridden
+    * eg: **--forceCells 10000**
+  * **--kitVersion**
+    * the library chemistry version number for the 10x Genomics Gene Expression kit
+    * setting to auto will attempt to autodetect the version from the cycle strategy detected in the fastq files
+    * version numbers are spelled out
+    * --kitVersion is only used if --version (cellranger version) is > 2
+    * --version (cellranger version) 2.1.1 can only read --kitVersion of two (2)
+    * options:
+      * *'auto'*
+      * *'three'*
+      * *'two'*
+    * eg: **--kitVersion 'three'**
+  * **--version**
+    * cellranger version
+    * --version (cellranger version) 2.1.1 can only read --kitVersion of two (2)
+    * options:
+      * *'3.0.2'*
+      * *'3.0.1'*
+      * *'2.1.1'*
+    * eg: **--version '3.0.2'**
+  * **--outDir**
+    * optional output directory for run
+    * eg: **--outDir 'test'**
+  * FULL EXAMPLE:
+
+**nextflow main.nf --fastq '/project/shared/bicf_workflow_ref/workflow_testdata/cellranger/cellranger_count/v3s2r100k/\*.fastq.gz' --designFile '/project/shared/bicf_workflow_ref/workflow_testdata/cellranger/cellranger_count/v3s2r100k/design.csv' --genome 'GRCh38-3.0.0' --kitVersion 'three' --version '3.0.2' --outDir 'test'**
+
+* Design example:
+
+| Sample  | fastq_R1                           | fastq_R2                           |
+|---------|------------------------------------|------------------------------------|
+| sample1 | pbmc_1k_v2_S1_L001_R1_001.fastq.gz | pbmc_1k_v2_S1_L001_R2_001.fastq.gz |
+| sample2 | pbmc_1k_v2_S2_L001_R1_001.fastq.gz | pbmc_1k_v2_S2_L001_R2_001.fastq.gz |
+| sample2 | pbmc_1k_v2_S2_L002_R1_001.fastq.gz | pbmc_1k_v2_S2_L002_R2_001.fastq.gz |
\ No newline at end of file