Commit e608b062 authored by Gervaise Henry

Merge branch 'update' into 'develop'

Update

Closes #37, #38, and #39

See merge request !61
parents 5907a115 852122be
Pipeline #7345 passed with stages
in 7 minutes and 58 seconds
......@@ -2,7 +2,7 @@ before_script:
- module load astrocyte
- module load python/3.6.1-2-anaconda
- pip install --user pytest-pythonpath==0.7.1 pytest-cov==2.5.1
- module load nextflow/19.09.0
- module load nextflow/20.01.0
- module load singularity/3.0.2
- mkdir -p test_data/simple1
- mkdir -p test_data/simple2
......@@ -20,14 +20,14 @@ astrocyte_check:
artifacts:
expire_in: 2 days
retry:
max: 1
max: 0
when:
- always
simple_1FC:
stage: simple
script:
- nextflow run workflow/main.nf --bcl "test_data/simple1/*.tar.gz" --designFile "test_data/simple1/cellranger-tiny-bcl-simple-1_2_0.csv"
- nextflow run workflow/main.nf -profile biohpc,cluster --bcl "test_data/simple1/*.tar.gz" --designFile "test_data/simple1/cellranger-tiny-bcl-simple-1_2_0.csv" --ci true
##- pytest -m simple1
artifacts:
name: "$CI_JOB_NAME"
......@@ -37,14 +37,14 @@ simple_1FC:
- workflow/output/multiqc/run/multiqc_report.html
expire_in: 2 days
retry:
max: 1
max: 0
when:
- always
simple_2FC:
stage: simple
script:
- nextflow run workflow/main.nf --bcl "test_data/simple2/*.tar.gz" --designFile "test_data/simple2/cellranger-tiny-bcl-simple-1_2_0.csv"
- nextflow run workflow/main.nf -profile biohpc,cluster --bcl "test_data/simple2/*.tar.gz" --designFile "test_data/simple2/cellranger-tiny-bcl-simple-1_2_0.csv" --ci true
##- pytest -m simple2
artifacts:
name: "$CI_JOB_NAME"
......@@ -54,6 +54,6 @@ simple_2FC:
- workflow/output/multiqc/run/multiqc_report.html
expire_in: 2 days
retry:
max: 1
max: 0
when:
- always
# v2.0.0
# v2.x.x-indev
**UserFacing**
* Check Design File for spaces in name and file contents
* Update design example, README, and astrocyte.yml with current barcode IDs
* Remove unnecessary intermediate process files
* Add config option for running on AWS
* Ignore dual index
* Add parameter for base mask (not available in Astrocyte)
* Output Nextflow version in QC report
* Output pipeline version (manual) in QC report
**Background**
* Add Nextflow Tower integration into CI (GHH's profile)
* Utilize BICF docker containers
* Update to be runnable on cloud
* Refactor configs
*Known Bugs*
* cellranger mkfastq will not accept spaces in the path given to the run parameter, even if quoted; reported as 10XGenomics/cellranger GitHub issue [#31](https://github.com/10XGenomics/cellranger/issues/31)
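The changelog entry about checking the design file for spaces can be illustrated with a minimal sketch. The pipeline's own `check_design.py` is not shown in this diff, so the function name `find_spaces` and its return format are assumptions made purely for illustration, not the pipeline's actual implementation.

```python
import csv
import io


def find_spaces(design_name, design_text):
    """Return a list of problems; an empty list means no spaces were found.

    Hypothetical helper: checks both the design file name and every cell.
    """
    problems = []
    if " " in design_name:
        problems.append(f"file name contains a space: {design_name!r}")
    reader = csv.reader(io.StringIO(design_text))
    for row_number, row in enumerate(reader, start=1):
        for column_number, cell in enumerate(row, start=1):
            if " " in cell:
                problems.append(
                    f"row {row_number}, column {column_number} "
                    f"contains a space: {cell!r}"
                )
    return problems


# Example design content with a space in the sample name.
example = "Lane,Sample,Index\n*,test sample,SI-GA-C9\n"
for problem in find_spaces("my design.csv", example):
    print(problem)
```

A check like this would run before `cellranger mkfastq`, since the known bug listed above means spaces in paths cannot be worked around by quoting.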
......
......@@ -24,18 +24,23 @@ Cloud Compatibility
-------------------
This pipeline is also capable of being run on AWS. To do so:
* Build an AWS Batch queue and environment either manually or with [aws-cloudformation](https://console.aws.amazon.com/cloudformation/home?#/stacks/new?stackName=Nextflow&templateURL=https://s3.amazonaws.com/aws-genomics-workflows/templates/nextflow/nextflow-aio.template.yaml)
* Edit one of the aws configs in workflow/config/
* In the aws configs in `workflow/configs/`:
* Replace workDir with the S3 bucket generated
* Change region if different
* Change queue to the aws batch queue generated
* In the ondemand and spot configs in `workflow/config/`:
* Change queue to the aws batch queues generated
* The user must have awscli configured with appropriate authentication (with ```aws configure``` and access keys) in the environment in which nextflow will be run
* Add ```-profile ``` with the name aws config which was custamized
* eg. ```nextflow run workflow/main.nf -profile aws_ondemand```
* Add ```-profile ``` with aws and the queue config which was customized
* eg. ```nextflow run workflow/main.nf -profile aws,ondemand```
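Because profiles are now combined as a comma-separated list (for example `aws,ondemand` or `biohpc,cluster`), a launcher script might assemble the command as sketched below. The helper `build_command` is hypothetical and not part of the pipeline; only the `nextflow run`, `-profile`, and `--name` arguments come from this README.

```python
import shlex


def build_command(profiles, main_script="workflow/main.nf", **params):
    """Build a `nextflow run` argument list with comma-joined profiles.

    Hypothetical helper: keyword arguments become `--key value` pipeline
    parameters, in the order they were passed.
    """
    command = ["nextflow", "run", main_script, "-profile", ",".join(profiles)]
    for key, value in params.items():
        command += [f"--{key}", str(value)]
    return command


command = build_command(["aws", "ondemand"], name="test")
# shlex.join (Python 3.8+) quotes arguments for display in a shell.
print(shlex.join(command))
```

Keeping the command as a list rather than a single string means it can be handed to `subprocess.run` without shell quoting concerns.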
To Run:
-------
* Available parameters:
* **-profile**
* what environments to run on, available: `biohpc`, `local`, `cluster`, `aws`, `ondemand`, `spot`
* eg: **-profile biohpc,cluster** to run on BioHPC in cluster mode
* eg: **-profile aws,ondemand** to run on AWS on an on-demand queue
* **--name**
* run name, puts outputs in a directory with this name
* eg: **--name 'test'**
......@@ -58,7 +63,7 @@ To Run:
* eg: ```--outDir 'test'```
* FULL EXAMPLE:
```
nextflow run workflow/main.nf --name 'test' --bcl '/project/shared/bicf_workflow_ref/workflow_testdata/cellranger/cellranger_mkfastq/simple1/cellranger-tiny-bcl-1_2_0.tar.gz' --designFile '/project/shared/bicf_workflow_ref/workflow_testdata/cellranger/cellranger_mkfastq/simple1/cellranger-tiny-bcl-simple-1_2_0.csv' --outDir 'test'
nextflow run workflow/main.nf -profile biohpc,cluster --name 'test' --bcl '/project/shared/bicf_workflow_ref/workflow_testdata/cellranger/cellranger_mkfastq/simple1/cellranger-tiny-bcl-1_2_0.tar.gz' --designFile '/project/shared/bicf_workflow_ref/workflow_testdata/cellranger/cellranger_mkfastq/simple1/cellranger-tiny-bcl-simple-1_2_0.csv' --outDir 'test'
```
* Design example:
......
......@@ -90,6 +90,15 @@ workflow_parameters:
description: |
A design file listing lane, sample, corresponding index (last characters represent well position of 96-well plate). [Current sample barcode IDs](https://s3-us-west-2.amazonaws.com/10x.files/supp/cell-exp/chromium-shared-sample-indexes-plate.csv)
- id: astrocyte
type: select
choices:
- [ 'true', 'true' ]
required: true
default: 'true'
description: |
Ensure configuration for astrocyte.
# -----------------------------------------------------------------------------
# SHINY APP CONFIGURATION
......
......@@ -4,7 +4,7 @@
Introduction
------------
This pipeline is a wrapper for the cellranger mkfastq tool from 10x Genomics (which uses Illumina's bcl2fastq). It takes demultiplexes samples from 10x Genomics Single Cell Gene Expression libraries into fastqs.
This pipeline is a wrapper for the cellranger mkfastq tool from 10x Genomics (which uses Illumina's bcl2fastq). It takes BCLs and demultiplexes samples from 10x Genomics Single Cell Gene Expression libraries into fastqs.
FastQC is run on the resulting fastq and those reports and bcl2fastq reports are collated with the MultiQC tool.
......@@ -29,8 +29,8 @@ To Run:
* Design example:
| Lane | Sample | Index |
|------|-------------|-----------|
| Lane | Sample | Index |
|------|-------------|----------|
| * | test_sample | SI-GA-C9 |
......
......@@ -8,31 +8,23 @@ aws {
}
process {
executor = 'awsbatch'
queue = 'default-3278a8b0-1fc8-11ea-b1ac-021e2396e2cc'
container = 'bicf/bicfbase:1.4'
executor = 'awsbatch'
container = 'bicf/bicfbase:2.0.0'
cpus = 1
memory = '1 GB'
withName:checkDesignFile {
container = 'bicf/python3:1.3'
cpus = 4
}
withName:mkfastq {
container = 'bicf/cellranger3.1.0:1.0'
cpus = 6
memory = '2 GB'
}
withName:fastqc {
container = 'bicf/fastqc:1.5'
cpus = 30
memory = '15 GB'
}
withName:versions {
container = 'bicf/python3:1.3'
cpus = 6
}
withName:multiqc {
container = 'bicf/multiqc:1.4'
}
}
\ No newline at end of file
}
workDir = 's3://gudmap.rbk/work'
aws.client.storageEncryption = 'AES256'
aws {
region = 'us-east-2'
batch {
cliPath = '/home/ec2-user/miniconda/bin/aws'
}
}
process {
executor = 'awsbatch'
queue = 'highpriority-3278a8b0-1fc8-11ea-b1ac-021e2396e2cc'
container = 'bicf/bicfbase:1.4'
cpus = 1
memory = '1 GB'
withName:checkDesignFile {
container = 'bicf/python3:1.3'
cpus = 4
}
withName:mkfastq {
container = 'bicf/cellranger3.1.0:1.0'
cpus = 6
memory = '2 GB'
}
withName:fastqc {
container = 'bicf/fastqc:1.5'
cpus = 30
memory = '15 GB'
}
withName:versions {
container = 'bicf/python3:1.3'
cpus = 6
}
withName:multiqc {
container = 'bicf/multiqc:1.4'
}
}
\ No newline at end of file
process {
executor = 'slurm'
queue = 'super'
clusterOptions = '--hold'
container = 'docker://bicf/bicfbase:1.4'
withName:checkDesignFile {
container = 'docker://bicf/python3:1.3'
executor = 'local'
}
withName:untarBCL {
queue = 'super'
}
withName:mkfastq {
container = 'docker://bicf/cellranger3.1.0:1.0'
queue = '128GB,256GB,256GBv1,384GB'
}
withName:countDesign {
executor = 'local'
}
withName:fastqc {
container = 'docker://bicf/fastqc:1.5'
queue = 'super'
}
withName:versions {
container = 'docker://bicf/python3:1.3'
executor = 'local'
}
withName:multiqc {
container = 'docker://bicf/multiqc:1.4'
executor = 'local'
}
}
singularity {
enabled = true
cacheDir = '/project/shared/bicf_workflow_ref/singularity_images/'
}
env {
http_proxy = 'http://proxy.swmed.edu:3128'
https_proxy = 'http://proxy.swmed.edu:3128'
all_proxy = 'http://proxy.swmed.edu:3128'
}
\ No newline at end of file
}
process {
executor = 'local'
container = 'docker://bicf/bicfbase:1.3'
withName:checkDesignFile {
container = 'docker://bicf/python3:1.3'
}
withName:mkfastq {
container = 'docker://bicf/cellranger3.1.0:1.0'
}
withName:fastqc {
container = 'docker://bicf/fastqc:1.5'
}
withName:versions {
container = 'docker://bicf/python3:1.3'
}
withName:multiqc {
container = 'docker://bicf/multiqc:1.4'
}
}
singularity {
enabled = true
cacheDir = '/project/shared/bicf_workflow_ref/singularity_images/'
}
env {
http_proxy = 'http://proxy.swmed.edu:3128'
https_proxy = 'http://proxy.swmed.edu:3128'
all_proxy = 'http://proxy.swmed.edu:3128'
}
\ No newline at end of file
process {
executor = 'slurm'
queue = 'super'
clusterOptions = '--hold'
withName:trackStart {
executor = 'local'
}
withName:checkDesignFile {
executor = 'local'
}
withName:untarBCL {
queue = 'super'
}
withName:mkfastq {
queue = '128GB,256GB,256GBv1,384GB'
}
withName:countDesign {
executor = 'local'
}
withName:fastqc {
queue = 'super'
}
withName:versions {
executor = 'local'
}
withName:multiqc {
executor = 'local'
}
}
process {
executor = 'local'
}
process {
queue = 'highpriority-3278a8b0-1fc8-11ea-b1ac-021e2396e2cc'
}
process {
queue = 'default-3278a8b0-1fc8-11ea-b1ac-021e2396e2cc'
}
......@@ -12,6 +12,8 @@ main.nf
params.name = "run"
params.bcl = "${baseDir}/../test_data/simple1/*.tar.gz"
params.designFile = "${baseDir}/../test_data/simple1/cellranger-tiny-bcl-simple-1_2_0.csv"
params.mask = ""
params.astrocyte = false
params.outDir = "${baseDir}/output"
// Define list of files
......@@ -22,10 +24,12 @@ bclCount = Channel
.count()
// Define regular variables
pipelineVersion = "2.x.x-indev"
name = params.name
designLocation = Channel
.fromPath(params.designFile)
.ifEmpty { exit 1, "design file not found: ${params.designFile}" }
mask = params.mask
outDir = params.outDir
// Define script files
......@@ -36,7 +40,7 @@ fastqcScript = Channel.fromPath("$baseDir/scripts/fastqc.sh")
versionsScript = Channel.fromPath("$baseDir/scripts/generate_versions.py")
referencesScript = Channel.fromPath("$baseDir/scripts/generate_references.py")
versions_pythonScript = Channel.fromPath("$baseDir/scripts/versions_python.sh")
versions_pigzScript = Channel.fromPath("$baseDir/scripts/versions_pigz.sh")
//versions_pigzScript = Channel.fromPath("$baseDir/scripts/versions_pigz.sh")
versions_cellrangerScript = Channel.fromPath("$baseDir/scripts/versions_cellranger.sh")
versions_bcl2fastqScript = Channel.fromPath("$baseDir/scripts/versions_bcl2fastq.sh")
versions_fastqcScript = Channel.fromPath("$baseDir/scripts/versions_fastqc.sh")
......@@ -46,6 +50,31 @@ multiqcConf = Channel.fromPath("${baseDir}/configs/multiqc_config.yaml")
references = Channel.fromPath("${baseDir}/../docs/references.md")
/*
* trackStart: track start of pipeline
*/
params.ci = false
process trackStart {
script:
"""
hostname
ulimit -a
export https_proxy=\${http_proxy}
curl -H 'Content-Type: application/json' -X PUT -d '{ \
"sessionId": "${workflow.sessionId}", \
"pipeline": "cellranger_mkfastq", \
"pipelineVersion": "${pipelineVersion}", \
"start": "${workflow.start}", \
"astrocyte": ${params.astrocyte}, \
"status": "started", \
"nextflowVersion": "${workflow.nextflow.version}",
"ci": ${params.ci}}' \
"https://xku43pcwnf.execute-api.us-east-1.amazonaws.com/ProdDeploy/pipeline-tracking"
"""
}
process checkDesignFile {
tag "${name}"
......@@ -58,13 +87,14 @@ process checkDesignFile {
output:
file("design.checked.csv") into designPaths
file("design.checked.csv") into designCount
//file("version_pipeline.txt") into version_pipeline
//file("version_nextflow.txt") into version_nextflow
file("version_pipeline.txt") into version_pipeline
file("version_nextflow.txt") into version_nextflow
file("version_python.txt") into version_python
script:
"""
hostname
ulimit -u 16384
ulimit -a
noSpaceDesign=\$(echo "${designLocation}" | tr -d ' ')
if [[ "\${noSpaceDesign}" != "${designLocation}" ]]; then
......@@ -72,12 +102,13 @@ process checkDesignFile {
fi
python3 check_design.py -d \${noSpaceDesign}
bash versions_python.sh > version_python.txt
echo "${workflow.nextflow.version}" > version_nextflow.txt
echo "${pipelineVersion}" > version_pipeline.txt
"""
}
/* nextflow workflow version calls that aren't compatible with nextflow 0.31.0
/* nextflow workflow manifest version calls that aren't compatible with Astrocyte
echo "${workflow.manifest.version}" > version_pipeline.txt
echo "${workflow.nextflow.version}" > version_nextflow.txt
*/
process untarBCL {
......@@ -86,19 +117,20 @@ process untarBCL {
input:
file untarBCLScript
file versions_pigzScript
//file versions_pigzScript
each file(tar) from tarList
output:
file("*[!version_pigz.txt]") into bclPaths mode flatten
file("version_pigz.txt") into version_pigz
//file("version_pigz.txt") into version_pigz
script:
"""
hostname
ulimit -u 16384
ulimit -a
bash untarBCL.sh -t ${tar}
bash versions_pigz.sh > version_pigz.txt
#bash versions_pigz.sh > version_pigz.txt
"""
}
......@@ -126,8 +158,9 @@ process mkfastq {
script:
"""
hostname
ulimit -a
cellranger mkfastq --id=mkfastq_${bcl.simpleName} --run=${bcl} --csv=${design}
ulimit -u 16384
ulimit -a
cellranger mkfastq --id=mkfastq_${bcl.simpleName} --run=${bcl} --csv=${design} --ignore-dual-index ${mask}
mkdir fq
mkdir "fq/${bcl.simpleName}"
find . -name "*.fastq.gz" -exec cp {} fq/${bcl.simpleName}/ \\;
......@@ -155,6 +188,9 @@ if (bclCount.value == 1) {
script:
"""
hostname
ulimit -u 16384
ulimit -a
bash countDesign.sh
"""
......@@ -180,6 +216,7 @@ process fastqc {
script:
"""
hostname
ulimit -u 16384
ulimit -a
find *.fastq.gz -exec mv {} ${bcl}.{} \\;
bash fastqc.sh
......@@ -196,10 +233,10 @@ process versions {
input:
file versionsScript
file referencesScript
//file version_pipeline
//file version_nextflow
file version_pipeline
file version_nextflow
file version_python
file version_pigz
//file version_pigz
file version_cellranger
file version_bcl2fastq
file version_fastqc
......@@ -211,6 +248,7 @@ process versions {
script:
"""
hostname
ulimit -u 16384
ulimit -a
python3 generate_versions.py -f version_*.txt -o versions
python3 generate_references.py -r ${references} -o references
......@@ -236,6 +274,7 @@ process multiqc {
script:
"""
hostname
ulimit -u 16384
ulimit -a
export LC_ALL=C.UTF-8
export LANG=C.UTF-8
......
......@@ -2,15 +2,50 @@ profiles {
standard {
includeConfig 'configs/biohpc.config'
}
biohpc_local {
includeConfig 'configs/biohpc_local.config'
biohpc {
includeConfig 'configs/biohpc.config'
}
local {
includeConfig 'configs/local.config'
}
cluster {
includeConfig 'configs/cluster.config'
}
aws {
includeConfig 'configs/aws.config'
}
aws_ondemand {
includeConfig 'configs/aws_ondemand.config'
ondemand {
includeConfig 'configs/ondemand.config'
}
spot {
includeConfig 'configs/spot.config'
}
}
process {
withName:checkDesignFile {
container = 'docker://bicf/python3:2.0.0'
}
aws_spot {
includeConfig 'configs/aws_spot.config'
withName:untarBCL {
container = 'docker://bicf/bicfbase:2.0.0'
}
withName:mkfastq {
container = 'docker://bicf/cellranger3.1.0:2.0.0'
}
withName:fastqc {
container = 'docker://bicf/fastqc:2.0.0'
}
withName:versions {
container = 'docker://bicf/python3:2.0.0'
}
withName:multiqc {
container = 'docker://bicf/multiqc:2.0.0'
}
}
singularity {
enabled = true
cacheDir = '/project/shared/bicf_workflow_ref/singularity_images/'
}
trace {
......
......@@ -26,10 +26,10 @@ logger.propagate = False
logger.setLevel(logging.INFO)
SOFTWARE_REGEX = {
#'Pipeline': ['version_pipeline.txt', r"(\S+)"],
#'Nextflow': ['version_nextflow.txt', r"(\S+)"],
'Pipeline': ['version_pipeline.txt', r"(\S+)"],
'Nextflow': ['version_nextflow.txt', r"(\S+)"],
'python': ['version_python.txt', r"(\S+)"],
'pigz': ['version_pigz.txt', r"(\S+)"],
#'pigz': ['version_pigz.txt', r"(\S+)"],
'cellranger': ['version_cellranger.txt', r"(\S+)"],
'bcl2fastq': ['version_bcl2fastq.txt', r"(\S+)"],
'fastqc': ['version_fastqc.txt', r"(\S+)"],
......@@ -78,10 +78,10 @@ def main():
out_filename = output + '_mqc.yaml'
results = OrderedDict()
#results['Pipeline'] = '<span style="color:#999999;\">N/A</span>'
#results['Nextflow'] = '<span style="color:#999999;\">N/A</span>'
results['Pipeline'] = '<span style="color:#999999;\">N/A</span>'
results['Nextflow'] = '<span style="color:#999999;\">N/A</span>'
results['python'] = '<span style="color:#999999;\">N/A</span>'
results['pigz'] = '<span style="color:#999999;\">N/A</span>'
#results['pigz'] = '<span style="color:#999999;\">N/A</span>'
results['cellranger'] = '<span style="color:#999999;\">N/A</span>'
results['bcl2fastq'] = '<span style="color:#999999;\">N/A</span>'
results['fastqc'] = '<span style="color:#999999;\">N/A</span>'
......@@ -116,4 +116,4 @@ def main():
if __name__ == '__main__':
main()
\ No newline at end of file
main()
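The `SOFTWARE_REGEX` table in the diff above pairs each tool with a version file and a pattern, and `results` is pre-filled with an N/A placeholder. The rest of `generate_versions.py` is not shown here, so the following is only a sketch of how such a table could be applied; `collect_versions`, the plain `"N/A"` placeholder, and the three-entry table are assumptions for illustration.

```python
import re
import tempfile
from collections import OrderedDict
from pathlib import Path

# Assumed subset of the table in the diff: label -> [file name, pattern].
SOFTWARE_REGEX = OrderedDict([
    ("Pipeline", ["version_pipeline.txt", r"(\S+)"]),
    ("Nextflow", ["version_nextflow.txt", r"(\S+)"]),
    ("python", ["version_python.txt", r"(\S+)"]),
])


def collect_versions(directory):
    """Read each version file, if present, and keep the first regex match."""
    results = OrderedDict((name, "N/A") for name in SOFTWARE_REGEX)
    for name, (file_name, pattern) in SOFTWARE_REGEX.items():
        path = Path(directory) / file_name
        if not path.is_file():
            continue  # missing files keep the N/A placeholder
        match = re.search(pattern, path.read_text())
        if match:
            results[name] = match.group(1)
    return results


with tempfile.TemporaryDirectory() as tmp:
    # Mimic the files written by checkDesignFile in main.nf.
    Path(tmp, "version_pipeline.txt").write_text("2.x.x-indev\n")
    Path(tmp, "version_nextflow.txt").write_text("20.01.0\n")
    versions = collect_versions(tmp)
    print(dict(versions))
```

Pre-filling every key with a placeholder keeps the row order stable in the MultiQC report even when a version file is absent, which is why commenting out `pigz` in both the table and `results` removes its row entirely.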