Commit 4e731fe0 authored by Gervaise Henry

Merge branch '1-update' into 'develop'

Resolve "Migrate cellranger_count basics"

See merge request !2
parents f335209d 5dfa2268
3 merge requests: !5 Develop, !3 Develop, !2 Resolve "Migrate cellranger_count basics"
Pipeline #3346 passed with stages in 57 seconds
......@@ -24,6 +24,120 @@ wheels/
.installed.cfg
*.egg
# PyInstaller
# Created by https://www.gitignore.io/api/r,perl,macos,linux,python,windows
# Edit at https://www.gitignore.io/?templates=r,perl,macos,linux,python,windows
### Linux ###
*~
# temporary files which can be created if a process still has a handle open of a deleted file
.fuse_hidden*
# KDE directory preferences
.directory
# Linux trash folder which might appear on any partition or disk
.Trash-*
# .nfs files are created when an open file is removed but is still being accessed
.nfs*
### macOS ###
# General
.DS_Store
.AppleDouble
.LSOverride
# Icon must end with two \r
Icon
# Thumbnails
._*
# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent
# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk
### Perl ###
!Build/
.last_cover_stats
/META.yml
/META.json
/MYMETA.*
*.o
*.pm.tdy
*.bs
# Devel::Cover
cover_db/
# Devel::NYTProf
nytprof.out
# Dist::Zilla
/.build/
# Module::Build
_build/
Build
Build.bat
# Module::Install
inc/
# ExtUtils::MakeMaker
/blib/
/_eumm/
/*.gz
/Makefile
/Makefile.old
/MANIFEST.bak
/pm_to_blib
/*.zip
### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
......@@ -37,6 +151,7 @@ pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
......@@ -44,6 +159,7 @@ nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
......@@ -52,6 +168,7 @@ coverage.xml
# Django stuff:
*.log
local_settings.py
db.sqlite3
# Flask stuff:
instance/
......@@ -69,6 +186,10 @@ target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
.python-version
......@@ -84,6 +205,8 @@ celerybeat-schedule
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
......@@ -97,16 +220,95 @@ ENV/
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
### Python Patch ###
.venv/
### R ###
# History files
.Rhistory
.Rapp.history
# Session Data files
.RData
# Example code in package build process
*-Ex.R
# Output files from R CMD build
/*.tar.gz
# Output files from R CMD check
/*.Rcheck/
# RStudio files
.Rproj.user/
# produced vignettes
vignettes/*.html
vignettes/*.pdf
# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
.httr-oauth
# knitr and R markdown default cache directories
/*_cache/
/cache/
# Temporary files created by R markdown
*.utf8.md
*.knit.md
### R.Bookdown Stack ###
# R package: bookdown caching files
/*_files/
### Windows ###
# Windows thumbnail cache files
Thumbs.db
ehthumbs.db
ehthumbs_vista.db
# Dump file
*.stackdump
# Folder config file
[Dd]esktop.ini
# Recycle Bin used on file shares
$RECYCLE.BIN/
# Windows Installer files
*.cab
*.msi
*.msix
*.msm
*.msp
# Windows shortcuts
*.lnk
# End of https://www.gitignore.io/api/r,perl,macos,linux,python,windows
# nextflow analysis folders/files
/workflow/.nextflow/
/workflow/work
/workflow/output/design/
/workflow/output/bcl/
/workflow/output/fastq/
/test_data/*
/workflow/.nextflow/*
/workflow/work/*
/workflow/output/*
/.nextflow/*
/data/*
/work/*
/output/*
pipeline_trace*.txt*
.nextflow*.log*
report.html*
report*.html*
timeline*.html*
*~
!.gitkeep
before_script:
- module load astrocyte
- module load python/3.6.1-2-anaconda
- module load nextflow/0.27.6
- ln -s /project/shared/bicf_workflow_ref/workflow_testdata/cellranger_mkfastq/*tar.gz test_data/
- module load nextflow/0.31.1_Ignite
- mkdir test_data/simple
- ln -s /project/shared/bicf_workflow_ref/workflow_testdata/cellranger/cellranger_mkfastq/simple/* test_data/simple/
stages:
- integration
- astrocyte
- simple
astrocyte_check:
stage: astrocyte
script:
- astrocyte_cli check ../cellranger_mkfastq
simple_test:
stage: integration
stage: simple
script:
- nextflow run workflow/main.nf
- nextflow run workflow/main.nf --bcl test_data/simple/*.tar.gz --designFile test_data/simple/cellranger-tiny-bcl-simple-1_2_0.csv
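The updated `simple_test` job passes the BCL tarball glob and the design file explicitly on the command line. As a rough sketch only, the same invocation could be assembled from Python as an argument list; the function names `build_nextflow_command` and `render_command` are hypothetical and not part of this repository. Because the list is not passed through a shell, the glob reaches Nextflow unexpanded, on the assumption that `Channel.fromPath` in `main.nf` resolves it.

```python
import shlex


def build_nextflow_command(bcl_glob, design_file, main_script="workflow/main.nf"):
    """Assemble the argument list for the pipeline run (hypothetical helper).

    The option names --bcl and --designFile come from the CI job; everything
    else here is an illustrative assumption.
    """
    return [
        "nextflow", "run", main_script,
        "--bcl", bcl_glob,
        "--designFile", design_file,
    ]


def render_command(args):
    """Render the argument list as a shell-safe string for logging."""
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    cmd = build_nextflow_command(
        "test_data/simple/*.tar.gz",
        "test_data/simple/cellranger-tiny-bcl-simple-1_2_0.csv",
    )
    print(render_command(cmd))
```

The list returned by `build_nextflow_command` could then be handed to `subprocess.run(cmd, check=True)` on a machine where Nextflow is installed.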
|*master*|*develop*|
|:-:|:-:|
|[![Build Status](https://git.biohpc.swmed.edu/BICF/Astrocyte/cellranger_mkfastq/badges/master/build.svg)](https://git.biohpc.swmed.edu/BICF/Astrocyte/cellranger_mkfastq/commits/master)|[![Build Status](https://git.biohpc.swmed.edu/BICF/Astrocyte/cellranger_mkfastq/badges/develop/build.svg)](https://git.biohpc.swmed.edu/BICF/Astrocyte/cellranger_mkfastq/commits/develop)|
10x Genomics scRNA-Seq (cellranger) mkfastq Pipeline
========================================
Introduction
------------
This pipeline is a wrapper for the cellranger mkfastq tool from 10x Genomics, which is a wrapper for Illumina's bcl2fastq tool. It takes tarballed bcl files from 10x Genomics Single Cell Gene Expression libraries, and untar's and deconvolves them into fastq's.
This pipeline is a wrapper for the cellranger mkfastq tool from 10x Genomics. It takes bcl files from sequencing of 10x Genomics Single Cell Gene Expression libraries and deconvolutes the reads by the samples' barcodes.
The pipeline uses Nextflow, a bioinformatics workflow tool.
This pipeline is primarily used with a SLURM cluster on the BioHPC Cluster. However, the pipeline should be able to run on any system that Nextflow supports.
Additionally, the pipeline is designed to work with Astrocyte Workflow System using a simple web interface.
\ No newline at end of file
Additionally, the pipeline is designed to work with Astrocyte Workflow System using a simple web interface.
......@@ -111,5 +111,4 @@ vizapp_cran_packages:
# List of any Bioconductor packages, not provided by the modules,
# that must be made available to the vizapp
vizapp_bioc_packages:
- chipseq
vizapp_bioc_packages: []
File moved
# Astrocyte CellRanger 10x Workflow Package
10x Genomics scRNA-Seq (cellranger) mkfastq Pipeline
====================================================
## Workflow SOP
Introduction
------------
This pipeline is a wrapper for the cellranger count tool from 10x Genomics. It takes fastq files from 10x Genomics Single Cell Gene Expression libraries, performs alignment, filtering, barcode counting, and UMI counting. It uses the Chromium cellular barcodes to generate gene-barcode matrices, determine clusters, and perform gene expression analysis.
The pipeline uses Nextflow, a bioinformatics workflow tool.
Credits
-------
This workflow was developed jointly with the [Bioinformatic Core Facility (BICF), Department of Bioinformatics](http://www.utsouthwestern.edu/labs/bioinformatics/).
Please cite in publications: Pipeline was developed by BICF from funding provided by **Cancer Prevention and Research Institute of Texas (RP150596)**.
profiles {
standard {
includeConfig 'workflow/conf/biohpc.config'
}
}
......@@ -8,11 +8,12 @@ process {
executor = 'local'
}
$untarBCL {
cpus = 32
module = ['pigz/2.4']
queue = 'super'
}
$mkfastq {
module = ['cellranger/2.1.1', 'bcl2fastq/2.17.1.14']
cpus = 32
module = ['cellranger/3.0.2', 'bcl2fastq/2.19.1']
queue = 'super'
}
}
......
......@@ -6,32 +6,36 @@
// Define Input variables
params.bcl = "$baseDir/../test_data/*.tar.gz"
params.designFile = "$baseDir/../test_data/design.csv"
params.outDir = "$baseDir/output"
// Define List of Files
tarList = Channel.fromPath( params.bcl )
// Define regular variables
designLocation = Channel
.fromPath(params.designFile)
.ifEmpty { exit 1, "design file not found: ${params.designFile}" }
outDir = params.outDir
process checkDesignFile {
publishDir "$baseDir/output/design", mode: 'copy'
publishDir "$outDir/${task.process}", mode: 'copy'
input:
params.designFile
file designLocation
output:
file("design.csv") into designPaths
file("design.checked.csv") into designPaths
script:
"""
hostname
ulimit -a
module load python/3.6.1-2-anaconda
python3 $baseDir/scripts/check_design.py -d $params.designFile
python3 $baseDir/scripts/check_design.py -d $designLocation
"""
}
......@@ -39,7 +43,7 @@ process checkDesignFile {
process untarBCL {
tag "$tar"
publishDir "$baseDir/output/bcl", mode: 'copy'
publishDir "$outDir/${task.process}", mode: 'copy'
input:
......@@ -52,6 +56,8 @@ process untarBCL {
script:
"""
hostname
ulimit -a
module load pigz/2.4
tar -xvf $tar -I pigz
"""
......@@ -59,9 +65,9 @@ process untarBCL {
process mkfastq {
tag "${bcl.baseName}"
publishDir "$baseDir/output/fastq/${bcl.baseName}", mode: 'copy'
publishDir "$outDir/${task.process}", mode: 'copy'
input:
......@@ -75,8 +81,10 @@ process mkfastq {
script:
"""
module load cellranger/2.1.1
module load bcl2fastq/2.17.1.14
hostname
ulimit -a
module load cellranger/3.0.2
module load bcl2fastq/2.19.1
cellranger mkfastq --id="${bcl.baseName}" --run=$bcl --csv=$designPaths
"""
}
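The `publishDir` changes in this file replace three hard-coded output folders with a single pattern, `$outDir/${task.process}`. The sketch below compares the two layouts, assuming that `task.process` evaluates to the process name and that `outDir` keeps its default of `$baseDir/output`. The helper names and the example base directory are hypothetical.

```python
from pathlib import PurePosixPath

# Sub-folders used by the old publishDir directives in main.nf.
OLD_SUBDIRS = {"checkDesignFile": "design", "untarBCL": "bcl", "mkfastq": "fastq"}


def old_publish_dir(base_dir, process, bcl_base_name=None):
    """Old layout: fixed folder per process, plus one sub-folder per BCL for mkfastq."""
    path = PurePosixPath(base_dir) / "output" / OLD_SUBDIRS[process]
    if process == "mkfastq" and bcl_base_name:
        path = path / bcl_base_name
    return str(path)


def new_publish_dir(out_dir, process):
    """New layout: one folder per process name under the configurable outDir."""
    return str(PurePosixPath(out_dir) / process)


if __name__ == "__main__":
    base = "/example/workflow"  # illustrative stand-in for $baseDir
    for name in OLD_SUBDIRS:
        print(name, old_publish_dir(base, name, "run1"), new_publish_dir(base + "/output", name))
```

One consequence visible in the sketch is that `mkfastq` results no longer get a per-BCL sub-folder from `publishDir`; any separation between runs would have to come from the files that the process itself emits.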
......@@ -70,7 +70,7 @@ def main():
# Check design file
new_design_df = check_design_headers(design_df)
new_design_df.to_csv('design.csv', header=True, sep=',', index=False)
new_design_df.to_csv('design.checked.csv', header=True, sep=',', index=False)
if __name__ == '__main__':
main()
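This hunk shows only the output file being renamed to `design.checked.csv`; the body of `check_design_headers` is not part of the diff. For orientation, here is a minimal sketch of what a header check followed by that write could look like, using only the standard library. The required columns (`Lane`, `Sample`, `Index`) are an assumption based on the simple CSV layout that `cellranger mkfastq` accepts, and the function names are hypothetical, not the ones in `check_design.py`.

```python
import csv
import io

# Assumed column set: the "simple" CSV layout accepted by cellranger mkfastq.
REQUIRED_COLUMNS = ["Lane", "Sample", "Index"]


def check_headers_sketch(design_text):
    """Parse CSV text and fail if any assumed required column is missing."""
    reader = csv.DictReader(io.StringIO(design_text))
    header = list(reader.fieldnames or [])
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError("design file is missing columns: " + ", ".join(missing))
    return header, list(reader)


def write_checked_design(header, rows, out_path="design.checked.csv"):
    """Write the validated rows under the new output name used in this commit."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=header)
        writer.writeheader()
        writer.writerows(rows)
    return out_path


if __name__ == "__main__":
    text = "Lane,Sample,Index\n1,test_sample,SI-P03-C9\n"
    header, rows = check_headers_sketch(text)
    print(write_checked_design(header, rows))
```

Giving the checked file a name distinct from the input means the Nextflow `output:` declaration can only match the validated copy, not an input file that happens to be called `design.csv`.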