Commit 32e9b6a5 authored by Felix Perez

Merge branch 'fpwork' into 'main'

Replace the shell block with a script block in the runSource process.

See merge request !8
parents df81fab0 68ed6324
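For context on the change: Nextflow's two block types differ in how variables are interpolated. A single-quoted `shell:` block reads Nextflow variables as `!{var}` and passes `$var` through to bash untouched, while a double-quoted `script:` block interpolates `$var` as a Nextflow (Groovy) variable, so bash variables must be escaped as `\$var`. A minimal illustration with a hypothetical `greet` process (not from this repo):

```groovy
// shell: block — Nextflow vars use !{...}; bash vars stay as plain $...
process greetShell {
    input: val name
    shell:
    '''
    msg="hello !{name}"
    echo "$msg"
    '''
}

// script: block — Nextflow vars use $...; bash vars must be escaped \$...
process greetScript {
    input: val name
    script:
    """
    msg="hello $name"
    echo "\$msg"
    """
}
```

This is why the diff rewrites `!{baseDir}` as `$baseDir` while escaping bash-side uses such as `$state` to `\$state`.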
@@ -101,7 +101,7 @@ ENV/
 # Mac OS
 .DS_Store
-# nextflow analysis folders/files
+# Nextflow analysis folders/files
 workflow/dag*.dot*
 workflow/trace*.txt*
 workflow/.nextflow/

@@ -42,11 +42,11 @@ minimum_astrocyte_version: '2.0.1'
 # The Nextflow version that requires to run this workflow. For old pipelines, which do not have this label
 # a default value of 0.31.0 will be assigned automatically. Please make sure the requested nextflow version is available
 # in the module list.
-nextflow_version: '0.31.0'
+nextflow_version: '22.04.5'
 # (Optional) The Nextflow config file to use for this workflow. If provided, the file should exist in workflow/configs
 nextflow_config: 'biohpc.config'
 # The container to use for this workflow, none/singularity. If omitted, the default value 'none' will be used.
-container: 'singularity'
+container: 'none'
 # The version of singularity to use. This is required if container == 'singularity'
 singularity_version: '3.9.9'
@@ -73,16 +73,8 @@ documentation_files:
 # A list of cluster environment modules that this workflow requires to run.
 # Specify versioned module names to ensure reproducability.
 workflow_modules:
-  - Test
-# A list of container images required to run this workflow.
-# Specify full path and version names to ensure reproducibility.
-# This keyword is required when 'container' is specified in Astrocyte 0.4.1 and above.
-# Singularity supports different registries, please specify the protocol to use.
-# Such as, "docker://", "shub://", "library://", etc. We encourage you to use the GitLab
-# container registry of BioHPC to save and manage your container images.
-#workflow_containers:
-#  - docker://git.biohpc.swmed.edu:5050/s219741/astrocyte-atac-source/atac:0.0.1
+  - 'python/3.8.x-anaconda'
+  - 'openjdk/18'
 # A list of parameters used by the workflow, defining how to present them,
 # options etc in the web interface. For each parameter:

@@ -8,7 +8,6 @@ This repo is used to wrap the existing ATAC-seq pipeline listed below (Runner),
 - The ATAC-seq Runner workflow, 'astrocyte-atac-runner] (https://git.biohpc.swmed.edu/s219741/astrocyte-atac-runner). This repo contains the original ATAC-seq pipeline developed by the ENCODE team.
 ## The ATAC-seq Runner workflow
-<!-- TODO: Fill out intro to ATAC-seq pipeline. Which commands do we use to run this pipeline? What will the NextFlow script use? (AS) (DONE)-->
 This pipeline is designed for automated end-to-end quality control and processing of ATAC-seq. The pipeline can be run end-to-end, starting from raw FASTQ files all the way to peak calling and signal track generation using a single caper submit command. One can also start the pipeline from intermediate stages (for example, using alignment files as input). The pipeline supports both single-end and paired-end data as well as replicated or non-replicated datasets. The outputs produced by the pipeline include 1) formatted HTML reports that include quality control measures specifically designed for ATAC-seq and DNase-seq data, 2) analysis of reproducibility, 3) stringent and relaxed thresholding of peaks, 4) fold-enrichment and pvalue signal tracks.

-/*
-singularity {
-    enabled = true
-    runOptions = ''
-    // Below connects the experimental atac container to BioHPC's Slurm job scheduler.
-    // runOptions = '\
-    //     --bind /cm/shared/apps/slurm/16.05.8,/etc/slurm,/cm/shared/apps/slurm/var/etc/,/usr/lib64/libreadline.so.6 \
-    //     --bind /usr/lib64/libhistory.so.6,/usr/lib64/libtinfo.so.5,/var/run/munge,/usr/lib64/libmunge.so.2 \
-    //     --bind /usr/lib64/libmunge.so.2.0.0,/cm/shared/apps/slurm/16.05.8/lib64/slurm/'
-    // Please do NOT use "--disable-cache" in this runOptions.
-    // Starting from version 2.0.0, the astrocyte_cli will clean up the cache automatically.
-    // runOptions = '--bind /vagrant:/vagrant' // Use this one for vagrant development env only
-    cacheDir = "$baseDir/images/singularity" // Singularity images specified in `workflow_containers` of astrocyte_pkg.yml will be saved to
-                                             // this folder automatically, before running the workflow. The images will be renamed as
-                                             // "NAME-TAG.img", e.g. ubuntu-latest.img, centos-centos8.img, r-4.1.1.img, etc.
-}*/
 process {
     executor = 'slurm'
     clusterOptions = '--hold --no-kill'
     queue = 'super'
+    beforeScript = 'ulimit -Ss unlimited'
     withName:runSource {
-        // Experimental containerized version of the caper software.
-        // container = 'docker://git.biohpc.swmed.edu:5050/s219741/astrocyte-atac-source/atac:0.0.1'
+        module = ['python/3.8.x-anaconda', 'openjdk/18']
        executor = 'local'
     }
 }
\ No newline at end of file
@@ -26,15 +26,14 @@ process runSource {
     output:
     file '*'
 
-    shell:
-    '''
-    export PATH="/cm/shared/apps/openjdk/18/jdk-18/bin:/cm/shared/apps/python/3.8.x-anaconda/condabin:/cm/shared/apps/python/3.8.x-anaconda/bin:$PATH"
-    python --version > python_version.txt
-    java -version 2> java_version_before.txt
+    script:
+    """
+    module load python/3.8.x-anaconda
+    module load openjdk/18
 
     # Enable the use of bash-specific conda commands in this shell.
-    eval "$(conda shell.bash hook)"
+    eval "\$(conda shell.bash hook)"
 
     # Create a temporary conda environment for caper.
     conda create -y -c bioconda -c defaults -c conda-forge --name astrocyte-atac-caper python=3.8.18
@@ -51,20 +50,20 @@ process runSource {
     caper --version > caper_version.txt
 
     # Launch the ATAC-seq leader job.
-    jobsubmit=$(caper hpc submit !{baseDir}/external_repo/astrocyte-atac-runner/atac.wdl -i !{inputJson} --singularity --leader-job-name atac-source)
+    jobsubmit=\$(caper hpc submit $baseDir/external_repo/astrocyte-atac-runner/atac.wdl -i $inputJson --singularity --leader-job-name atac-source)
 
     # Monitor the state of the leader job; if it enters the COMPLETED, FAILED, or CANCELLED state, then finish the workflow process.
-    state=$(bash !{baseDir}/scripts/checkJobState.sh "${jobsubmit}")
-    echo "Lead Job state check $(date) - State: $state" >> lead_job_check.txt
-    while [[ "$state" != *"COMPLETED"* ]] && [[ "$state" != *"FAILED"* ]] && [[ "$state" != *"CANCELLED"* ]]; do
+    state=\$(bash $baseDir/scripts/checkJobState.sh "\$jobsubmit")
+    echo "Lead Job state check \$(date) - State: \$state" >> lead_job_check.txt
+    while [[ "\$state" != *"COMPLETED"* ]] && [[ "\$state" != *"FAILED"* ]] && [[ "\$state" != *"CANCELLED"* ]]; do
         sleep 15
-        state=$(bash !{baseDir}/scripts/checkJobState.sh "${jobsubmit}")
-        echo "Lead Job state check $(date) - State: $state" >> lead_job_check.txt
+        state=\$(bash $baseDir/scripts/checkJobState.sh "\$jobsubmit")
+        echo "Lead Job state check \$(date) - State: \$state" >> lead_job_check.txt
     done
 
     # Deactivate the temporary caper conda environment and delete it.
     conda deactivate
     conda remove --name astrocyte-atac-caper --all
     rm -rf ~/.caper/
-    '''
+    """
 }
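In the new `script:` block every construct meant for bash is escaped with `\$`. Stripped of that escaping, the monitoring logic reduces to an ordinary bash polling loop. Below is a runnable sketch with the Slurm lookup stubbed out; `checkState` and the canned `states` sequence are hypothetical stand-ins for `scripts/checkJobState.sh` querying the scheduler:

```shell
# Hypothetical stand-in for scripts/checkJobState.sh: returns a fixed
# sequence of Slurm states so the loop terminates deterministically.
states=(PENDING RUNNING RUNNING COMPLETED)
i=0
checkState() {
    echo "${states[$i]}"
}

state=$(checkState)
echo "Lead Job state check $(date) - State: $state" >> lead_job_check.txt
# Keep polling until the leader job reaches a terminal Slurm state.
while [[ "$state" != *"COMPLETED"* ]] && [[ "$state" != *"FAILED"* ]] && [[ "$state" != *"CANCELLED"* ]]; do
    sleep 1   # the real block sleeps 15 s between checks
    i=$((i + 1))
    state=$(checkState)
    echo "Lead Job state check $(date) - State: $state" >> lead_job_check.txt
done
```

Glob-style `*"COMPLETED"*` matching is used because the state string may carry extra text around the state name.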
 #!/bin/bash
-# Get the jobID of the caper lead job from the input txt file.
+# Get the jobID of the caper lead job from the given input string.
 read -ra line <<< "$1"
 jobID=${line[3]}

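The helper assumes the submission message captured in `jobsubmit` is sbatch-style (e.g. `Submitted batch job 12345`): `read -ra` word-splits it into an array, and index 3 is the job ID. A runnable sketch of that extraction, with an illustrative sample message:

```shell
# Word-split an sbatch-style submission message and pull out the job ID.
# "Submitted batch job 12345" -> fields 0..3, job ID at index 3.
msg="Submitted batch job 12345"
read -ra line <<< "$msg"
jobID=${line[3]}
echo "$jobID"   # → 12345
```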