GUDMAP_RBK / RNA-seq
Commit 2b89cc74, authored 5 years ago by Gervaise Henry

Add align sampled data

Parent: 0f30eb62
2 merge requests: !37 (v0.0.1), !28 (Resolve "Move inference to start of pipeline")
Pipeline #6500 canceled with stages in 1 hour, 19 minutes, and 1 second

Showing 3 changed files with 67 additions and 5 deletions:
workflow/conf/biohpc.config: 6 additions, 0 deletions
workflow/nextflow.config: 6 additions, 0 deletions
workflow/rna-seq.nf: 55 additions, 5 deletions
workflow/conf/biohpc.config (+6 -0) @ 2b89cc74

@@ -12,6 +12,9 @@ process {
   withName:parseMetadata {
     executor = 'local'
   }
+  withName:getRefInfer {
+    executor = 'local'
+  }
   withName:getRef {
     executor = 'local'
   }
@@ -21,6 +24,9 @@ process {
   withName:downsampleData {
     executor = 'local'
   }
+  withName:alignSampleData {
+    queue = 'super'
+  }
   withName:alignData {
     queue = '256GB,256GBv1'
   }
workflow/nextflow.config (+6 -0) @ 2b89cc74

@@ -20,6 +20,9 @@ process {
   withName:parseMetadata {
     container = 'bicf/python3:1.3'
   }
+  withName:getRefInfer {
+    container = 'bicf/awscli:1.1'
+  }
   withName:getRef {
     container = 'bicf/awscli:1.1'
   }
@@ -29,6 +32,9 @@ process {
   withName:downsampleData {
     container = 'bicf/seqtk:2.0.0'
   }
+  withName:alignSampleData {
+    container = 'bicf/gudmaprbkaligner:2.0.0'
+  }
   withName:alignData {
     container = 'bicf/gudmaprbkaligner:2.0.0'
   }
workflow/rna-seq.nf (+55 -5) @ 2b89cc74

@@ -118,7 +118,7 @@ process getData {
 """
 }
-// Replicate raw fastqs for multiple process inputs
+// Replicate raw fastq's for multiple process inputs
 fastqs.into {
 fastqs_downsampleData
 fastqs_trimData
@@ -195,6 +195,7 @@ metadata.splitCsv(sep: ",", header: false).separate(
 // Replicate metadata for multiple process inputs
 endsManual.into {
 endsManual_downsampleData
+endsManual_alignSampleData
 endsManual_trimData
 endsManual_alignData
 endsManual_featureCounts
@@ -221,7 +222,7 @@ process getRefInfer {
 val referenceInfer
 output:
-tuple val ("${referenceInfer}"), path ("hisat2", type: 'dir'), path ("bed", type: 'dir'), path ("*.fna"), path ("*.gtf") into refInfer
+tuple val (referenceInfer), path ("hisat2", type: 'dir'), path ("bed", type: 'dir'), path ("*.fna"), path ("*.gtf") into refInfer
 path ("${repRID}.getRefInfer.{out,err}")
 script:
@@ -359,7 +360,7 @@ process trimData {
 hostname > ${repRID}.trimData.err
 ulimit -a >> ${repRID}.trimData.err
-#Trim fastqs using trim_galore
+#Trim fastq's using trim_galore
 if [ "${endsManual_trimData}" == "se" ]
 then
 echo "LOG: running trim_galore using single-end settings" >> ${repRID}.trimData.err
@@ -372,7 +373,7 @@ process trimData {
 """
 }
-// Replicate trimmed fastqs
+// Replicate trimmed fastq's
 fastqsTrim.into {
 fastqsTrim_downsampleData
 fastqsTrim_alignData
@@ -390,7 +391,8 @@ process downsampleData {
 path fastq from fastqsTrim_downsampleData
 output:
-path ("sampled.{1,2}.fq") into fastqsSample
+path ("sampled.1.fq") into fastqs1Sample
+path ("sampled.2.fq") optional true into fastqs2Sample
 path ("${repRID}.downsampleData.{out,err}")
 script:
@@ -413,6 +415,54 @@ process downsampleData {
 """
 }
+// Replicate the downsampled fastq's and attach them to the references
+inferInput = endsManual_alignSampleData.combine(refInfer.combine(fastqs1Sample.collect().combine(fastqs2Sample.collect())))
+/*
+ * alignSampleData: aligns the downsampled reads to a reference database
+ */
+process alignSampleData {
+tag "${ref}"
+publishDir "${logsDir}", mode: "copy", pattern: "${repRID}.alignSampleData.{out,err}"
+input:
+tuple val (ends), val (ref), path (hisat2), path (bed), path (fna), path (gtf), path (fastq1), path (fastq2) from inferInput
+output:
+tuple val (ref), path ("sampled.sorted.bam"), path ("sampled.sorted.bam.bai"), path (bed) into sampleBam
+path ("*.alignSampleSummary.txt") into alignSampleQC
+path ("${repRID}.alignSampleData.{out,err}")
+script:
+"""
+hostname > ${repRID}.alignSampleData.err
+ulimit -a >> ${repRID}.alignSampleData.err
+#Align the reads with Hisat 2
+if [ "${ends}" == "se" ]
+then
+echo "LOG: running Hisat2 with single-end settings" >> ${repRID}.alignSampleData.err
+hisat2 -p `nproc` --add-chrname -S sampled.sam -x hisat2/genome -U ${fastq1} --summary-file ${repRID}.alignSampleSummary.txt --new-summary 1>> ${repRID}.alignSampleData.out 2>> ${repRID}.alignSampleData.err
+elif [ "${ends}" == "pe" ]
+then
+echo "LOG: running Hisat2 with paired-end settings" >> ${repRID}.alignSampleData.err
+hisat2 -p `nproc` --add-chrname -S sampled.sam -x hisat2/genome --no-mixed --no-discordant -1 ${fastq1} -2 ${fastq2} --summary-file ${repRID}.alignSampleSummary.txt --new-summary 1>> ${repRID}.alignSampleData.out 2>> ${repRID}.alignSampleData.err
+fi
+#Convert the output sam file to a sorted bam file using Samtools
+echo "LOG: converting from sam to bam" >> ${repRID}.alignSampleData.err
+samtools view -1 -@ `nproc` -F 4 -F 8 -F 256 -o sampled.bam sampled.sam 1>> ${repRID}.alignSampleData.out 2>> ${repRID}.alignSampleData.err;
+#Sort the bam file using Samtools
+echo "LOG: sorting the bam file" >> ${repRID}.alignSampleData.err
+samtools sort -@ `nproc` -O BAM -o sampled.sorted.bam sampled.bam 1>> ${repRID}.alignSampleData.out 2>> ${repRID}.alignSampleData.err;
+#Index the sorted bam using Samtools
+echo "LOG: indexing sorted bam file" >> ${repRID}.alignSampleData.err
+samtools index -@ `nproc` -b sampled.sorted.bam sampled.sorted.bam.bai 1>> ${repRID}.alignSampleData.out 2>> ${repRID}.alignSampleData.err;
+"""
+}
 /*
  * alignData: aligns the reads to a reference database
  */
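
Note on the alignSampleData script: the single-end/paired-end branch can be checked without hisat2 installed. The sketch below is not part of the commit. It uses a hypothetical helper, hisat2_read_args, that only prints the read-related arguments each branch passes to hisat2, as they appear in the script. The final else branch is an assumption added for illustration. The committed script has no such branch, so an unexpected ends value would skip hisat2 and leave no sampled.sam for the samtools steps.

```shell
# Hypothetical helper (not in the pipeline): print the read-related hisat2
# arguments that the alignSampleData script selects for each value of "ends".
hisat2_read_args() {
  ends="$1"
  fastq1="$2"
  fastq2="${3:-}"
  if [ "${ends}" = "se" ]; then
    # single-end: one unpaired read file
    echo "-U ${fastq1}"
  elif [ "${ends}" = "pe" ]; then
    # paired-end: both mates, no mixed or discordant alignments
    echo "--no-mixed --no-discordant -1 ${fastq1} -2 ${fastq2}"
  else
    # assumption: fail loudly instead of silently skipping the alignment
    echo "unknown ends value: ${ends}" >&2
    return 1
  fi
}

hisat2_read_args se sampled.1.fq               # prints: -U sampled.1.fq
hisat2_read_args pe sampled.1.fq sampled.2.fq  # prints: --no-mixed --no-discordant -1 sampled.1.fq -2 sampled.2.fq
```

Failing on an unrecognized value is one possible design choice. Whether the pipeline should stop or fall back to a default there is left to the authors.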