Commit 00743aaa authored by Gervaise Henry

Merge branch 'develop' into 'master'

Develop

See merge request !81
parents d51028bf 0d36671f
Pipeline #9758 passed with stages in 55 minutes and 32 seconds
@@ -1091,35 +1091,38 @@ aws:
    - export AWS_ACCESS_KEY_ID=${aws_accesskeyid}
    - export AWS_SECRET_ACCESS_KEY=${aws_secretaccesskey}
    - aws configure set region ${aws_region}
-   - aws s3 cp ./test_data/auth/ s3://bicf-output/ci-env/auth/ --exclude "*" --include "c*" --recursive
-   - aws s3 cp ./test_data/fastq/xsmall/ s3://bicf-output/ci-env/input/ --exclude "*" --include "Q-Y5F6_10K.R*.fastq.gz" --recursive
+   - aws s3 cp ./test_data/auth/ s3://bicf-nf-output/ci-env/auth/ --exclude "*" --include "c*" --recursive
+   - aws s3 cp ./test_data/fastq/xsmall/ s3://bicf-nf-output/ci-env/input/ --exclude "*" --include "Q-Y5F6_10K.R*.fastq.gz" --recursive
    - >
      id=$(aws batch submit-job\
        --job-name nf-GUDMAP_RBK_ci-env\
        --job-queue default-bicf\
-       --job-definition nextflow-bicf-nextflow\
+       --job-definition nextflow-nf\
        --container-overrides command=$(envsubst < ./docs/nxf_aws-ci-test.json))
      id=$(echo ${id}| grep -oP "jobId\K.*" | tr -d '"' | tr -d ":" | tr -d " " | tr -d "}")
    - >
      status=$(aws batch describe-jobs --jobs ${id} | grep -oP "status\": \K.*" | tr -d '"' | tr -d ',' | tr -d " " ) &&
      until [[ "${status}" == "SUCCEEDED" || "${status}" == "FAILED" ]]; do
        status=$(aws batch describe-jobs --jobs ${id} | grep -oP "status\": \K.*" | tr -d '"' | tr -d ',' | tr -d " " ) &&
-       echo ${status} &&
-       sleep 5m
+       echo ${status}
+       if [ "${status}" != "SUCCEEDED" ] && [ "${status}" != "FAILED" ]; then
+         sleep 1m
+       fi
      done
    - >
      if [ "${status}" == "SUCCEEDED" ]; then
        curl --request GET https://img.shields.io/badge/Envronment%3A%20AWS-run%20succesful-success?style=flat > ./badges/env/aws.svg
      else
        curl --request GET https://img.shields.io/badge/Envronment%3A%20AWS-run%20failed-critical?style=flat > ./badges/env/aws.svg
        exit 1
      fi
  after_script:
    - module load awscli/1.11.139
    - export AWS_ACCESS_KEY_ID=${aws_accesskeyid}
    - export AWS_SECRET_ACCESS_KEY=${aws_secretaccesskey}
    - aws configure set region ${aws_region}
-   - aws s3 rm s3://bicf-output/ci-env/auth/ --recursive
-   - aws s3 rm s3://bicf-output/ci-env/input/ --recursive
+   - aws s3 rm s3://bicf-nf-output/ci-env/auth/ --recursive
+   - aws s3 rm s3://bicf-nf-output/ci-env/input/ --recursive
  artifacts:
    when: always
    paths:
......
@@ -75,6 +75,7 @@ FULL EXAMPLE:
nextflow run workflow/rna-seq.nf --repRID Q-Y5JA --source production --deriva ./data/credential.json --bdbag ./data/cookies.txt --dev false --upload true -profile biohpc
```
<hr>
Cloud Compatibility:
--------------------
This pipeline is also capable of being run on AWS and DNAnexus. To do so:
@@ -88,6 +89,7 @@ This pipeline is also capable of being run on AWS and DNAnexus. To do so:
}
```
This is required when `nextflow run` or `nextflow pull` is pointed directly at the git repo, and also for use in AWS or DNAnexus environments, since both use `nextflow run` directly against that repo. To get around this requirement, there is a clone of the repo hosted on [GitHub](https://github.com/utsw-bicf/gudmap_rbk.rna-seq) which can be used... but the currency of that clone cannot be guaranteed!
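For example, a run pointed directly at the GitHub clone (rather than a local checkout) might look like the following; the replicate ID and credential paths here are placeholder values:
```
nextflow run utsw-bicf/gudmap_rbk.rna-seq -r v2.0.0 \
  --repRID Q-Y5F6 \
  --deriva ./data/credential.json \
  --bdbag ./data/cookies.txt \
  --upload false \
  -profile biohpc
```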
### [AWS](https://aws.amazon.com/)
* Build an AWS Batch queue and environment, either manually or with a template such as [Genomics Workflows on AWS](https://docs.opendata.aws/genomics-workflows/)
* The user must have awscli configured with appropriate authentication (with `aws configure` and access keys) in the environment in which nextflow
@@ -106,6 +108,7 @@ This pipeline is also capable of being run on AWS and DNAnexus. To do so:
--job-definition [Job Definition]\
--container-overrides command=$(envsubst < ./docs/nxf_aws-ci-test.json)
```
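Once submitted, the job can be polled until it reaches a terminal state. A minimal sketch of such a status loop (mirroring what the CI does, with `[Job ID]` standing in for the id returned by `submit-job`):
```
# poll the AWS Batch job status until it succeeds or fails;
# [Job ID] is a placeholder for the id returned by submit-job
status=$(aws batch describe-jobs --jobs [Job ID] | grep -oP '"status": "\K[^"]*' | head -1)
until [[ "${status}" == "SUCCEEDED" || "${status}" == "FAILED" ]]; do
  sleep 1m
  status=$(aws batch describe-jobs --jobs [Job ID] | grep -oP '"status": "\K[^"]*' | head -1)
done
echo ${status}
```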
### [DNAnexus](https://dnanexus.com/) (utilizes the [DNAnexus extension package for Nextflow (XPACK-DNANEXUS)](https://github.com/seqeralabs/xpack-dnanexus))
* Follow the instructions from [XPACK-DNANEXUS](https://github.com/seqeralabs/xpack-dnanexus) about installing and authenticating (a valid license must be available for the extension package from Seqera Labs, as well as a subscription with DNAnexus)
* Follow the instructions from [XPACK-DNANEXUS](https://github.com/seqeralabs/xpack-dnanexus) about launching runs. A template *json* file has been included ([dnanexusExample.json](docs/dnanexusExample.json))
@@ -122,14 +125,17 @@ This pipeline is also capable of being run on AWS and DNAnexus. To do so:
--instance-type mem1_ssd1_v2_x16 \
--input-json "$(envsubst < ./docs/nxf_dnanexus-ci-test.json)"
```
### NOTE:
* File locations used in cloud deployments (auth files and output folder) need to be accessible in that environment (e.g. an S3 location, or a DNAnexus location). Local paths cannot be read from those environments (see the example below).
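For example, an AWS launch would pass S3 URIs for all of these locations; the bucket and prefixes below are placeholders:
```
nextflow run utsw-bicf/gudmap_rbk.rna-seq -profile aws \
  --deriva s3://[bucket]/auth/credential.json \
  --bdbag s3://[bucket]/auth/cookies.txt \
  --outDir s3://[bucket]/output \
  --repRID Q-Y5F6
```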
<hr>
To generate your own or new references:
---------------------------------------
Download the [reference creation script](https://git.biohpc.swmed.edu/gudmap_rbk/rna-seq/-/snippets/31).
This script automatically creates human and mouse references from GENCODE. It can also create ERCC92 spike-in references and concatenate them to the GENCODE references. In addition, it can create references from manually downloaded FASTA and GTF files.
<hr>
Errors:
-------
Errors reported back to the data-hub are listed below (they aren't thrown on the command line by the pipeline, but rather are submitted, if `--upload true`, to the data-hub for that replicate in the execution run submission):
......
params {
refSource = "aws"
}
workDir = 's3://gudmap-rbk.output/work'
aws.client.storageEncryption = 'AES256'
aws {
region = 'us-east-2'
batch {
cliPath = '/home/ec2-user/miniconda/bin/aws'
}
}
process {
executor = 'awsbatch'
cpus = 1
memory = '1 GB'
withName:trackStart {
cpus = 1
memory = '1 GB'
}
withName:getBag {
cpus = 1
memory = '1 GB'
}
withName:getData {
cpus = 1
memory = '1 GB'
}
withName:parseMetadata {
cpus = 15
memory = '1 GB'
}
withName:trimData {
cpus = 20
memory = '2 GB'
}
withName:getRefInfer {
cpus = 1
memory = '1 GB'
}
withName:downsampleData {
cpus = 1
memory = '1 GB'
}
withName:alignSampleData {
cpus = 50
memory = '5 GB'
}
withName:inferMetadata {
cpus = 5
memory = '1 GB'
}
withName:checkMetadata {
cpus = 1
memory = '1 GB'
}
withName:getRef {
cpus = 1
memory = '1 GB'
}
withName:alignData {
cpus = 50
memory = '10 GB'
}
withName:dedupData {
cpus = 5
memory = '20 GB'
}
withName:countData {
cpus = 2
memory = '5 GB'
}
withName:makeBigWig {
cpus = 15
memory = '5 GB'
}
withName:fastqc {
cpus = 1
memory = '1 GB'
}
withName:dataQC {
cpus = 15
memory = '2 GB'
}
withName:aggrQC {
cpus = 2
memory = '1 GB'
}
withName:uploadInputBag {
cpus = 1
memory = '1 GB'
}
withName:uploadExecutionRun {
cpus = 1
memory = '1 GB'
}
withName:uploadQC {
cpus = 1
memory = '1 GB'
}
withName:uploadProcessedFile {
cpus = 1
memory = '1 GB'
}
withName:uploadOutputBag {
cpus = 1
memory = '1 GB'
}
withName:finalizeExecutionRun {
cpus = 1
memory = '1 GB'
}
withName:failPreExecutionRun {
cpus = 1
memory = '1 GB'
}
withName:failExecutionRun {
cpus = 1
memory = '1 GB'
}
withName:uploadQC_fail {
cpus = 1
memory = '1 GB'
}
}
params {
refSource = "biohpc"
}
process {
executor = 'slurm'
queue = 'super'
clusterOptions = '--hold'
time = '4h'
errorStrategy = 'retry'
maxRetries = 1
withName:trackStart {
executor = 'local'
}
withName:getBag {
executor = 'local'
}
withName:getData {
queue = 'super'
}
withName:parseMetadata {
executor = 'local'
}
withName:trimData {
queue = 'super'
}
withName:getRefInfer {
queue = 'super'
}
withName:downsampleData {
executor = 'local'
}
withName:alignSampleData {
queue = '128GB,256GB,256GBv1,384GB'
}
withName:inferMetadata {
queue = 'super'
}
withName:checkMetadata {
executor = 'local'
}
withName:getRef {
queue = 'super'
}
withName:alignData {
queue = '256GB,256GBv1'
}
withName:dedupData {
queue = 'super'
}
withName:countData {
queue = 'super'
}
withName:makeBigWig {
queue = 'super'
}
withName:fastqc {
queue = 'super'
}
withName:dataQC {
queue = 'super'
}
withName:aggrQC {
executor = 'local'
}
withName:uploadInputBag {
executor = 'local'
}
withName:uploadExecutionRun {
executor = 'local'
}
withName:uploadQC {
executor = 'local'
}
withName:uploadProcessedFile {
executor = 'local'
}
withName:uploadOutputBag {
executor = 'local'
}
withName:finalizeExecutionRun {
executor = 'local'
}
withName:failPreExecutionRun {
executor = 'local'
}
withName:failExecutionRun {
executor = 'local'
}
withName:uploadQC_fail {
executor = 'local'
}
}
singularity {
enabled = true
cacheDir = '/project/BICF/BICF_Core/shared/gudmap/singularity_cache/'
}
env {
http_proxy = 'http://proxy.swmed.edu:3128'
https_proxy = 'http://proxy.swmed.edu:3128'
all_proxy = 'http://proxy.swmed.edu:3128'
}
process {
executor = 'slurm'
queue = '256GB,256GBv1,384GB,128GB'
clusterOptions = '--hold'
}
singularity {
enabled = true
cacheDir = '/project/BICF/BICF_Core/shared/gudmap/singularity_cache/'
}
env {
http_proxy = 'http://proxy.swmed.edu:3128'
https_proxy = 'http://proxy.swmed.edu:3128'
all_proxy = 'http://proxy.swmed.edu:3128'
}
process {
queue = 'highpriority-0ef8afb0-c7ad-11ea-b907-06c94a3c6390'
}
["utsw-bicf/gudmap_rbk.rna-seq","-r","v2.0.0","-profile","aws","--deriva","s3://bicf-output/ci-env/auth/credential.json","--bdbag","s3://bicf-output/ci-env/auth/cookies.txt","--repRID","Q-Y5F6","--source","staging","--upload","false","--dev","false","--ci","true","--track","false","-with-report","s3://bicf-output/ci-env/output/Q-Y5F6_fastqoverride_report.html","--refSource","datahub","--outDir","s3://bicf-output/ci-env/output/Q-Y5F6_fastqoverride","--fastqsForce","s3://bicf-output/ci-env/input/*.fastq.gz"]
["utsw-bicf/gudmap_rbk.rna-seq","-r","v2.0.0","-profile","aws","--deriva","s3://bicf-nf-output/ci-env/auth/credential.json","--bdbag","s3://bicf-nf-output/ci-env/auth/cookies.txt","--repRID","Q-Y5F6","--source","staging","--upload","false","--dev","false","--ci","true","--refSource","datahub","--outDir","s3://bicf-nf-output/ci-env/output/Q-Y5F6_fastqoverride","--fastqsForce","s3://bicf-nf-output/ci-env/input/*.fastq.gz"]
\ No newline at end of file
@@ -106,22 +106,6 @@ process {
  }
}
-trace {
-  enabled = false
-  file = 'trace.txt'
-  fields = 'task_id,native_id,process,name,status,exit,submit,start,complete,duration,realtime,%cpu,%mem,rss'
-}
-timeline {
-  enabled = false
-  file = 'timeline.html'
-}
-report {
-  enabled = false
-  file = 'report.html'
-}
-tower {
-  accessToken = '3ade8f325d4855434b49aa387421a44c63e3360f'
-  enabled = true
......
@@ -4,12 +4,10 @@ params {
process {
  withName:trackStart {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
  withName:getBag {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
@@ -18,7 +16,6 @@ process {
    memory = '32 GB'
  }
  withName:parseMetadata {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
@@ -35,7 +32,6 @@ process {
    memory = '32 GB'
  }
  withName:seqwho {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
@@ -44,7 +40,6 @@ process {
    memory = '32 GB'
  }
  withName:downsampleData {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
@@ -61,7 +56,6 @@ process {
    memory = '32 GB'
  }
  withName:checkMetadata {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
@@ -86,52 +80,42 @@ process {
    memory = '32 GB'
  }
  withName:aggrQC {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
  withName:uploadInputBag {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
  withName:uploadExecutionRun {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
  withName:uploadQC {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
  withName:uploadProcessedFile {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
  withName:uploadOutputBag {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
  withName:finalizeExecutionRun {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
  withName:failPreExecutionRun {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
  withName:failExecutionRun {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
  withName:uploadQC_fail {
-   executor = 'local'
    cpus = 1
    memory = '1 GB'
  }
......
@@ -14,10 +14,6 @@ then
    sleep 15
  done
fi
-if [ "${validate}" != "is valid" ]
-then
-  exit 1
-fi
count=$(find */ -name "*[_.]R[1-2].fastq.gz" | wc -l)
for i in $(find */ -name "*[_.]R[1-2].fastq.gz")
do
@@ -25,3 +21,7 @@ do
  cp ${i} ./${path}
done
echo ${count}
+if [ "${validate}" != "is valid" ]
+then
+  exit 1
+fi