Commit c018e25c authored by Venkat Malladi

Remove split study from repository. Moved to independent repository. Close #87.

parent 1ecf02a3
2 merge requests: !58 Develop, !57 Prep of 1.0.0 release
@@ -80,12 +80,6 @@ To generate you own references or new references:
Download the [reference creation script](https://git.biohpc.swmed.edu/gudmap_rbk/rna-seq/-/snippets/31).
This script will auto create human and mouse references from GENCODE. It can also create ERCC92 spike-in references as well as concatenate them to GENCODE references automatically. In addition, it can create references from manually downloaded FASTA and GTF files.
To run a set of replicates from study RID:
------------------------------------------
Run in repo root dir:
* `sh workflow/scripts/splitStudy.sh [studyRID]`
It will run in parallel, at most 5 replicate RIDs at a time, with 30-second delays between launches.\
NOTE: Nextflow "local" processes for all replicates will run on the node/machine the bash script is launched from... consider running the study script on the BioHPC's SLURM cluster (use `sbatch`).
Errors:
-------
......
#!/usr/bin/env python3

import argparse
import pandas as pd
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)


def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-s', '--studyRID',
                        help="The study RID.", required=True)
    args = parser.parse_args()
    return args


def main():
    args = get_args()
    studyRID = pd.read_json(args.studyRID+"_studyRID.json")
    if studyRID["RID"].count() > 0:
        studyRID["RID"].to_csv(
            args.studyRID+"_studyRID.csv", header=False, index=False)
    else:
        raise Exception("No associated replicates found: %s" %
                        studyRID)


if __name__ == '__main__':
    main()
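For readers without pandas at hand, the extraction step of the removed script can be approximated with the standard library alone. This is a minimal sketch, not the script's actual implementation: it swaps `pd.read_json` for `json.loads` and assumes the query response is a JSON array of objects that each carry a `"RID"` key. The function names `extract_rids` and `rids_to_csv` are hypothetical.

```python
import csv
import io
import json


def extract_rids(json_text):
    # Assumption: the response is a JSON array of objects with a "RID" key.
    records = json.loads(json_text)
    rids = [rec["RID"] for rec in records if rec.get("RID")]
    if not rids:
        raise ValueError("No associated replicates found")
    return rids


def rids_to_csv(rids):
    # One RID per line with no header, mirroring
    # to_csv(header=False, index=False) in the removed script.
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for rid in rids:
        writer.writerow([rid])
    return buf.getvalue()
```

The CSV text returned by `rids_to_csv` would then be written to the `_studyRID.csv` file that the launcher script reads line by line.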
#!/bin/bash
#SBATCH -p super
#SBATCH --job-name GUDMAP-RBK_Study
#SBATCH -t 7-0:0:0
# query GUDMAP/RBK for study RID
curl --location --request GET "https://www.gudmap.org/ermrest/catalog/2/entity/RNASeq:Replicate/Study_RID=${1}" > "${1}_studyRID.json"
# extract replicate RIDs
module load python/3.6.4-anaconda
python3 ./workflow/scripts/split_study.py -s $1
# run pipeline on replicate RIDs in parallel
module load nextflow/20.01.0
module load singularity/3.5.3
while read repRID; do echo ${repRID}; sleep 30; done < "$1_studyRID.csv" | \
  xargs -P 5 -I {} nextflow -q run workflow/rna-seq.nf \
    --repRID {} \
    --source production \
    --deriva /project/BICF/BICF_Core/shared/gudmap/test_data/auth/credential.json \
    --bdbag /project/BICF/BICF_Core/shared/gudmap/test_data/auth/cookies.txt \
    --dev false \
    --upload true \
    --email gervaise.henry@utsouthwestern.edu \
    -with-report ./output/{}_report.html \
    -with-timeline ./output/{}_timeline.html
# cleanup study RID files
rm $1_studyRID.json
#rm $1_studyRID.csv
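The launch pattern in the removed shell script (one replicate RID emitted every 30 seconds, with `xargs -P 5` capping concurrency at five) can be sketched in Python as follows. This is an illustration under stated assumptions rather than a port of the script: `build_command` is a hypothetical helper that passes only a trimmed-down subset of the original flags, and the `runner` parameter exists so the scheduling logic can be exercised without Nextflow installed.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def build_command(rep_rid):
    # Hypothetical, trimmed-down command line; the removed script passes
    # additional flags (credentials, e-mail, report paths).
    return ["nextflow", "-q", "run", "workflow/rna-seq.nf", "--repRID", rep_rid]


def run_replicate(rep_rid):
    # Run one pipeline instance and return its exit code.
    return subprocess.run(build_command(rep_rid)).returncode


def launch_staggered(rep_rids, runner=run_replicate, max_parallel=5, delay=30):
    """Submit one run per RID, `delay` seconds apart, at most `max_parallel` at once."""
    futures = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        for i, rep_rid in enumerate(rep_rids):
            if i > 0:
                time.sleep(delay)  # stagger submissions
            futures[rep_rid] = pool.submit(runner, rep_rid)
    # Leaving the "with" block waits for every submitted run to finish.
    return {rid: fut.result() for rid, fut in futures.items()}
```

As with the shell version, every run started by `run_replicate` executes on the machine that calls `launch_staggered`, so the same caveat about Nextflow "local" processes applies.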