update documentation

Commit f75b89bf authored 8 years ago by Brandi Cantarel
--- a/astrocyte_pkg.yml
+++ b/astrocyte_pkg.yml
@@ -16,10 +16,7 @@ email: 'biohpc-help@utsouthwestern.edu'
 title: 'BICF RNASeq Variant Analysis Workflow'
 # A summary of the workflow package in plain text
 description: |
-  This is a workflow package for the BICF RNASeq Germline Variant workflow system.
-  It implements a simple germline variant analysis workflow using TrimGalore, HiSAT,
-  Speedseq, GATK, Samtools and FeatureCount. SNPs and Indels are integrated using BAYSIC;
-  then annotated using SNPEFF and SnpSift.
+	 THIS WORKFLOW IS OBSOLETE!  The Main BICF workflow includes variant analysis and differential expression analysis as one easy to use workflow.

 # -----------------------------------------------------------------------------
 # DOCUMENTATION

--- a/docs/index.md
+++ b/docs/index.md
 # Astrocyte Germline Variant Calling Workflow Package

-This workflow carries out a Germline Exome Analysis pipeline, including the integration of variants from various callers and basic annotation.
-
-    1) RNA Alignments are then recalibrated and realigned using GATK3 (DePristo et al 2011;McKenna et al 2010)
-
-    2) To detect genome germline variants, GATK3 (DePristo et al 2011, McKenna et al 2010), Platypus (Rimmer et al 2014), Samtools version 1.3 and FreeBayes version 0.9.7 (Garrison and Marth 2012) are used. 
-
-    3) Integration of predicted SNPs and INDELs from these algorithms is performed using BAYSIC (Cantarel et al 2014).
-
-    4) Effect of SNPs and INDELs on genes is predicted using snpEff (Cingolani et al 2012) using the gencode gene annotations. For GRCH38 Only: allele frequency in the general population is determined by comparison to ExAC (The ExAC Consortium 2015). Additionally for this build, discovered variants are annotated using SnpSift (Cingolani et al 2012) using the dbSNP, COSMIC (Forbes et al. 2009), CLINVAR (Landrum et al 2014), GWAS Catalog (Welter et al 2014) and dbNSFP (Liu et al 2011) databases.
-
-    5) Features (genes, transcripts and exons) are counted using featureCounts (Liao et al 2014) using the Gencode feature table(Harrow et al. 2012)
-
-##Workflow Parameters
-
-    rnabam - Choose the alignments of your RNASeq data (generated by RNASEq Differential Expression Pipeline).
- 
-    dnabam - Choose the bamfiles from genomic data that should be used for gene fusion
-
-    genome - Choose a genomic reference (genome).
-
-    pairs - Choose if pair-ended or single-end sequences
-
-    incdna - Choose whether GeneFusion analysis should include evidence from genomic data from the same sample
-
-    design - This file matches the fastq files to data about the sample
-
- The following columns are necessary, must be named as in template and can be in any order:
-
-    SampleID
-        This ID should match the name in the fastq file ie S0001.R1.fastq.gz the sample ID is S0001
-    SampleName
-        This ID can be the identifier of the researcher or clinician
-    SubjectID
-        Used in order to link samples from the same patient
-    Phenotype
-	2= Case or Diseaes Phenotype, 1= Healthy Control
-    Gender
-	1=male, 2=female
-    FullPathToFqR1
-	Name of the fastq file R1
-    FullPathToFqR2
-	Name of the fastq file R2
-
-There are some optional columns that might help with the analysis:
-      SequenceRun
-      Organism
-      FamilyID
-      CellPopulation
-      Treatment
-      GeneticFeature (WT or KO)
-      Race
-      Ethnicity
-      Age
-      
-
-### Test Data
-
-
-### Credits
-This example worklow is derived from original scripts kindly contributed by the Bioinformatic Core Facility (BICF), Department of Bioinformatics and Clinical Sciences.
-
-### References
-
-    Andy Rimmer, Hang Phan, Iain Mathieson, Zamin Iqbal, Stephen R. F. Twigg, WGS500 Consortium, Andrew O. M. Wilkie, Gil McVean, Gerton Lunter. Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications. Nature Genetics (2014) doi:10.1038/ng.3036
-    Bernstein, B. E., Birney, E., Dunham, I., Green, E. D., Gunter, C., & Snyder, M. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489, 57–74. doi:10.1038/nature11247
-    Cantarel, B. L., Weaver, D., McNeill, N., Zhang, J., Mackey, A. J., & Reese, J. (2014). BAYSIC: a Bayesian method for combining sets of genome variants with improved specificity and sensitivity. BMC Bioinformatics, 15, 104. doi:10.1186/1471-2105-15-104
-    Cingolani, P., Platts, A., Wang, L. L., Coon, M., Nguyen, T., Wang, L., ? Ruden, D. M. (2012). A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly. doi:10.4161/fly.19695
-    Cingolani, P., Patel, V. M., Coon, M., Nguyen, T., Land, S. J., Ruden, D. M., & Lu, X. (2012). Using Drosophila melanogaster as a Model for Genotoxic Chemical Mutational Studies with a New Program, SnpSift. Frontiers in Genetics. doi:10.3389/fgene.2012.00035
-    Challis, D., Yu, J., Evani, U. S., Jackson, A. R., Paithankar, S., Coarfa, C., ? Yu, F. (2012). An integrative variant analysis suite for whole exome next-generation sequencing data. BMC Bioinformatics. doi:10.1186/1471-2105-13-8
-    DePristo, M. A., Banks, E., Poplin, R., Garimella, K. V, Maguire, J. R., Hartl, C., ? Daly, M. J. (2011). A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics, 43, 491–498. doi:10.1038/ng.806
-    Exome Variant Server, NHLBI GO Exome Sequencing Project (ESP), Seattle, WA (URL: http://evs.gs.washington.edu/EVS/) [01 (01, 2016) accessed].
-    Forbes, S. A., Tang, G., Bindal, N., Bamford, S., Dawson, E., Cole, C., ? Futreal, P. A. (2009). COSMIC (the Catalogue of Somatic Mutations In Cancer): A resource to investigate acquired mutations in human cancer. Nucleic Acids Research, 38. doi:10.1093/nar/gkp995
-    Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv preprint arXiv:1207.3907 [q-bio.GN] 2012
-    Hansen NF, Gartner JJ, Mei L, Samuels Y, Mullikin JC. Shimmer: detection of genetic alterations in tumors using next-generation sequence data. Bioinformatics. 2013 Jun 15;29(12):1498-503. doi: 10.1093/bioinformatics/btt183. Epub 2013 Apr 24. PubMed PMID: 23620360; PubMed Central PMCID: PMC3673219.
-    Kim S, Jeong K, Bhutani K, Lee J, Patel A, Scott E, Nam H, Lee H, Gleeson JG, Bafna V. Virmid: accurate detection of somatic mutations with sample impurity inference. Genome Biol. 2013 Aug 29;14(8):R90. doi: 10.1186/gb-2013-14-8-r90. PubMed PMID: 23987214; PubMed Central PMCID: PMC4054681.
-    Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, Miller CA, Mardis ER, Ding L, Wilson RK. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 2012 Mar;22(3):568-76. doi: 10.1101/gr.129684.111. Epub 2012 Feb 2. PubMed PMID: 22300766; PubMed Central PMCID: PMC3290792.
-    Landrum, M. J., Lee, J. M., Riley, G. R., Jang, W., Rubinstein, W. S., Church, D. M., & Maglott, D. R. (2014). ClinVar: Public archive of relationships among sequence variation and human phenotype. Nucleic Acids Research, 42. doi:10.1093/nar/gkt1113
-    Li, H. (2013). Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv Preprint arXiv, 00, 3. doi:arXiv:1303.3997 [q-bio.GN]
-    Liu X, Jian X, Boerwinkle E. dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions. Hum Mutat. 2011 Aug;32(8):894-9. doi: 10.1002/humu.21517. PubMed PMID: 21520341
-    McKenna, A., Hanna, M., Banks, E., Sivachenko, A., Cibulskis, K., Kernytsky, A., ? DePristo, M. A. (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research, 20, 1297–1303. doi:10.1101/gr.107524.110
-    The 1000 Genome Consortium. An integrated map of genetic variation from 1,092 human genomes. (2012). Nature, 491(7422), 56–65. Retrieved from http://dx.doi.org/10.1038/nature11632.
-    Saunders CT, Wong WS, Swamy S, Becq J, Murray LJ, Cheetham RK. Strelka: accurate somatic small-variant calling from sequenced tumor-normal sample pairs. Bioinformatics. 2012 Jul 15;28(14):1811-7. doi: 10.1093/bioinformatics/bts271. Epub 2012 May 10. PubMed PMID: 22581179.
-    Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, Klemm A, Flicek P, Manolio T, Hindorff L, and Parkinson H. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Research, 2014, Vol. 42 (Database issue): D1001-D1006.
+THIS WORKFLOW IS OBSOLETE!  The Main BICF workflow includes variant analysis and differential expression analysis as one easy to use workflow.