diff --git a/README.md b/README.md
index 3be132add5e564245493866d6165cb0f80d9c753..a284aaa5729ffec03689048a7412a22f2d437135 100644
--- a/README.md
+++ b/README.md
@@ -12,7 +12,7 @@
 ## Introduction
-BICF ChIPseq is a bioinformatics best-practice analysis pipeline used for ChIP-seq (chromatin immunoprecipitation sequencing) data analysis at [BICF](http://www.utsouthwestern.edu/labs/bioinformatics/) at [UT Southwestern Dept. of Bioinformatics](http://www.utsouthwestern.edu/departments/bioinformatics/).
+BICF ChIPseq is a bioinformatics best-practice analysis pipeline used for ChIP-seq (chromatin immunoprecipitation sequencing) data analysis at [BICF](http://www.utsouthwestern.edu/labs/bioinformatics/) at [UT Southwestern Department of Bioinformatics](http://www.utsouthwestern.edu/departments/bioinformatics/).
 The pipeline uses [Nextflow](https://www.nextflow.io), a bioinformatics workflow tool. It pre-processes raw data from FastQ inputs, aligns the reads and performs extensive quality-control on the results.
@@ -45,14 +45,13 @@ $ git clone git@git.biohpc.swmed.edu:BICF/Astrocyte/chipseq_analysis.git
 9. fastq_read2 name of fastq file 2 for PE data
- + See [HERE](https://git.biohpc.swmed.edu/BICF/Astrocyte/chipseq_analysis/blob/master/test_data/design_ENCSR729LGA_PE.txt) for an example design file, paired-end
+ See [HERE](test_data/design_ENCSR729LGA_PE.txt) for an example design file, paired-end
- + See [HERE](https://git.biohpc.swmed.edu/BICF/Astrocyte/chipseq_analysis/blob/master/test_data/design_ENCSR238SGC_SE.txt) for an example design file, single-end
+ + See [HERE](test_data/design_ENCSR238SGC_SE.txt) for an example design file, single-end
 ##### 3) Bash Script
 + You will need to create a bash script to run the CHIPseq pipeline on [BioHPC](https://portal.biohpc.swmed.edu/content/)
 + This pipeline has been optimized for the correct partition
- + See [HERE](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/CHIPseq.sh) for an example bash script
+ + See [HERE](docs/CHIPseq.sh) for an example bash script
 + The parameters that must be specified are:
 - --reads '/path/to/files/name.fastq.gz'
 - --designFile '/path/to/file/design.txt',
@@ -74,7 +73,7 @@ $ git clone git@git.biohpc.swmed.edu:BICF/Astrocyte/chipseq_analysis.git
 10. Calculate Differential Binding Activity
 11. Motif Search Peaks
-See [FLOWCHART](https://git.biohpc.swmed.ed/bchen4/chipseq_analysis/raw/master/docs/flowchar.pdf)
+See [FLOWCHART](docs/flowchar.pdf)
 ## Output Files
 Folder | File | Description
@@ -97,7 +96,7 @@ experimentQC | heatmeap_SpearmanCorr.pdf | plot of Spearman correlation between
 experimentQC | heatmeap_PearsonCorr.pdf | plot of Pearson correlation between samples
 experimentQC | sample_mbs.npz | array of multiple BAM summaries
 crossReads | *.filt.nodup.tagAlign.15.tagAlign.gz.cc.plot.pdf | plot of cross-correlation to assess signal-to-noise ratios
-crossReads | *.filt.nodup.tagAlign.15.tagAlign.gz.cc.qc | cross-correlation metrics. File [HEADER](https://git.biohpc.swmed.ed/bchen4/chipseq_analysis/raw/master/docs/xcor_header.txt)
+crossReads | *.filt.nodup.tagAlign.15.tagAlign.gz.cc.qc | cross-correlation metrics. File [HEADER](docs/xcor_header.txt)
 callPeaksMACS | *.fc_signal.bw | bigwig data file; raw fold enrichment of sample/control
 callPeaksMACS | *.pvalue_signal.bw | bigwig data file; sample/control signal adjusted for pvalue significance
 callPeaksMACS | *_peaks.narrowPeak | peaks file; see [HERE](https://genome.ucsc.edu/FAQ/FAQformat.html#format12) for ENCODE narrowPeak header format
@@ -130,26 +129,26 @@ diffPeaks | *_diffbind.csv | Use only for replicated samples; CSV file of peaks
 If you find an error, please let the [BICF](mailto:BICF@UTSouthwestern.edu) know and we will add it here.
 ## Programs and Versions
- + python/3.6.1-2-anaconda [website](https://www.anaconda.com/download/#linux) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + trimgalore/0.4.1 [website](https://github.com/FelixKrueger/TrimGalore) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + cutadapt/1.9.1 [website](https://cutadapt.readthedocs.io/en/stable/index.html) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + bwa/intel/0.7.12 [website](http://bio-bwa.sourceforge.net/) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + samtools/1.6 [website](http://samtools.sourceforge.net/) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + sambamba/0.6.6 [website](http://lomereiter.github.io/sambamba/) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + bedtools/2.26.0 [website](https://bedtools.readthedocs.io/en/latest/) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + deeptools/2.5.0.1 [website](https://deeptools.readthedocs.io/en/develop/) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + phantompeakqualtools/1.2 [website](https://github.com/kundajelab/phantompeakqualtools) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + macs/2.1.0-20151222 [website](http://liulab.dfci.harvard.edu/MACS/) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + UCSC_userApps/v317 [website](https://genome.ucsc.edu/util.html) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + R/3.3.2-gccmkl [website](https://www.r-project.org/) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + meme/4.11.1-gcc-openmpi [website](http://meme-suite.org/doc/install.html?man_type=web) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + ChIPseeker [website](https://bioconductor.org/packages/release/bioc/html/ChIPseeker.html) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
- + DiffBind [website](https://bioconductor.org/packages/release/bioc/html/DiffBind.html) [citation](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt)
+ + python/3.6.1-2-anaconda [website](https://www.anaconda.com/download/#linux) [citation](docs/references.txt)
+ + trimgalore/0.4.1 [website](https://github.com/FelixKrueger/TrimGalore) [citation](docs/references.txt)
+ + cutadapt/1.9.1 [website](https://cutadapt.readthedocs.io/en/stable/index.html) [citation](docs/references.txt)
+ + bwa/intel/0.7.12 [website](http://bio-bwa.sourceforge.net/) [citation](docs/references.txt)
+ + samtools/1.6 [website](http://samtools.sourceforge.net/) [citation](docs/references.txt)
+ + sambamba/0.6.6 [website](http://lomereiter.github.io/sambamba/) [citation](docs/references.txt)
+ + bedtools/2.26.0 [website](https://bedtools.readthedocs.io/en/latest/) [citation](docs/references.txt)
+ + deeptools/2.5.0.1 [website](https://deeptools.readthedocs.io/en/develop/) [citation](docs/references.txt)
+ + phantompeakqualtools/1.2 [website](https://github.com/kundajelab/phantompeakqualtools) [citation](docs/references.txt)
+ + macs/2.1.0-20151222 [website](http://liulab.dfci.harvard.edu/MACS/) [citation](docs/references.txt)
+ + UCSC_userApps/v317 [website](https://genome.ucsc.edu/util.html) [citation](docs/references.txt)
+ + R/3.3.2-gccmkl [website](https://www.r-project.org/) [citation](docs/references.txt)
+ + meme/4.11.1-gcc-openmpi [website](http://meme-suite.org/doc/install.html?man_type=web) [citation](docs/references.txt)
+ + ChIPseeker [website](https://bioconductor.org/packages/release/bioc/html/ChIPseeker.html) [citation](docs/references.txt)
+ + DiffBind [website](https://bioconductor.org/packages/release/bioc/html/DiffBind.html) [citation](docs/references.txt)
 ## Credits
 This example worklow is derived from original scripts kindly contributed by the Bioinformatic Core Facility ([BICF](https://www.utsouthwestern.edu/labs/bioinformatics/)), in the [Department of Bioinformatics](https://www.utsouthwestern.edu/departments/bioinformatics/).
 ## Citation
-Please cite individual programs and versions used [HERE](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/references.txt). Also, please look out for our pipeline to be published in the future [HERE](https://zenodo.org/).
+Please cite individual programs and versions used [HERE](docs/references.txt). Also, please look out for our pipeline to be published in the future [HERE](https://zenodo.org/).