diff --git a/README.md b/README.md
index 0cbc9d47de6d7ba639749a8353f6fa121c825022..1a1b42a8995c330b044fbd37ad89aabc1193a471 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,7 @@
 [](https://www.nextflow.io/)
 [](https://astrocyte-test.biohpc.swmed.edu/static/docs/index.html)
+[](https://doi.org/10.5281/zenodo.2648845)
 
 ## Introduction
 
diff --git a/docs/index.md b/docs/index.md
index b6c66cec174ba4aeca2216c39f18dda3bfa89010..bb05e57ec7d5f5378ea35df6af8c55b5783528a4 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,4 +1,4 @@
-# Astrocyte ChIPseq analysis Workflow Package
+# BICF ChIP-seq Analysis Workflow
 ## Introduction
 
 **ChIP-seq Analysis** is a bioinformatics best-practice analysis pipeline used for chromatin immunoprecipitation (ChIP-seq) data analysis.
@@ -7,16 +7,16 @@ The pipeline uses [Nextflow](https://www.nextflow.io), a bioinformatics workflow
 
 ### Pipeline Steps
 
-1) Trim adaptors TrimGalore!
-2) Align with BWA
-3) Filter reads with Sambamba S
-4) Quality control with DeepTools
-5) Calculate Cross-correlation using SPP and PhantomPeakQualTools
-6) Signal profiling using MACS2
-7) Call consenus peaks
-8) Annotate all peaks using ChipSeeker
-9) Use MEME-ChIP to find motifs in original peaks
-10) Find differential expressed peaks using DiffBind (If more than 1 experiment)
+ 1) Trim adaptors with TrimGalore!
+ 2) Align with BWA
+ 3) Filter reads with Sambamba S
+ 4) Quality control with DeepTools
+ 5) Calculate cross-correlation using SPP and PhantomPeakQualTools
+ 6) Signal profiling using MACS2
+ 7) Call consensus peaks
+ 8) Annotate all peaks using ChIPseeker
+ 9) Use MEME-ChIP to find motifs in original peaks
+ 10) Find differentially expressed peaks using DiffBind (if more than one experiment)
 
 ## Workflow Parameters
 
@@ -25,41 +25,35 @@ The pipeline uses [Nextflow](https://www.nextflow.io), a bioinformatics workflow
 pairedEnd - Choose True/False if data is paired-end
 design - Choose the file with the experiment design information. TSV format
 genome - Choose a genomic reference (genome).
+ skipDiff - Choose True/False if you want to run Differential Peaks
+ skipMotif - Choose True/False if you want to run Motif Calling
 
 ## Design file
 
- The following columns are necessary, must be named as in template. An design file template can be downloaded [HERE](https://git.biohpc.swmed.edu/bchen4/chipseq_analysis/raw/master/docs/design_example.csv)
-
- SampleID
- The id of the sample. This will be the header in output files, please make sure it is concise
- Tissue
- Tissue of the sample
- Factor
- Factor of the experiment
- Condition
- This is the group that will be used for pairwise differential expression analysis
- Replicate
- Replicate id
- Peaks
- The file name of the peak file for this sample
- bamReads
- The file name of the IP BAM for this sample
- bamControl
- The file name of the control BAM for this sample
- ContorlID
- The id of the control sample
- PeakCaller
- The peak caller used
-
+ The following columns are necessary and must be named as in the template. A design file template can be downloaded [HERE](https://git.biohpc.swmed.edu/BICF/Astrocyte/chipseq_analysis/blob/master/docs/design_example.txt)
+
+ sample_id
+ The id of the sample. This will be the name used in output files; please make sure it is concise and informative.
+ experiment_id
+ The id of the experiment. Used for grouping replicates.
+ biosample
+ The name of the biological sample.
+ factor
+ Factor of the experiment.
+ treatment
+ Treatment used in the experiment.
+ replicate
+ Replicate number.
+ control_id
+ The sample_id of the control used for this sample.
+ fastq_read1
+ File name of the fastq file; if paired-end, this is read1.
+ fastq_read2
+ File name of read2 (for paired-end); not needed for single-end data.
 
 
 ### Credits
 
-This example worklow is derived from original scripts kindly contributed by the Bioinformatic Core Facility (BICF), Department of Bioinformatics
-
-### References
+This workflow was developed jointly with the [Bioinformatic Core Facility (BICF), Department of Bioinformatics](http://www.utsouthwestern.edu/labs/bioinformatics/)
 
-* ChipSeeker: http://bioconductor.org/packages/release/bioc/html/ChIPseeker.html
-* DiffBind: http://bioconductor.org/packages/release/bioc/html/DiffBind.html
-* Deeptools: https://deeptools.github.io/
-* MEME-ChIP: http://meme-suite.org/doc/meme-chip.html
+Please cite in publications: Pipeline was developed by BICF from funding provided by **Cancer Prevention and Research Institute of Texas (RP150596)**.
diff --git a/docs/references.md b/docs/references.md
index a5eba7df9025ddbd0ca98ee0a1b7c0e89f8c9f2d..ea99ec7cb1a6356dfea0062f965f418d4cc58b15 100644
--- a/docs/references.md
+++ b/docs/references.md
@@ -50,3 +50,8 @@
 
 16. **MultiQc**:
  * Ewels P., Magnusson M., Lundin S. and Käller M. 2016. MultiQC: Summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32(19): 3047–3048. doi:[10.1093/bioinformatics/btw354 ](https://dx.doi.org/10.1093/bioinformatics/btw354)
+
+17. **BICF ChIP-seq Analysis Workflow**:
+ * Venkat S. Malladi and Beibei Chen. (2019). BICF ChIP-seq Analysis Workflow (publish_1.0.0). Zenodo. doi:[10.5281/zenodo.2648845](https://doi.org/10.5281/zenodo.2648845)
+
+Please cite in publications: Pipeline was developed by BICF from funding provided by **Cancer Prevention and Research Institute of Texas (RP150596)**.