Venkat Malladi authored
BICF ChIP-seq Analysis Workflow
Introduction
ChIP-seq Analysis is a bioinformatics best-practice pipeline for analyzing chromatin immunoprecipitation sequencing (ChIP-seq) data.
The pipeline uses Nextflow, a bioinformatics workflow tool. It pre-processes raw data from FASTQ inputs, aligns the reads, and performs extensive quality control on the results.
Report issues to the Bioinformatics Core Facility (BICF).
Pipeline Steps
There are 11 steps in the pipeline:
- Check input files
- Trim adapters with TrimGalore!
- Align trimmed reads with BWA, then sort and convert to BAM with samtools
- Mark duplicates with Sambamba, and filter reads with samtools
- Compute quality metrics with deepTools
- Calculate cross-correlation using PhantomPeakQualTools
- Call peaks with MACS
- Calculate consensus peaks
- Annotate all peaks using ChIPseeker
- Calculate differential binding activity with DiffBind (if there is more than 1 replicate in more than 1 experiment)
- Use MEME-ChIP to find motifs in original peaks
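The motif step works on a subset of peaks: as noted under Workflow Parameters, it takes the top 600 peaks sorted by p-value. A minimal Python sketch of that selection follows. The `Peak` class, its field names, and the `top_peaks` helper are hypothetical and are not taken from the pipeline's code. The sketch also assumes raw p-values, where smaller means more significant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    """A called peak; field names are hypothetical, not the pipeline's."""
    chrom: str
    start: int
    end: int
    p_value: float  # raw p-value; smaller means more significant


def top_peaks(peaks, n=600):
    """Return up to n peaks, ordered from most to least significant."""
    return sorted(peaks, key=lambda p: p.p_value)[:n]


# Toy data for illustration only.
peaks = [
    Peak("chr1", 100, 300, 1e-5),
    Peak("chr1", 900, 1200, 1e-12),
    Peak("chr2", 50, 400, 1e-8),
]
best = top_peaks(peaks, n=2)
print([(p.chrom, p.start) for p in best])
```

If the peak file stores significance as -log10(p-value) instead, the sort order would need to be reversed.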
Workflow Parameters
1. One or more input FASTQ files from a ChIP-seq experiment, plus a design file that links each file name to its sample ID (required) - Choose all ChIP-seq FASTQ files for analysis.
2. In single-end sequencing, the sequencer reads a fragment from one end only. In paired-end sequencing, it reads from one end up to the specified read length, then starts a second round of reading from the opposite end of the fragment. (Paired-end: True, Single-end: False) (required)
3. A design file listing the sample ID, FASTQ files, corresponding control ID, and additional information about the sample.
4. Reference species and genome used for alignment and subsequent analysis. (required) - Choose a genomic reference (genome).
5. Run differential peak analysis (required). Must have at least 2 replicates per experiment and at least 2 experiments.
6. Run motif calling (required). Uses the top 600 peaks sorted by p-value.
7. Ensure configuration for astrocyte. (required; always true)
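Two of the requirements above can be checked before launching a run: every FASTQ file named in the design file should be among the inputs, and differential peak analysis needs at least 2 experiments with at least 2 replicates each. The Python sketch below illustrates both checks under stated assumptions. The column names (`sample_id`, `fastq_read1`, `experiment_id`), the tab-delimited layout, and the one-row-per-replicate convention are hypothetical and may not match the real design file.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; the real design file may use different headers.
SAMPLE, FASTQ, EXPERIMENT = "sample_id", "fastq_read1", "experiment_id"


def read_design(handle):
    """Parse a tab-delimited design file into a list of row dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def missing_fastqs(rows, provided):
    """Return FASTQ names listed in the design but absent from the inputs."""
    return sorted({r[FASTQ] for r in rows} - set(provided))


def can_run_differential(rows):
    """True if there are >= 2 experiments and each has >= 2 replicates.

    Assumes each design row is one replicate of its experiment.
    """
    reps = Counter(r[EXPERIMENT] for r in rows)
    return len(reps) >= 2 and all(n >= 2 for n in reps.values())


# Toy design file for illustration only.
design_text = (
    "sample_id\tfastq_read1\texperiment_id\n"
    "A1\tA1.fastq.gz\texpA\n"
    "A2\tA2.fastq.gz\texpA\n"
    "B1\tB1.fastq.gz\texpB\n"
    "B2\tB2.fastq.gz\texpB\n"
)
rows = read_design(io.StringIO(design_text))
print(missing_fastqs(rows, ["A1.fastq.gz", "A2.fastq.gz", "B1.fastq.gz"]))
print(can_run_differential(rows))
```

In this toy example the first call reports `B2.fastq.gz` as missing, and the second confirms that both experiments have two replicates.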