Skip to content
Snippets Groups Projects
Code owners
Assign users and groups as approvers for specific file changes. Learn more.
index.md 10.40 KiB

BICF ChIP-seq Analysis Workflow

Introduction

ChIP-seq Analysis is a bioinformatics best-practice analysis pipeline used for chromatin immunoprecipitation (ChIP-seq) data analysis.

The pipeline uses Nextflow, a bioinformatics workflow tool. It pre-processes raw data from FastQ inputs, aligns the reads and performs extensive quality-control on the results.

Report issues to the Bioinformatic Core Facility BICF

Pipeline Steps

  • There are 11 steps to the pipeline
    1. Check input files
    2. Trim adaptors TrimGalore!
    3. Aligned trimmed reads with bwa, and sorts/converts to bam with samtools
    4. Mark duplicates with Sambamba, and filter reads with samtools
    5. Quality metrics with deep tools
    6. Calculate cross-correlation using PhantomPeakQualTools
    7. Call peaks with MACS
    8. Calculate consensus peaks
    9. Annotate all peaks using ChipSeeker
    10. Calculate Differential Binding Activity with DiffBind (If more than 1 rep in more than 1 experiment)
    11. Use MEME-ChIP to find motifs in original peaks

Workflow Parameters

1. One or more input FASTQ files from a ChIP-seq expereiment and a design file with the link bewetwen the same file name and sample id (required) - Choose all ChIP-seq fastq files for analysis.
2. In single-end sequencing, the sequencer reads a fragment from only one end to the other, generating the sequence of base pairs. In paired-end reading it starts at one read, finishes this direction at the specified read length, and then starts another round of reading from the opposite end of the fragment. (Paired-end: True, Single-end: False) (required)
3. A design file listing sample id, fastq files, corresponding control id and additional information about the sample. 
genome - Choose a genomic reference (genome).
4. Reference species and genome used for alignment and subsequent analysis. (required)
5. Run differential peak analysis (required). Must have at least 2 replicates per experiment and at least 2 experiments.
6. Run motif calling (required). Top 600 peaks sorted by p-value.
7. Ensure configuraton for astrocyte. (required; always true)

Design file