The current version of the software and issue reports are at https://git.biohpc.swmed.edu/BICF/Astrocyte/chipseq_analysis
To download the current version of the software:
$ git clone git@git.biohpc.swmed.edu:BICF/Astrocyte/methylation_analysis.git
Input files
1) Fastq Files
- You will need the full path to the files for the Bash script
2) Design File
- The design file is a tab-delimited file with 8 columns for single-end (SE) data and 9 columns for paired-end (PE) data. Letters, numbers, and underscores can be used in the names. However, names must begin with a letter. Columns must be as follows:
- sample_id: a short, unique, and concise name used to label output files; will be used as a control_id if it is the control sample
- experiment_id: biosample_treatment_factor; the same name is given to all replicates of a treatment. Will be used for the consensus header.
- biosample: symbol for tissue type or cell line
- factor: symbol for antibody target
- treatment: symbol of treatment applied
- replicate: a number, usually from 1-3 (e.g. 1)
- control_id: sample_id name that is the control for this sample
- fastq_read1: name of fastq file 1 for SE or PE data
- fastq_read2: name of fastq file 2 for PE data
- See HERE for an example paired-end design file
- See HERE for an example single-end design file
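The design file rules above can be checked before submitting a run. The following is a minimal sketch, not part of the pipeline itself. It assumes the design file has a header row containing the column names listed above, and that the naming rule (letters, numbers, and underscores, beginning with a letter) applies to the `sample_id`, `experiment_id`, and `control_id` columns. The function name `check_design` is hypothetical.

```python
import csv
import io

import re

# Column names as listed in this README; fastq_read2 is only present for PE data.
SE_COLUMNS = [
    "sample_id", "experiment_id", "biosample", "factor",
    "treatment", "replicate", "control_id", "fastq_read1",
]
PE_COLUMNS = SE_COLUMNS + ["fastq_read2"]

# Letters, numbers, and underscores only; the first character must be a letter.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Assumption: the naming rule applies to these identifier columns.
NAME_COLUMNS = ("sample_id", "experiment_id", "control_id")


def check_design(handle):
    """Return a list of problems found in a tab-delimited design file."""
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    if not rows:
        return ["design file is empty"]
    header, records = rows[0], rows[1:]
    # Assumption: the first row is a header with exactly the 8 (SE) or 9 (PE) names.
    if header not in (SE_COLUMNS, PE_COLUMNS):
        return ["unexpected header: " + ", ".join(header)]

    problems = []
    seen_ids = set()
    for number, row in enumerate(records, start=1):
        if len(row) != len(header):
            problems.append("record %d: expected %d columns, found %d"
                            % (number, len(header), len(row)))
            continue
        record = dict(zip(header, row))
        for column in NAME_COLUMNS:
            if not NAME_PATTERN.fullmatch(record[column]):
                problems.append("record %d: invalid %s %r"
                                % (number, column, record[column]))
        if not record["replicate"].isdigit():
            problems.append("record %d: replicate %r is not a number"
                            % (number, record["replicate"]))
        if record["sample_id"] in seen_ids:
            problems.append("record %d: duplicate sample_id %r"
                            % (number, record["sample_id"]))
        seen_ids.add(record["sample_id"])

    # Every control_id must name a sample_id that appears in the file.
    control_index = header.index("control_id")
    for number, row in enumerate(records, start=1):
        if len(row) == len(header) and row[control_index] not in seen_ids:
            problems.append("record %d: control_id %r is not a sample_id"
                            % (number, row[control_index]))
    return problems


if __name__ == "__main__":
    # Tiny in-memory single-end design with made-up values, for illustration only.
    demo = "\t".join(SE_COLUMNS) + "\n" + "\t".join(
        ["1_bad", "exp_A", "cell", "factorA", "None", "1", "missing", "a.fastq.gz"]
    ) + "\n"
    for problem in check_design(io.StringIO(demo)):
        print(problem)
```

To check a file on disk, pass an open file handle, for example `check_design(open("design.tsv"))`, where `design.tsv` is a placeholder path.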
3) Bash Script
- You will need to create a Bash script to run the ChIP-seq pipeline on BioHPC
- This pipeline has been optimized for the correct partition
- See HERE for an example bash script
- The parameters that must be specified are:
Pipeline (Details output and steps)
Add flowchart See FLOWCHART
Output Files
Folder | File | Description |
---|---|---|
d |
Common Quality Control Metrics
- These are the files that should be reviewed before continuing with the ChIP-seq experiment. If your experiment fails any of these metrics, you should pause and re-evaluate whether the data should remain in the study.
- multiqcReport/multiqc_report.html: follow the ChIP-seq standards HERE.
- experimentQC/*_fingerprint.pdf: make sure the plot information is correct for your antibody/input. See HERE for more details.
- crossReads/*cc.plot.pdf: make sure your sample data has the correct signal intensity and location. See HERE for more details.
- crossReads/*.cc.qc: Column 9 (NSC) should be > 1.1 for experiment and < 1.1 for input. Column 10 (RSC) should be > 0.8 for experiment and < 0.8 for input. See HERE for more details.
- experimentQC/coverage.pdf, experimentQC/heatmeap_SpearmanCorr.pdf, experimentQC/heatmeap_PearsonCorr.pdf: See HERE for more details.
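The NSC and RSC thresholds above can be applied programmatically. The sketch below assumes each `.cc.qc` file holds a single tab-separated line with NSC in column 9 and RSC in column 10, as described above. The function name `check_cc_qc` and the `is_input` flag are hypothetical.

```python
# Thresholds as stated in this README.
NSC_THRESHOLD = 1.1
RSC_THRESHOLD = 0.8


def check_cc_qc(line, is_input=False):
    """Evaluate NSC and RSC from one line of a *.cc.qc file.

    Assumption: the line is tab-separated, with NSC in column 9
    and RSC in column 10 (1-based).
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected at least 10 tab-separated columns, found %d"
                         % len(fields))
    nsc = float(fields[8])  # column 9
    rsc = float(fields[9])  # column 10
    if is_input:
        # Input samples should fall below both thresholds.
        nsc_ok = nsc < NSC_THRESHOLD
        rsc_ok = rsc < RSC_THRESHOLD
    else:
        # Experiment samples should exceed both thresholds.
        nsc_ok = nsc > NSC_THRESHOLD
        rsc_ok = rsc > RSC_THRESHOLD
    return {"nsc": nsc, "rsc": rsc, "nsc_ok": nsc_ok, "rsc_ok": rsc_ok}
```

A sample whose `nsc_ok` or `rsc_ok` is `False` is a candidate for the re-evaluation described at the top of this section.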
Common Errors
If you find an error, please let the BICF know and we will add it here.
Citation
Please cite the individual programs and versions used (see HERE), and the pipeline doi:. In publications, please cite: Pipeline was developed by BICF with funding provided by the Cancer Prevention and Research Institute of Texas (RP150596).
Programs and Versions
Credits
This example workflow is derived from original scripts kindly contributed by the Bioinformatics Core Facility (BICF), in the Department of Bioinformatics.