methylation_analysis issues
https://git.biohpc.swmed.edu/BICF/Astrocyte/methylation_analysis/-/issues
Updated 2020-12-29T17:33:34-06:00

+ [Add back tags per process](https://git.biohpc.swmed.edu/BICF/Astrocyte/methylation_analysis/-/issues/5), 2020-12-29T17:33:34-06:00, Venkat Malladi, 1.0.0
+ [Add a changelog](https://git.biohpc.swmed.edu/BICF/Astrocyte/methylation_analysis/-/issues/3), 2020-12-29T17:33:43-06:00, Venkat Malladi, 1.0.0
+ [Update Readme](https://git.biohpc.swmed.edu/BICF/Astrocyte/methylation_analysis/-/issues/2), 2020-12-29T17:33:49-06:00, Venkat Malladi

Update readme to include and copy to docs
[![pipeline status](https://git.biohpc.swmed.edu/BICF/Astrocyte/methylation_analysis/badges/master/pipeline.svg)](https://git.biohpc.swmed.edu/BICF/Astrocyte/methylation_analysis/commits/master)
[![coverage report](https://git.biohpc.swmed.edu/BICF/Astrocyte/methylation_analysis/badges/master/coverage.svg)](https://git.biohpc.swmed.edu/BICF/Astrocyte/methylation_analysis/commits/master)
[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A50.31.0-brightgreen)](https://www.nextflow.io/)
[![Astrocyte](https://img.shields.io/badge/astrocyte-%E2%89%A50.3.1-blue)](https://astrocyte-test.biohpc.swmed.edu/static/docs/index.html)
[![DOI]()]()
The current version of the software and issue reports are at
https://git.biohpc.swmed.edu/BICF/Astrocyte/methylation_analysis
To download the current version of the software:
```bash
$ git clone git@git.biohpc.swmed.edu:BICF/Astrocyte/methylation_analysis.git
```
## Input files
##### 1) Fastq Files
+ You will need the full path to the files for the Bash script
##### 2) Design File
+ The design file is a tab-delimited file with 8 columns for single-end (SE) data and 9 columns for paired-end (PE) data. Letters, numbers, and underscores can be used in the names. However, the names can only begin with a letter. Columns must be as follows:
1. sample_id: a short, unique, and concise name used to label output files; it is used as the control_id if the sample is a control
2. experiment_id: biosample_treatment_factor; the same name is given to all replicates of a treatment. It is used for the consensus header.
3. biosample: symbol for tissue type or cell line
4. factor: symbol for antibody target
5. treatment: symbol of treatment applied
6. replicate: a number, usually from 1-3 (e.g. 1)
7. control_id: sample_id of the control for this sample
8. fastq_read1: name of fastq file 1 for SE or PE data
9. fastq_read2: name of fastq file 2 for PE data
+ See [HERE](test_data/test_design_pe.txt) for an example design file, paired-end
+ See [HERE](test_data/test_design_se.txt) for an example design file, single-end
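
The column rules above can be checked before launching the pipeline. The following is a minimal sketch, not part of the pipeline itself: the function name `check_design` is hypothetical, and it assumes that the first line of the design file is a header row and that the naming rule (letters, numbers, and underscores, beginning with a letter) applies to sample_id.

```bash
# Hypothetical helper: sanity-check a tab-delimited design file.
# Assumes line 1 is a header row and that the naming rule applies to sample_id (column 1).
check_design() {
  local design="$1"
  awk -F'\t' '
    NF == 0 { next }                       # skip blank lines
    NR == 1 {
      ncol = NF                            # header fixes the expected column count
      if (ncol != 8 && ncol != 9) {
        printf "header has %d columns, expected 8 (SE) or 9 (PE)\n", ncol
        bad = 1
      }
      next
    }
    NF != ncol { printf "line %d: %d columns, expected %d\n", NR, NF, ncol; bad = 1 }
    $1 !~ /^[A-Za-z][A-Za-z0-9_]*$/ { printf "line %d: invalid sample_id \"%s\"\n", NR, $1; bad = 1 }
    seen[$1]++ { printf "line %d: duplicate sample_id \"%s\"\n", NR, $1; bad = 1 }
    END { exit bad }                       # non-zero exit status if any check failed
  ' "$design"
}
```

For example, `check_design test_data/test_design_pe.txt && echo "design file looks OK"` prints the message only when every check passes.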
##### 3) Bash Script
+ You will need to create a bash script to run the Methylation pipeline on [BioHPC](https://portal.biohpc.swmed.edu/content/)
+ This pipeline has been optimized for the correct partition
+ See [HERE](docs/Methylation.sh) for an example bash script
+ The parameters that must be specified are:
## Pipeline (Details output and steps)
+
Add flowchart
See [FLOWCHART](docs/flowchart.pdf)
## Output Files
Folder | File | Description
--- | --- | ---
## Common Quality Control Metrics
+ The following files should be reviewed before continuing with the ChIP-seq experiment. If your experiment fails any of these metrics, you should pause and re-evaluate whether the data should remain in the study.
1. multiqcReport/multiqc_report.html: follow the ChIP-seq standards [HERE](https://www.encodeproject.org/chip-seq/).
2. experimentQC/*_fingerprint.pdf: make sure the plot information is correct for your antibody/input. See [HERE](https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html) for more details.
3. crossReads/*cc.plot.pdf: make sure your sample data has the correct signal intensity and location. See [HERE](https://ccg.epfl.ch//var/sib_april15/cases/landt12/strand_correlation.html) for more details.
4. crossReads/*.cc.qc: Column 9 (NSC) should be > 1.1 for experiment and < 1.1 for input. Column 10 (RSC) should be > 0.8 for experiment and < 0.8 for input. See [HERE](https://genome.ucsc.edu/encode/qualityMetrics.html) for more details.
5. experimentQC/coverage.pdf, experimentQC/heatmeap_SpearmanCorr.pdf, experimentQC/heatmeap_PearsonCorr.pdf: See [HERE](https://deeptools.readthedocs.io/en/develop/content/list_of_tools.html) for more details.
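
The NSC and RSC thresholds in item 4 can also be checked from the command line. This is a minimal sketch, assuming each `.cc.qc` file holds a single tab-delimited line with NSC in column 9 and RSC in column 10, as stated above. The function name `check_cc_qc` and its `experiment`/`input` argument are hypothetical and not part of the pipeline.

```bash
# Hypothetical helper: compare NSC (column 9) and RSC (column 10) with the thresholds above.
# Usage: check_cc_qc <file.cc.qc> <experiment|input>
check_cc_qc() {
  local qc="$1" kind="$2"
  awk -F'\t' -v kind="$kind" '
    BEGIN { status = 2 }                   # 2 = no data line found
    NR == 1 {
      nsc = $9 + 0; rsc = $10 + 0          # force numeric comparison
      if (kind == "experiment") ok = (nsc > 1.1 && rsc > 0.8)
      else                      ok = (nsc < 1.1 && rsc < 0.8)
      printf "NSC=%s RSC=%s %s\n", nsc, rsc, (ok ? "PASS" : "FAIL")
      status = !ok                         # 0 on PASS, 1 on FAIL
    }
    END { exit status }
  ' "$qc"
}
```

A loop such as `for f in crossReads/*.cc.qc; do check_cc_qc "$f" experiment; done` would report each experiment sample in turn; input samples would be checked with `input` as the second argument.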
## Common Errors
If you find an error, please let the [BICF](mailto:BICF@UTSouthwestern.edu) know and we will add it here.
## Citation
Please cite individual programs and versions used [HERE](docs/references.md), and the pipeline doi:[](). Please cite in publications: Pipeline was developed by BICF from funding provided by Cancer Prevention and Research Institute of Texas (RP150596).
## Programs and Versions
## Credits
This example workflow is derived from original scripts kindly contributed by the Bioinformatic Core Facility ([BICF](https://www.utsouthwestern.edu/labs/bioinformatics/)), in the [Department of Bioinformatics](https://www.utsouthwestern.edu/departments/bioinformatics/).

1.0.0, Spencer Barnes