Forked from Venkat Malladi / TFSEE

Random Forest Enhancer Prediction (RFECS): http://enhancer.ucsd.edu/renlab/RFECS_enhancer_prediction/

Methods:

  1. Call H3K27ac and H3K4me1 peaks using a threshold of 1e-2
  2. Call H3K4me3 peaks using a threshold of 1e-5
  3. Call transcript units per cell using GRO-seq and GRO-HMM
  4. Merge H3K27ac, H3K4me1, and H3K4me3 peaks within 500 bp and make the universe, filtering for at least 1 RPKM in a cell
  5. Merge GRO-seq transcripts and filter:
     a. Filter for +/- 3 kb from the TSS of Gencode protein-coding genes and H3K4me3 peaks
     b. Merge overlapping transcripts
     c. Filter for <= 9 kb Short-Short-Paired (SSP) and Short-Unpaired (SUP)
     d. Make the universe, filtering for at least 1 RPKM (SSP) and > 2 RPM (SUP) in a cell
  6. Make enhancer regions for motif search:
     a. Short-Short-Paired: +/- 500 bp from center of overlap
     b. Short-Unpaired: +/- bp from TSS of transcript
     c. Histone data: +/- 500 bp from center of mark
  7. De novo motif analyses were performed using the command-line version of MEME (Bailey et al., 2009). The following parameters were used for motif prediction: (1) zero or one occurrence per sequence (-mod zoops); (2) number of motifs (-nmotifs 15); (3) minimum and maximum width of the motif (-minw 8, -maxw 15); and (4) search for motifs on the given strand and the reverse complement strand (-revcomp). The predicted motifs from MEME were matched to known motifs using TOMTOM (Gupta et al., 2007).
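
The interval handling in steps 4 and 6 can be sketched in a few lines of Python. This is only an illustration, not the code used in this repository: the function names are hypothetical, intervals are assumed to be BED-style (0-based, half-open), "within 500 bp" is assumed to mean a gap of at most 500 bp, and the universe filter is assumed to keep a region if it reaches the threshold in at least one cell.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end), assumed 0-based half-open as in BED files
Interval = Tuple[str, int, int]


def merge_within(intervals: List[Interval], max_gap: int = 500) -> List[Interval]:
    """Merge intervals on the same chromosome whose gap is at most max_gap bp."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def rpkm(read_count: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of region per million mapped reads."""
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


def passes_in_any_cell(rpkm_by_cell: Dict[str, float], threshold: float = 1.0) -> bool:
    """Assumed reading of the universe filter: threshold met in at least one cell."""
    return any(value >= threshold for value in rpkm_by_cell.values())


def center_window(interval: Interval, flank: int = 500) -> Interval:
    """Return a window of +/- flank bp around the interval midpoint (clipped at 0)."""
    chrom, start, end = interval
    center = (start + end) // 2
    return (chrom, max(0, center - flank), center + flank)


peaks = [("chr1", 100, 200), ("chr1", 650, 800), ("chr1", 1400, 1500), ("chr2", 0, 50)]
universe = merge_within(peaks, max_gap=500)
motif_regions = [center_window(region, flank=500) for region in universe]
```

In practice a tool such as bedtools would typically do the merging on real peak files; the sketch only makes the logic of the steps explicit.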
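
The MEME parameters listed in step 7 can be assembled into a command line as sketched below. The input FASTA name, the output directory, and the `-dna` and `-oc` options are assumptions added for illustration; only `-mod zoops`, `-nmotifs 15`, `-minw 8`, `-maxw 15`, and `-revcomp` come from the description above. The helper name `build_meme_command` is hypothetical.

```python
import shlex
from typing import List


def build_meme_command(fasta_path: str, out_dir: str) -> List[str]:
    """Build a MEME argument list using the parameters named in step 7."""
    return [
        "meme", fasta_path,
        "-dna",              # assumption: input sequences are DNA
        "-oc", out_dir,      # assumption: write results here, overwriting if present
        "-mod", "zoops",     # zero or one occurrence per sequence
        "-nmotifs", "15",    # number of motifs to report
        "-minw", "8",        # minimum motif width
        "-maxw", "15",       # maximum motif width
        "-revcomp",          # also search the reverse complement strand
    ]


# Hypothetical file names, for illustration only
cmd = build_meme_command("enhancer_regions.fa", "meme_out")
print(shlex.join(cmd))
```

With MEME installed, the list could be passed to `subprocess.run(cmd, check=True)`; building it as a list avoids shell quoting problems with file paths.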