Determining cellular heterogeneity in the human prostate with single-cell RNA sequencing

Data Analysis

  • Requirements:

    • /analysis/DATA/
      • Pd-demultiplex.csv
      • 10x/
        • aggregation_csv.csv
        • GRCh38/
          • "barcodes.tsv"
          • "genes.tsv"
          • "matrix.mtx"
    • 10x CellRanger analyzed data:
      • "filtered_gene_bc_matrices_mex" folder
      • CSV file that cellranger aggr used for aggregation
      • Demultiplex CSV file that defines subsets of samples
    • R (v3.4.1) packages:
      • methods (v3.4.1)
      • optparse (v1.4.4)
      • Seurat (v2.3.1)
      • readr (v1.1.1)
      • fBasics (v3042.89)
      • pastecs (v1.3.21)
      • qusage (v2.10.0)
      • RColorBrewer (v1.1-2)
      • monocle (v2.6.3)
      • dplyr (v0.7.6)
      • viridis (v0.5.1)
      • and all dependencies
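
A quick way to confirm the input layout listed above before starting the analysis is to check that each file exists. The following is a minimal sketch, not part of the pipeline itself: the helper name `missing_inputs` is hypothetical, and the relative paths are copied from the Requirements list.

```python
import os

# Relative paths copied from the Requirements list; they are resolved
# against whatever data root the caller supplies (e.g. /analysis/DATA/).
REQUIRED_INPUTS = [
    "Pd-demultiplex.csv",
    "10x/aggregation_csv.csv",
    "10x/GRCh38/barcodes.tsv",
    "10x/GRCh38/genes.tsv",
    "10x/GRCh38/matrix.mtx",
]


def missing_inputs(data_root):
    """Return the required relative paths that are not regular files under data_root."""
    return [
        rel
        for rel in REQUIRED_INPUTS
        if not os.path.isfile(os.path.join(data_root, *rel.split("/")))
    ]
```

Calling `missing_inputs("/analysis/DATA")` returns an empty list when every expected file is present, and otherwise lists what still needs to be linked or created.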
  • HOW TO RUN

  • Pipeline:

    • Link cellranger count/aggr output to analysis
    • Create demultiplex file to add custom sample groups
    • Load R packages
    • Create analysis folders
    • Load analysis parameters (from default or overwrite from command line)
    • Load cellranger data into R/Seurat
    • Label cells by their cell cycle state using a Seurat-based method
    • QC and filter cells/genes
    • If analyzing samples from multiple patients: Align experiments using canonical correlation analysis (CCA)
    • If analyzing samples from one patient: Perform principal component analysis (PCA) using the most highly variable genes (HVG) for downstream clustering etc.
    • Perform initial "over" clustering
    • Identify "highly stressed" cells using custom PCA-based analysis, remove stressed clusters/cells, and re-cluster
    • Correlate cluster gene expression using Quantitative Set Analysis for Gene Expression (QuSAGE) on lineage genesets for identification (epithelia and stroma)
    • Subset epithelia from stroma for additional analysis
      • Re-cluster cell types separately
        • Correlate cluster gene expression using QuSAGE on epithelial subtype genesets for identification (basal, luminal and "other")
        • Correlate cluster gene expression using QuSAGE on stromal subtype genesets for identification (fibroblasts, smooth muscle, endothelia and leukocyte)
      • Optional: Correlate cluster gene expression using QuSAGE on additional genesets for analysis
    • Merge epithelial and stromal cells
    • Identify neuroendocrine cells among epithelial cells using custom PCA-based analysis
    • Tabulate population cell numbers
    • Generate lists of differentially expressed genes (DEGs) for each population
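
Two of the pipeline steps above are simple enough to sketch: loading parameters from defaults with command-line overrides, and choosing between CCA alignment and PCA by the number of patients. The pipeline does this in R with optparse. The sketch below is only a Python illustration of the same idea. The flag names (`--patients`, `--min-genes`) and their default values are hypothetical, not the pipeline's actual parameters.

```python
import argparse


def parse_parameters(argv=None):
    """Parse analysis parameters; each default can be overridden on the command line."""
    parser = argparse.ArgumentParser(
        description="Illustrative analysis parameters (hypothetical names)"
    )
    # Hypothetical flags and defaults, chosen only for this example.
    parser.add_argument("--patients", type=int, default=1,
                        help="number of patients in the aggregate")
    parser.add_argument("--min-genes", type=int, default=200,
                        help="QC threshold: minimum genes detected per cell")
    return parser.parse_args(argv)


def choose_reduction(n_patients):
    """Return "CCA" for multi-patient data and "PCA" for a single patient."""
    if n_patients < 1:
        raise ValueError("n_patients must be at least 1")
    return "CCA" if n_patients > 1 else "PCA"
```

With no arguments, `parse_parameters([])` yields the defaults and `choose_reduction` selects PCA. Passing `["--patients", "3"]` switches the choice to CCA.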
  • Genesets:

    • Cell cycle:
    • Stress:
      • "DEG_C2.CGP.M10970.txt" MSigDB C2 Chemical and Genetic Perturbations M10970 CHUANG_OXIDATIVE_STRESS_RESPONSE_UP
      • "genes.deg.Stress.csv" DWS generated DEGs of stressed cells from scRNA-Seq of 3 patient aggregate
    • Lineage:
      • "DEG_Epi_5FC.txt" DWS generated DEGs of epithelia from FACS population (bulk) RNA-sequencing
      • "DEG_FMSt_5FC.txt" DWS generated DEGs of fibromuscular stroma from FACS population (bulk) RNA-sequencing
      • "genes.deg.Epi.csv" DWS generated DEGs of epithelial cells from scRNA-Seq of 3 patient aggregate
      • "genes.deg.St.csv" DWS generated DEGs of stromal cells from scRNA-Seq of 3 patient aggregate
    • Epithelia:
      • "DEG_BE_5FC.txt" DWS generated DEGs of basal epithelia from FACS population (bulk) RNA-sequencing
      • "DEG_LE_5FC.txt" DWS generated DEGs of luminal epithelia from FACS population (bulk) RNA-sequencing
      • "DEG_OE_5FC.txt" DWS generated DEGs of "other" epithelia from FACS population (bulk) RNA-sequencing
      • "genes.deg.BE.csv" DWS generated DEGs of basal epithelial cells from scRNA-Seq of 3 patient aggregate
      • "genes.deg.LE.csv" DWS generated DEGs of luminal epithelial cells from scRNA-Seq of 3 patient aggregate
      • "genes.deg.OE1.csv" DWS generated DEGs of "other" epithelia cluster 1 cells from scRNA-Seq of 3 patient aggregate
      • "genes.deg.OE2.csv" DWS generated DEGs of "other" epithelia cluster 2 cells from scRNA-Seq of 3 patient aggregate
    • Stroma:
      • "DEG_C5.BP.M11704.txt" MSigDB C5 GO Biological Processes M11704 GO_ENDOTHELIAL_CELL_DIFFERENTIATION
      • "DEG_C5.BP.M10794.txt" MSigDB C5 GO Biological Processes M10794 GO_SMOOTH_MUSCLE_CELL_DIFFERENTIATION
      • "DEG_C5.BP.M13024.txt" MSigDB C5 GO Biological Processes M13024 GO_REGULATION_OF_FIBROBLAST_PROLIFERATION
      • "DEG_C5.BP.M10124.txt" MSigDB C5 GO Biological Processes M10124 GO_LEUKOCYTE_ACTIVATION
      • "genes.deg.Fib.csv" DWS generated DEGs of fibroblast cells from scRNA-Seq of 3 patient aggregate
      • "genes.deg.SM.csv" DWS generated DEGs of smooth muscle cells from scRNA-Seq of 3 patient aggregate
      • "genes.deg.Endo.csv" DWS generated DEGs of endothelial cells from scRNA-Seq of 3 patient aggregate
      • "genes.deg.Leu.csv" DWS generated DEGs of leukocyte cells from scRNA-Seq of 3 patient aggregate
    • Neuroendocrine:
      • "EurUrol.2005.NE.txt" Neuroendocrine markers from Table 1 of Eur Urol. 2005 Feb;47(2):147-55
      • "genes.deg.NE.csv" DWS generated DEGs of neuroendocrine epithelial cells from scRNA-Seq of 3 patient aggregate
    • Lung epithelia from Lung Gene Expression Analysis (LGEA) Web Portal:
    • Lung epithelia from Nature 2018 Aug;560(7718):319
      • "SupTab3_Consensus_Sigs.csv" scRNA-Sequencing DEGs of mouse basal, club, ciliated, tuft, neuroendocrine, ionocyte cells (Supplementary Tables, SupTab3_Consensus_Sigs)
      • "SupTab6_Krt13_Hillock.csv" scRNA-Sequencing DEGs of mouse Krt13+ hillock cells (Supplementary Tables, SupTab6_Krt13_Hillock)
      • "Ensemble.mus-hum.txt" Ensembl export of mouse (GRCm38.p6) to human ortholog mapping
    • General MSigDB:
      • "c2.all.v6.1.symbols.gmt" MSigDB C2 Curated Gene Sets MSigDB C2
      • "c2.cp.kegg.v6.1.symbols" MSigDB C2 KEGG Gene Subsets KEGG
      • "c5.all.v6.1.symbols.gmt" MSigDB C5 Gene Ontology Gene Sets MSigDB C5
      • "c5.bp.v6.1.symbols.gmt" MSigDB C5 Gene Ontology Biological Processes Gene Subsets GO BP
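
The MSigDB `.gmt` files listed above are tab-separated text: each line holds a gene set name, a description, and then the member genes. As a minimal sketch of how such a file can be read (the pipeline itself loads genesets in R, so the function name `parse_gmt` is hypothetical and the gene symbols in the example are placeholders, not real set members):

```python
def parse_gmt(lines):
    """Parse GMT lines into a dict mapping gene set name to its list of genes.

    Each GMT line is tab-separated: set name, description, then member genes.
    """
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3 or not fields[0]:
            continue  # skip blank lines and lines without any member genes
        name, _description, *genes = fields
        gene_sets[name] = [gene for gene in genes if gene]
    return gene_sets


# Placeholder content: GENE_A and GENE_B are not real members of this set.
example = "CHUANG_OXIDATIVE_STRESS_RESPONSE_UP\tplaceholder description\tGENE_A\tGENE_B\n"
print(parse_gmt(example.splitlines()))
# prints {'CHUANG_OXIDATIVE_STRESS_RESPONSE_UP': ['GENE_A', 'GENE_B']}
```

For a file on disk, passing the open file handle works as well, for example `parse_gmt(open("c2.all.v6.1.symbols.gmt"))`, since the function only iterates over lines.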