Determining cellular heterogeneity in the human prostate with single-cell RNA sequencing

Data Analysis

  • Requirements:

    • /analysis/DATA/D17PrF/
      • D17PrF-demultiplex.csv
      • 10x/
        • aggregation_csv.csv
        • GRCh38/
          • "barcodes.tsv"
          • "genes.tsv"
          • "matrix.mtx"
    • /analysis/DATA/D27PrF/
      • D27PrF-demultiplex.csv
      • 10x/
        • aggregation_csv.csv
        • GRCh38/
          • "barcodes.tsv"
          • "genes.tsv"
          • "matrix.mtx"
    • 10x CellRanger analyzed data:
      • "filtered_gene_bc_matrices_mex" folder
      • CSV file used by cellranger aggr for aggregation
      • Demultiplex CSV file defining subsets of samples
    • R (v3.4.1) packages:
      • methods (v3.4.1)
      • optparse (v1.4.4)
      • Seurat (v2.3.1)
      • readr (v1.1.1)
      • fBasics (v3042.89)
      • pastecs (v1.3.21)
      • qusage (v2.10.0)
      • RColorBrewer (v1.1-2)
      • monocle (v2.6.3)
      • dplyr (v0.7.6)
      • and all dependencies
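
The input layout listed above can be checked before starting a run. The following is a minimal sketch and not part of the repository: `check_patient_inputs` is a hypothetical helper, and it assumes one folder per patient under a common data root such as `/analysis/DATA`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): verify that the
# per-patient input files named in the requirements are present.
# $1: data root (e.g. /analysis/DATA), $2: patient ID (e.g. D17PrF)
check_patient_inputs() {
  local root="$1"
  local patient="$2"
  local base="${root}/${patient}"
  local missing=0
  local f
  for f in \
    "${base}/${patient}-demultiplex.csv" \
    "${base}/10x/aggregation_csv.csv" \
    "${base}/10x/GRCh38/barcodes.tsv" \
    "${base}/10x/GRCh38/genes.tsv" \
    "${base}/10x/GRCh38/matrix.mtx"
  do
    if [ ! -f "$f" ]; then
      # Report every missing file rather than stopping at the first one
      echo "missing: $f" >&2
      missing=$((missing + 1))
    fi
  done
  # Succeed only when nothing is missing
  [ "$missing" -eq 0 ]
}
```

For example, `check_patient_inputs /analysis/DATA D17PrF` prints each absent file to stderr and returns a non-zero status if anything is missing.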
  • How to run:

    • One patient (D17PrF):
      • Run the bash script sc_TissueMapper-D17PrF.sh
    • Two patients (D17PrF and D27PrF):
      • Run the bash script sc_TissueMapper-DPrF2.sh
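
As a sketch of the two run modes above, a small wrapper can map the number of patients to the matching script. The function name `tissue_mapper_script` and the labels `one` and `two` are hypothetical; only the two script names come from this README.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print the run script
# for a one-patient or two-patient analysis.
tissue_mapper_script() {
  case "$1" in
    one) echo "sc_TissueMapper-D17PrF.sh" ;;   # patient D17PrF only
    two) echo "sc_TissueMapper-DPrF2.sh" ;;    # patients D17PrF and D27PrF
    *)
      echo "usage: tissue_mapper_script one|two" >&2
      return 1
      ;;
  esac
}
```

Assuming the scripts sit in the current directory, the two-patient analysis could then be started with `bash "$(tissue_mapper_script two)"`.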
  • Pipeline:

    • Link cellranger count/aggr output to analysis
    • Create demultiplex file to add custom sample groups
    • Load R packages
    • Create analysis folders
    • Load analysis parameters (from default or overwrite from command line)
    • Load cellranger data into R/Seurat
    • Label cells by their cell cycle state using a Seurat-based method
    • QC and filter cells/genes
    • If combining multiple experiments: Align experiments using canonical correlation analysis (CCA)
    • Perform principal component analysis (PCA) using the most highly variable genes (HVG) for downstream clustering
    • Perform initial "over" clustering
    • Identify "highly stressed" cells using a custom PCA-based analysis, remove stressed clusters/cells, and re-cluster
    • Correlate cluster gene expression using Quantitative Set Analysis for Gene Expression (QuSAGE) on lineage genesets for identification (epithelia and stroma)
    • Subset epithelia from stroma for additional analysis
      • Re-cluster cell types separately
        • Correlate cluster gene expression using QuSAGE on epithelial subtype genesets for identification (basal, luminal and "other")
        • Correlate cluster gene expression using QuSAGE on stromal subtype genesets for identification (fibroblasts, smooth muscle, endothelia and leukocytes)
      • Optional: Correlate cluster gene expression using QuSAGE on additional genesets for analysis
    • Merge epithelial and stromal cells
    • Identify neuroendocrine cells among epithelial cells using a custom PCA-based analysis
    • Tabulate population cell numbers
    • Generate lists of differentially expressed genes (DEGs) for each population
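
The first pipeline step, linking cellranger output into the analysis folder, could look like the sketch below. This is an assumption about how the linking might be done, not the repository's own code: `link_cellranger_output` is a hypothetical helper, and the source folder is wherever the cellranger run wrote its matrix files.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): symlink the three
# cellranger matrix files into the analysis data folder.
# $1: folder holding barcodes.tsv, genes.tsv and matrix.mtx (location assumed)
# $2: destination folder, e.g. /analysis/DATA/D17PrF/10x/GRCh38
link_cellranger_output() {
  local src dest f
  # Resolve to an absolute path so the symlinks stay valid from anywhere
  src="$(cd "$1" && pwd)" || return 1
  dest="$2"
  mkdir -p "$dest" || return 1
  for f in barcodes.tsv genes.tsv matrix.mtx; do
    if [ ! -f "$src/$f" ]; then
      echo "missing: $src/$f" >&2
      return 1
    fi
    # -f replaces a stale link left over from an earlier run
    ln -sf "$src/$f" "$dest/$f"
  done
}
```

Symlinks keep the large matrix files in one place while still giving the analysis the folder layout listed under Requirements.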
  • Genesets:

    • Cell cycle:
    • Stress:
      • "DEG_C2.CGP.M10970.txt" MSigDB C2 Chemical and Genetic Perturbations M10970 CHUANG_OXIDATIVE_STRESS_RESPONSE_UP
      • "genes.deg.Stress.csv" DWS generated DEGs of stressed cells from scRNA-Seq of patient D17 only
      • "" DWS generated DEGs of stressed cells from scRNA-Seq of an aggregation of patient D17 and D27
    • Lineage:
      • "DEG_Epi_5FC.txt" DWS generated DEGs of epithelia from FACS population (bulk) RNA-sequencing
      • "DEG_FMSt_5FC.txt" DWS generated DEGs of fibromuscular stroma from FACS population (bulk) RNA-sequencing
      • "genes.deg.Epi.csv" DWS generated DEGs of epithelial cells from scRNA-Seq of patient D17 only
      • "genes.deg.St.csv" DWS generated DEGs of stromal cells from scRNA-Seq of patient D17 only
      • "" DWS generated DEGs of epithelial cells from scRNA-Seq of an aggregation of patient D17 and D27
      • "" DWS generated DEGs of stromal cells from scRNA-Seq of an aggregation of patient D17 and D27
    • Epithelia:
      • "DEG_BE_5FC.txt" DWS generated DEGs of basal epithelia from FACS population (bulk) RNA-sequencing
      • "DEG_LE_5FC.txt" DWS generated DEGs of luminal epithelia from FACS population (bulk) RNA-sequencing
      • "DEG_OE_5FC.txt" DWS generated DEGs of "other" epithelia from FACS population (bulk) RNA-sequencing
      • "genes.deg.BE.csv" DWS generated DEGs of basal epithelial cells from scRNA-Seq of patient D17 only
      • "genes.deg.LE.csv" DWS generated DEGs of luminal epithelial cells from scRNA-Seq of patient D17 only
      • "genes.deg.OE1.csv" DWS generated DEGs of "other" epithelia cluster 1 cells from scRNA-Seq of patient D17 only
      • "genes.deg.OE2.csv" DWS generated DEGs of "other" epithelia cluster 2 cells from scRNA-Seq of patient D17 only
    • Stroma:
    • Neuroendocrine:
      • "EurUrol.2005.NE.txt" Neuroendocrine markers from Table 1 of Eur Urol. 2005 Feb;47(2):147-55
      • "genes.deg.NE.csv" DWS generated DEGs of neuroendocrine epithelial cells from scRNA-Seq of patient D17 only
      • "" DWS generated DEGs of neuroendocrine epithelial cells from scRNA-Seq of patient D17 only
    • Lung epithelia from Lung Gene Expression Analysis (LGEA) Web Portal:
    • General MSigDB:
      • "c2.all.v6.1.symbols.gmt" MSigDB C2 Curated Gene Sets MSigDB C2
      • "c2.cp.kegg.v6.1.symbols" MSigDB C2 KEGG Gene Subsets KEGG
      • "c5.all.v6.1.symbols.gmt" MSigDB C5 Gene Ontology Gene Sets MSigDB C5
      • "c5.bp.v6.1.symbols.gmt" MSigDB C5 Gene Ontology Biological Processes Gene Subsets GO BP
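
The MSigDB `.gmt` files listed above are tab-delimited, with one geneset per line: the set name, a description, then the member genes. A quick way to inspect one before passing it to the pipeline is to count genes per set. This is a minimal sketch; `geneset_sizes` is a hypothetical helper and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print each geneset name
# in a GMT file with its gene count, tab-separated.
# $1: path to a .gmt file
geneset_sizes() {
  # Field 1 is the set name and field 2 the description;
  # every remaining field is one gene symbol.
  awk -F '\t' 'NF >= 2 { print $1 "\t" (NF - 2) }' "$1"
}
```

For example, `geneset_sizes c2.cp.kegg.v6.1.symbols.gmt | sort -k2,2n` would list the sets from smallest to largest, assuming the file is in the current directory under that name.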