Determining cellular heterogeneity in the human prostate with single-cell RNA sequencing
- Contact: Gervaise H. Henry
- Institution: UT Southwestern Medical Center
- Department: Urology
- Lab: Strand Lab
- PI: Douglas W. Strand, PhD
- ANALYZED DATA FOR QUERYING AT: StrandLab.net
- Raw data at: GEO & GenitoUrinary Development Molecular Anatomy Project (GUDMAP)
Publication at:
- Cell Reports: PENDING
- BioRxiv
Data Analysis
Requirements:
- /analysis/DATA/
- Pd-demultiplex.csv
- 10x/
- aggregation_csv.csv
- GRCh38/
- "barcodes.tsv"
- "genes.tsv"
- "matrix.mtx"
- 10x CellRanger analyzed data:
- "filtered_gene_bc_matrices_mex" folder
- CSV file that cellranger aggr used for aggregation
- Demultiplex CSV file defining subsets of samples
- R (v3.4.1) packages:
- methods (v3.4.1)
- optparse (v1.4.4)
- Seurat (v2.3.1)
- readr (v1.1.1)
- fBasics (v3042.89)
- pastecs (v1.3.21)
- qusage (v2.10.0)
- RColorBrewer (v1.1-2)
- monocle (v2.6.3)
- dplyr (v0.7.6)
- viridis (v0.5.1)
- and all dependencies
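Before running anything, it can help to confirm that the input files listed above are in place. The following is a minimal sketch, not part of the pipeline: the function name `check_inputs` is hypothetical, and the nesting it checks (`10x/` inside the data directory, `GRCh38/` inside `10x/`) is an assumption read from the order of the list above.

```shell
# Sketch: report which expected input files are missing under a data directory.
# ASSUMPTION: the nesting below (10x/ and 10x/GRCh38/) is inferred from the
# Requirements list order; adjust the paths if your layout differs.
check_inputs() {
  local data_dir="${1:-/analysis/DATA}"
  local missing=0
  local rel
  for rel in \
    "Pd-demultiplex.csv" \
    "10x/aggregation_csv.csv" \
    "10x/GRCh38/barcodes.tsv" \
    "10x/GRCh38/genes.tsv" \
    "10x/GRCh38/matrix.mtx"
  do
    if [ ! -e "${data_dir}/${rel}" ]; then
      echo "MISSING: ${data_dir}/${rel}"
      missing=$((missing + 1))
    fi
  done
  # Exit status is non-zero when at least one file is missing.
  [ "${missing}" -eq 0 ]
}
```

For example, `check_inputs /analysis/DATA && echo "inputs OK"` prints one `MISSING:` line per absent file and only reports success when all five are present.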
HOW TO RUN
- 1 Run on 3 patient aggregate
- run bash script sc_TissueMapper-Pr.sh
- 2 Run on 1st patient FACS samples
- run bash script sc_TissueMapper-D17_FACS.sh
- 3 Run on 2nd patient FACS samples
- run bash script sc_TissueMapper-D27_FACS.sh
- 4 Run on several downsamples from 1 sample from 1st patient
- run bash script sc_TissueMapper-DS_D17.sh
- 5 Aggregate and compare several downsamples from step 4
- run R script sc_TissueMapper_RUN.DS_D17.aggr.R
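The five steps above can be chained in a small wrapper. This is only a sketch: the script names come from the steps above, but the `run_all` function, the `SCRIPT_DIR` and `DRY_RUN` variables, and the assumption that each script runs without arguments are illustrative and not confirmed by the repository. Step 5 is placed last because it uses the output of step 4.

```shell
# Sketch: run the analysis scripts in the documented order.
# ASSUMPTIONS: scripts live in SCRIPT_DIR (default: current directory) and
# take no arguments; DRY_RUN=1 only prints the commands instead of running them.
run_all() {
  local script_dir="${SCRIPT_DIR:-.}"
  local s
  # Steps 1-4: bash scripts.
  for s in \
    sc_TissueMapper-Pr.sh \
    sc_TissueMapper-D17_FACS.sh \
    sc_TissueMapper-D27_FACS.sh \
    sc_TissueMapper-DS_D17.sh
  do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "bash ${script_dir}/${s}"
    else
      bash "${script_dir}/${s}" || return 1
    fi
  done
  # Step 5: R script that aggregates the downsamples produced by step 4.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "Rscript ${script_dir}/sc_TissueMapper_RUN.DS_D17.aggr.R"
  else
    Rscript "${script_dir}/sc_TissueMapper_RUN.DS_D17.aggr.R"
  fi
}
```

Setting `DRY_RUN=1` before calling `run_all` lists the five commands without executing them, which is a cheap way to check the order and paths first.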
Pipeline:
- Link cellranger count/aggr output to analysis
- Create demultiplex file to add custom sample groups
- Load R packages
- Create analysis folders
- Load analysis parameters (from default or overwrite from command line)
- Load cellranger data into R/Seurat
- Label cells by their cell cycle state using a Seurat-based method
- QC and filter cells/genes
- If analyzing samples from multiple patients: Align experiments using canonical correlation analysis (CCA)
- If analyzing samples from one patient: Perform principal component analysis (PCA) using the most highly variable genes (HVG) for downstream clustering, etc.
- Perform initial "over" clustering
- Identify "highly stressed" cells using custom PCA based analysis, remove stressed clusters/cells, and re-cluster
- Correlate cluster gene expression using Quantitative Set Analysis for Gene Expression (QuSAGE) on lineage genesets for identification (epithelia and stroma)
- Subset epithelia from stroma for additional analysis
- Re-cluster cell types separately
- Correlate cluster gene expression using QuSAGE on epithelial subtype genesets for identification (basal, luminal and "other")
- Correlate cluster gene expression using QuSAGE on stromal subtype genesets for identification (fibroblasts, smooth muscle, endothelia and leukocyte)
- Optional: Correlate cluster gene expression using QuSAGE on additional genesets for analysis
- Re-cluster cell types separately
- Merge epithelial and stromal cells
- Identify neuroendocrine cells from epithelial cells using custom PCA based analysis
- Tabulate population cell numbers
- Generate differentially expressed genelists (DEGs) of populations
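Before the cellranger matrix is loaded into R/Seurat, its size can be checked directly from the `matrix.mtx` file. The sketch below relies only on the Matrix Market coordinate format, in which the first line that does not start with `%` holds the row count, column count, and number of stored entries; in cellranger output the rows are genes and the columns are cell barcodes. The function name `mtx_dims` is hypothetical and not part of the pipeline.

```shell
# Sketch: print the dimensions of a Matrix Market file such as matrix.mtx.
# Lines starting with '%' are header or comment lines; the first other line
# is "rows cols entries" (genes, cell barcodes, non-zero counts).
mtx_dims() {
  awk '!/^%/ { print "genes=" $1, "cells=" $2, "nonzero=" $3; exit }' "$1"
}
```

Calling `mtx_dims path/to/matrix.mtx` gives a quick cell count to compare against the numbers tabulated at the end of the pipeline, after QC and stressed-cell removal have reduced it.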
Genesets:
- Cell cycle:
- "regev_lab_cell_cycle_genes.txt" G2M and S phase genes from Genome Res. 2015 Dec; 25(12): 1860–1872
- Stress:
- "DEG_C2.CGP.M10970.txt" MSigDB C2 Chemical and Genetic Perturbations M10970 CHUANG_OXIDATIVE_STRESS_RESPONSE_UP
- "genes.deg.Stress.csv" DWS generated DEGs of stressed cells from scRNA-Seq of 3 patient aggregate
- Lineage:
- "DEG_Epi_5FC.txt" DWS generated DEGs of epithelia from FACS population (bulk) RNA-sequencing
- "DEG_FMSt_5FC.txt" DWS generated DEGs of fibromuscular stroma from FACS population (bulk) RNA-sequencing
- Epithelia:
- "DEG_BE_5FC.txt" DWS generated DEGs of basal epithelia from FACS population (bulk) RNA-sequencing
- "DEG_LE_5FC.txt" DWS generated DEGs of luminal epithelia from FACS population (bulk) RNA-sequencing
- "DEG_OE_5FC.txt" DWS generated DEGs of "other" epithelia from FACS population (bulk) RNA-sequencing
- "genes.deg.BE.csv" DWS generated DEGs of basal epithelial cells from scRNA-Seq of 3 patient aggregate
- "genes.deg.LE.csv" DWS generated DEGs of luminal epithelial cells from scRNA-Seq of 3 patient aggregate
- "genes.deg.OE1.csv" DWS generated DEGs of "other" epithelia cluster 1 cells from scRNA-Seq of 3 patient aggregate
- "genes.deg.OE2.csv" DWS generated DEGs of "other" epithelia cluster 2 cells from scRNA-Seq of 3 patient aggregate
- Stroma:
- "DEG_C5.BP.M11704.txt" MSigDB C5 GO Biological Processes M11704 GO_ENDOTHELIAL_CELL_DIFFERENTIATION
- "DEG_C5.BP.M10794.txt" MSigDB C5 GO Biological Processes M10794 GO_SMOOTH_MUSCLE_CELL_DIFFERENTIATION
- "DEG_C5.BP.M13024.txt" MSigDB C5 GO Biological Processes M13024 GO_REGULATION_OF_FIBROBLAST_PROLIFERATION
- "DEG_C5.BP.M10124.txt" MSigDB C5 GO Biological Processes M10124 GO_LEUKOCYTE_ACTIVATION
- "genes.deg.Fib.csv" DWS generated DEGs of fibroblast cells from scRNA-Seq of 3 patient aggregate
- "genes.deg.SM.csv" DWS generated DEGs of smooth muscle cells from scRNA-Seq of 3 patient aggregate
- "genes.deg.Endo.csv" DWS generated DEGs of endothelial cells from scRNA-Seq of 3 patient aggregate
- "genes.deg.Leu.csv" DWS generated DEGs of leukocyte cells from scRNA-Seq of 3 patient aggregate
- Neuroendocrine:
- "EurUrol.2005.NE.txt" Neuroendocrine markers from Table 1 of Eur Urol. 2005 Feb;47(2):147-55
- "genes.deg.NE.csv" DWS generated DEGs of neuroendocrine epithelial cells from scRNA-Seq of 3 patient aggregate
- Lung epithelia from Lung Gene Expression Analysis (LGEA) Web Portal:
- "Basal cells-signature-genes.csv" scRNA-Sequencing LGEA generated top 20 DEGs for [human lung Basal Cells](https://research.cchmc.org/pbge/lunggens/lungDisease/celltype_IPF.html?cid=3)
- "Normal AT2 cells-signature-genes.csv" scRNA-Sequencing LGEA generated top 20 DEGs for human lung Alveolar Type 2 Cells
- "Club_Goblet cells-signature-genes.csv" scRNA-Sequencing LGEA generated top 20 DEGs for human lung Club/Goblet Cells
- Lung epithelia from Nature 2018 Aug;560(7718):319
- "SupTab3_Consensus_Sigs.csv" scRNA-Sequencing DEGs of mouse basal, club, ciliated, tuft, neuroendocrine, ionocyte cells (Supplementary Tables, SupTab3_Consensus_Sigs)
- "SupTab6_Krt13_Hillock.csv" scRNA-Sequencing DEGs of mouse Krt13+ hillock cells (Supplementary Tables, SupTab6_Krt13_Hillock)
- "Ensemble.mus-hum.txt" Ensembl export of mouse (GRCm38.p6) to human ortholog mapping
- General MSigDB
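The mouse lung genesets above need to be translated to human gene symbols before they can be compared with human prostate clusters, which is what the ortholog mapping file is for. The following sketch shows one way such a translation could work. The column layout is an assumption (tab-separated, one header row, mouse symbol in column 1, human symbol in column 2), since the actual layout of "Ensemble.mus-hum.txt" is not described here, and the function name `map_mouse_to_human` is hypothetical.

```shell
# Sketch: translate mouse gene symbols to human orthologs.
#   $1: ortholog table. ASSUMED layout: tab-separated, one header row,
#       column 1 = mouse symbol, column 2 = human symbol.
#   $2: file with one mouse gene symbol per line.
# Genes without a human ortholog are dropped; if a mouse gene has several
# human orthologs in the table, the last one listed is used.
map_mouse_to_human() {
  awk -F '\t' '
    NR == FNR { if (FNR > 1 && $2 != "") map[$1] = $2; next }
    ($1 in map) { print map[$1] }
  ' "$1" "$2"
}
```

The output is one human symbol per line, in the order of the input gene list.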