Determining cellular heterogeneity in the human prostate with single-cell RNA sequencing
- Contact: Gervaise H. Henry
- Institution: UT Southwestern Medical Center
- Department: Urology
- Lab: Strand Lab
- PI: Douglas W. Strand, PhD
Data Analysis
-
Requirements:
- /analysis/DATA/D17PrF/
- D17PrF-demultiplex.csv
- 10x/
- aggregation_csv.csv
- GRCh38/
- "barcodes.tsv"
- "genes.tsv"
- "matrix.mtx"
- /analysis/DATA/D27PrF/
- D27PrF-demultiplex.csv
- 10x/
- aggregation_csv.csv
- GRCh38/
- "barcodes.tsv"
- "genes.tsv"
- "matrix.mtx"
- 10x CellRanger analyzed data:
- "filtered_gene_bc_matrices_mex" folder
- CSV file used by cellranger aggr for aggregation
- Demultiplex CSV file defining subsets of samples
- R (v3.4.1) packages:
- methods (v3.4.1)
- optparse (v1.4.4)
- Seurat (v2.3.1)
- readr (v1.1.1)
- fBasics (v3042.89)
- pastecs (v1.3.21)
- qusage (v2.10.0)
- RColorBrewer (v1.1-2)
- monocle (v2.6.3)
- dplyr (v0.7.6)
- and all dependencies
-
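The input layout listed under Requirements can be checked before a run with a small shell function. This is a minimal sketch, not part of the repository: `check_inputs` is a hypothetical helper, and the nesting it assumes (`<root>/<sample>/10x/GRCh38/`) is inferred from the order of the list above rather than confirmed.

```shell
# Sketch: verify that one patient folder holds the inputs listed under "Requirements".
# Assumption: files are nested as <root>/<sample>/10x/GRCh38/ (inferred, not confirmed).
# check_inputs is a hypothetical helper, not a script shipped with the pipeline.

check_inputs() {
  local root="$1" sample="$2"
  local base="$root/$sample"
  local missing=0
  local path
  for path in \
    "$base/${sample}-demultiplex.csv" \
    "$base/10x/aggregation_csv.csv" \
    "$base/10x/GRCh38/barcodes.tsv" \
    "$base/10x/GRCh38/genes.tsv" \
    "$base/10x/GRCh38/matrix.mtx"
  do
    # Report every missing path instead of stopping at the first one
    if [ ! -e "$path" ]; then
      echo "missing: $path" >&2
      missing=1
    fi
  done
  # Exit status 0 when everything is present, 1 otherwise
  return "$missing"
}
```

For example, `check_inputs /analysis/DATA D17PrF` returns a non-zero status and names each missing file on standard error if the folder is incomplete.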
HOW TO RUN
- 1 Patient D17PrF
- run bash script sc_TissueMapper-D17PrF.sh
- 2 Patients D17PrF and D27PrF
- run bash script sc_TissueMapper-DPrF2.sh
-
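As a convenience, the choice between the two run scripts can be wrapped in a small function. This is a sketch only: `pick_run_script` is a hypothetical helper, and the cohort labels `D17PrF` and `DPrF2` are simply taken from the script file names listed under HOW TO RUN.

```shell
# Sketch: map a cohort label to the run script named in this README.
# pick_run_script and its labels are assumptions for illustration only.

pick_run_script() {
  case "$1" in
    D17PrF) echo "sc_TissueMapper-D17PrF.sh" ;;  # 1 patient
    DPrF2)  echo "sc_TissueMapper-DPrF2.sh" ;;   # 2 patients: D17PrF and D27PrF
    *)      echo "unknown cohort: $1" >&2; return 1 ;;
  esac
}
```

From the directory that holds the scripts, a run could then be started with `bash "$(pick_run_script D17PrF)"`.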
Pipeline:
- Link cellranger count/aggr output to analysis
- Create demultiplex file to add custom sample groups
- Load R packages
- Create analysis folders
- Load analysis parameters (from default or overwrite from command line)
- Load cellranger data into R/Seurat
- Label cells by cell cycle state using a Seurat-based method
- QC and filter cells/genes
- If combining multiple experiments: Align experiments using canonical correlation analysis (CCA)
- Perform principal component analysis (PCA) using the most highly variable genes (HVG) for downstream clustering etc.
- Perform initial "over" clustering
- Identify "highly stressed" cells using a custom PCA-based analysis, remove stressed clusters/cells, and re-cluster
- Correlate cluster gene expression using Quantitative Set Analysis for Gene Expression (QuSAGE) on lineage genesets for identification (epithelia and stroma)
- Subset epithelia from stroma for additional analysis
- Re-cluster cell types separately
- Correlate cluster gene expression using QuSAGE on epithelial subtype genesets for identification (basal, luminal and "other")
- Correlate cluster gene expression using QuSAGE on stromal subtype genesets for identification (fibroblasts, smooth muscle, endothelia and leukocyte)
- Optional: Correlate cluster gene expression using QuSAGE on additional genesets for analysis
- Re-cluster cell types separately
- Merge epithelial and stromal cells
- Identify neuroendocrine cells among epithelial cells using a custom PCA-based analysis
- Tabulate population cell numbers
- Generate lists of differentially expressed genes (DEGs) for each population
-
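The first pipeline step, linking cellranger output into the analysis folders, might look like the sketch below. It is not the repository's own code: `link_cellranger` is a hypothetical helper, and it assumes that the cellranger `outs` folder contains `aggregation_csv.csv` and `filtered_gene_bc_matrices_mex/GRCh38/`, and that the analysis expects both under `<analysis_dir>/10x/`.

```shell
# Sketch: expose a cellranger aggr "outs" folder under the analysis layout.
# Assumptions: outs/ holds aggregation_csv.csv and
# filtered_gene_bc_matrices_mex/GRCh38/, and the analysis reads them from
# <analysis_dir>/10x/. link_cellranger is a hypothetical helper.

link_cellranger() {
  local outs="$1" analysis_dir="$2"
  # Create the target folder, including any missing parents
  mkdir -p "$analysis_dir/10x" || return 1
  # -s: symbolic link, -f: replace an existing link, -n: treat an old link as a file
  ln -sfn "$outs/aggregation_csv.csv" "$analysis_dir/10x/aggregation_csv.csv" || return 1
  ln -sfn "$outs/filtered_gene_bc_matrices_mex/GRCh38" "$analysis_dir/10x/GRCh38" || return 1
}
```

Pass an absolute path for the `outs` folder so that the symbolic links still resolve when the analysis is run from another directory.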
Genesets:
- Cell cycle:
- "regev_lab_cell_cycle_genes.txt" G2M and S phase genes from Genome Res. 2015 Dec; 25(12): 1860–1872
- Stress:
- "DEG_C2.CGP.M10970.txt" MSigDB C2 Chemical and Genetic Perturbations M10970 CHUANG_OXIDATIVE_STRESS_RESPONSE_UP
- "genes.deg.Stress.csv" DWS generated DEGs of stressed cells from scRNA-Seq of patient D17 only
- "" DWS generated DEGs of stressed cells from scRNA-Seq of an aggregation of patient D17 and D27
- Lineage:
- "DEG_Epi_5FC.txt" DWS generated DEGs of epithelia from FACS population (bulk) RNA-sequencing
- "DEG_FMSt_5FC.txt" DWS generated DEGs of fibromuscular stroma from FACS population (bulk) RNA-sequencing
- "genes.deg.Epi.csv" DWS generated DEGs of epithelial cells from scRNA-Seq of patient D17 only
- "genes.deg.St.csv" DWS generated DEGs of stromal cells from scRNA-Seq of patient D17 only
- "" DWS generated DEGs of epithelial cells from scRNA-Seq of an aggregation of patient D17 and D27
- "" DWS generated DEGs of stromal cells from scRNA-Seq of an aggregation of patient D17 and D27
- Epithelia:
- "DEG_BE_5FC.txt" DWS generated DEGs of basal epithelia from FACS population (bulk) RNA-sequencing
- "DEG_LE_5FC.txt" DWS generated DEGs of luminal epithelia from FACS population (bulk) RNA-sequencing
- "DEG_OE_5FC.txt" DWS generated DEGs of "other" epithelia from FACS population (bulk) RNA-sequencing
- "genes.deg.BE.csv" DWS generated DEGs of basal epithelial cells from scRNA-Seq of patient D17 only
- "genes.deg.LE.csv" DWS generated DEGs of luminal epithelial cells from scRNA-Seq of patient D17 only
- "genes.deg.OE1.csv" DWS generated DEGs of "other" epithelia cluster 1 cells from scRNA-Seq of patient D17 only
- "genes.deg.OE2.csv" DWS generated DEGs of "other" epithelia cluster 2 cells from scRNA-Seq of patient D17 only
- Stroma:
- "DEG_C5.BP.M11704.txt" MSigDB C5 GO Biological Processes M11704 GO_ENDOTHELIAL_CELL_DIFFERENTIATION
- "DEG_C5.BP.M10794.txt" MSigDB C5 GO Biological Processes M10794 GO_SMOOTH_MUSCLE_CELL_DIFFERENTIATION
- "DEG_C5.BP.M13024.txt" MSigDB C5 GO Biological Processes M13024 GO_REGULATION_OF_FIBROBLAST_PROLIFERATION
- "DEG_C5.BP.M10124.txt" MSigDB C5 GO Biological Processes M10124 GO_LEUKOCYTE_ACTIVATION
- Neuroendocrine:
- "EurUrol.2005.NE.txt" Neuroendocrine markers from Table 1 of Eur Urol. 2005 Feb;47(2):147-55
- "genes.deg.NE.csv" DWS generated DEGs of neuroendocrine epithelial cells from scRNA-Seq of patient D17 only
- "" DWS generated DEGs of neuroendocrine epithelial cells from scRNA-Seq of patient D17 only
- Lung epithelia from Lung Gene Expression Analysis (LGEA) Web Portal:
- "Basal cells-signature-genes.csv" scRNA-Seq LGEA generated top 20 DEGs for [human lung Basal Cells](https://research.cchmc.org/pbge/lunggens/lungDisease/celltype_IPF.html?cid=3)
- "Normal AT2 cells-signature-genes.csv" scRNA-Seq LGEA generated top 20 DEGs for human lung Alveolar Type 2 Cells
- "Club_Goblet cells-signature-genes.csv" scRNA-Sequencing LGEA generated top 20 DEGs for human lung Club/Goblet Cells
- General MSigDB