diff --git a/README.md b/README.md
index b36447b266276e1d14d582ca37f11b88b0940a0d..cfb0bb098f004ff6bb307f4e6d24454ab489540e 100755
--- a/README.md
+++ b/README.md
@@ -10,21 +10,16 @@ Determining cellular heterogeneity in the human prostate with single-cell RNA se
 * PI: Douglas W. Strand, PhD
   * <a href="https://orcid.org/0000-0002-0746-927X" target="orcid.widget" rel="noopener noreferrer" style="vertical-align:top;"><img src="https://orcid.org/sites/default/files/images/orcid_16x16.png" style="width:1em;margin-right:.5em;" alt="ORCID iD icon">orcid.org/0000-0002-0746-927X</a>
 * PI Email: [douglas.strand@utsouthwestern.edu](mailto:douglas.strand@utsouthwestern.edu)
+* **ANALYZED DATA FOR QUERYING AT: [StrandLab.net](http://strandlab.net/analysis.php)**
+* **Raw data at: [GEO (scRNA-Seq)](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE117403) & [GEO (popRNA-Seq)](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE117271) & [GenitoUrinary Development Molecular Anatomy Project (GUDMAP)](https://doi.org/10.25548/W-R8CM)**
+* **Publication at: PENDING**
 
 Data Analysis
 -------------
 
 * **Requirements:**
-  * /analysis/DATA/D17PrF/
-    * D17PrF-demultiplex.csv
-    * 10x/
-      * aggregation_csv.csv
-      * GRCh38/
-        * "barcodes.tsv"
-        * "genes.tsv"
-        * "matrix.mtx"
-  * /analysis/DATA/D27PrF/
-    * D27PrF-demultiplex.csv
+  * /analysis/DATA/
+    * Pd-demultiplex.csv
     * 10x/
       * aggregation_csv.csv
       * GRCh38/
@@ -46,12 +41,15 @@ Data Analysis
     * RColorBrewer (v1.1-2)
     * monocle (v2.6.3)
     * dplyr (v0.7.6)
+    * viridis (v0.5.1)
     * *and all dependencies*
 * **HOW TO RUN**
-  * 1 Patient D17PrF
-    * run bash script sc_TissueMapper-D17PrF.sh
-  * 2 Patients D17PrF and D27PrF
-    * run bash script sc_TissueMapper-DPrF2.sh
+  * 1 Run on 3 patient aggregate
+    * run bash script [sc\_TissueMapper\-Pr.sh](https://git.biohpc.swmed.edu/StrandLab/sc-TissueMapper_Pr/blob/master/bash.scripts/sc_TissueMapper-Pd.sh)
+  * 2 Run on 1st patient FACS samples
+    * run bash script [sc\_TissueMapper\-D17\_FACS.sh](https://git.biohpc.swmed.edu/StrandLab/sc-TissueMapper_Pr/blob/master/bash.scripts/sc_TissueMapper-D17_FACS.sh)
+  * 3 Run on 2nd patient FACS samples
+    * run bash script [sc\_TissueMapper\-D27\_FACS.sh](https://git.biohpc.swmed.edu/StrandLab/sc-TissueMapper_Pr/blob/master/bash.scripts/sc_TissueMapper-D27_FACS.sh)
 * **Pipeline:**
   * Link cellranger count/aggr output to analysis
   * Create demultiplex file to add custom sample groups
@@ -61,8 +59,8 @@ Data Analysis
   * Load cellranger data into R/Seurat
   * Label cells based on their cell cycle stated using Seurat based method
   * QC and filter cells/genes
-  * *If combining multiple experiments:* Align experiments using canonical correlation analysis (CCA)
-  * Perform principle component analysis (PCA) using most highly variable genes (HVG) for downstream clustering etc
+  * *If analyzing samples from multiple patients:* Align experiments using canonical correlation analysis (CCA)
+  * *If analyzing samples from one patient:* Perform principal component analysis (PCA) using most highly variable genes (HVG) for downstream clustering etc
   * Perform initial "over" clustering
   * Identify "highly stressed" cells using custom PCA based analysis, remove stressed clusters/cells, and re-cluster
   * Correlate cluster gene expression using Quantitative Set Analysis for Gene Expression (QuSAGE) on lineage genesets for identification (epithelia, and stroma)
@@ -82,36 +80,40 @@ Data Analysis
     * "regev\_lab\_cell\_cycle\_genes.txt" G2M and S phase genes from [*Genome Res. 2015 Dec; 25(12): 1860–1872*](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4665007/)
   * Stress:
     * "DEG\_C2.CGP.M10970.txt" MSigDB C2 Chemical and Genetic Perturbations M10970 [**CHUANG\_OXIDATIVE\_STRESS\_RESPONSE\_UP**](http://software.broadinstitute.org/gsea/msigdb/cards/CHUANG_OXIDATIVE_STRESS_RESPONSE_UP.html)
-    * "genes.deg.Stress.csv" DWS generated DEGs of stressed cells from scRNA-Seq of patient D17 only
-    * "" DWS generated DEGs of stressed cells from scRNA-Seq of an aggregation of patient D17 and D27
+    * "genes.deg.Stress.csv" DWS generated DEGs of stressed cells from scRNA-Seq of 3 patient aggregate
   * Lineage:
     * "DEG\_Epi_5FC.txt" DWS generated DEGs of epithelia from FACS population (bulk) RNA-sequencing
     * "DEG\_FMSt_5FC.txt" DWS generated DEGs of fibromuscular stroma from FACS population (bulk) RNA-sequencing
-    * "genes.deg.Epi.csv" DWS generated DEGs of epithelial cells from scRNA-Seq of patient D17 only
-    * "genes.deg.St.csv" DWS generated DEGs of stromal cells from scRNA-Seq of patient D17 only
-    * "" DWS generated DEGs of epithelial cells from from scRNA-Seq of an aggregation of patient D17 and D27
-    * "" DWS generated DEGs of stromal cells from scRNA-Seq of an aggregation of patient D17 and D27
+    * "genes.deg.Epi.csv" DWS generated DEGs of epithelial cells from scRNA-Seq of 3 patient aggregate
+    * "genes.deg.St.csv" DWS generated DEGs of stromal cells from scRNA-Seq of 3 patient aggregate
   * Epithelia:
     * "DEG\_BE_5FC.txt" DWS generated DEGs of basal epithelia from FACS population (bulk) RNA-sequencing
     * "DEG\_LE_5FC.txt" DWS generated DEGs of luminal epithelia from FACS population (bulk) RNA-sequencing
     * "DEG\_OE_5FC.txt" DWS generated DEGs of "other" epithelia from FACS population (bulk) RNA-sequencing
-    * "genes.deg.BE.csv" DWS generated DEGs of basal epithelial cells from scRNA-Seq of patient D17 only
-    * "genes.deg.LE.csv" DWS generated DEGs of luminal epithelial cells from scRNA-Seq of patient D17 only
-    * "genes.deg.OE1.csv" DWS generated DEGs of "other" epithelia cluster 1 cells from scRNA-Seq of patient D17 only
-    * "genes.deg.OE2.csv" DWS generated DEGs of "other" epithelia cluster 2 cells from scRNA-Seq of patient D17 only
+    * "genes.deg.BE.csv" DWS generated DEGs of basal epithelial cells from scRNA-Seq of 3 patient aggregate
+    * "genes.deg.LE.csv" DWS generated DEGs of luminal epithelial cells from scRNA-Seq of 3 patient aggregate
+    * "genes.deg.OE1.csv" DWS generated DEGs of "other" epithelia cluster 1 cells from scRNA-Seq of 3 patient aggregate
+    * "genes.deg.OE2.csv" DWS generated DEGs of "other" epithelia cluster 2 cells from scRNA-Seq of 3 patient aggregate
   * Stroma:
     * "DEG\_C5.BP.M11704.txt" MSigDB C5 GO Biological Processes M11704 [**GO\_ENDOTHELIAL\_CELL\_DIFFERENTIATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_ENDOTHELIAL_CELL_DIFFERENTIATION.html)
     * "DEG\_C5.BP.M10794.txt" MSigDB C5 GO Biological Processes M10794 [**GO\_SMOOTH\_MUSCLE\_CELL\_DIFFERENTIATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_SMOOTH_MUSCLE_CELL_DIFFERENTIATION.html)
     * "DEG\_C5.BP.M13024.txt" MSigDB C5 GO Biological Processes M13024 [**GO\_REGULATION\_OF\_FIBROBLAST\_PROLIFERATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_REGULATION_OF_FIBROBLAST_PROLIFERATION.html)
     * "DEG\_C5.BP.M10124.txt" MSigDB C5 GO Biological Processes M10124 [**GO\_LEUKOCYTE\_ACTIVATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_LEUKOCYTE_ACTIVATION.html)
+    * "genes.deg.Fib.csv" DWS generated DEGs of fibroblast cells from scRNA-Seq of 3 patient aggregate
+    * "genes.deg.SM.csv" DWS generated DEGs of smooth muscle cells from scRNA-Seq of 3 patient aggregate
+    * "genes.deg.Endo.csv" DWS generated DEGs of endothelial cells from scRNA-Seq of 3 patient aggregate
+    * "genes.deg.Leu.csv" DWS generated DEGs of leukocyte cells from scRNA-Seq of 3 patient aggregate
   * Neuroendocrine:
     * "EurUrol.2005.NE.txt" Neuroendocrine markers from Table 1 of [*Eur Urol. 2005 Feb;47(2):147-55*](https://www.ncbi.nlm.nih.gov/pubmed/15661408)
-    * "genes.deg.NE.csv" DWS generated DEGs of neuroendocrine epithelial cells from scRNA-Seq of patient D17 only
-    * "" DWS generated DEGs of neuroendocrine epithelial cells from scRNA-Seq of patient D17 only
+    * "genes.deg.NE.csv" DWS generated DEGs of neuroendocrine epithelial cells from scRNA-Seq of 3 patient aggregate
   * Lung epithelia from [Lung Gene Expression Analysis (LGEA) Web Portal](https://research.cchmc.org/pbge/lunggens/mainportal.html):
     * "Basal cells-signature-genes.csv" scRNA-Seq LGEA generated top 20 DEGs for [human lung Basal Cells] (https://research.cchmc.org/pbge/lunggens/lungDisease/celltype_IPF.html?cid=3)
     * "Normal AT2 cells-signature-genes.csv" scRNA-Sequecing LGEA generated top 20 DEGs for [human lung Alveolar Type 2 Cells](https://research.cchmc.org/pbge/lunggens/lungDisease/celltype_IPF.html?cid=1)
     * "Club\_Goblet cells-signature-genes.csv" scRNA-Sequencing LGEA generated top 20 DEGs for [human lung Club/Goblet Cells](https://research.cchmc.org/pbge/lunggens/lungDisease/celltype_IPF.html?cid=4)
+  * Lung epithelia from [*Nature 2018 Aug;560(7718):319*](https://doi.org/10.1038/s41586-018-0393-7)
+    * "SupTab3\_Consensus\_Sigs.csv" scRNA-Sequencing DEGs of mouse basal, club, ciliated, tuft, neuroendocrine, ionocyte cells (Supplementary Tables, SupTab3\_Consensus\_Sigs)
+    * "SupTab6\_Krt13\_Hillock.csv" scRNA-Sequencing DEGs of mouse Krt13+ hillock cells (Supplementary Tables, SupTab6\_Krt13\_Hillock)
+    * "Ensemble.mus-hum.txt" Ensembl export of mouse (GRCm38.p6) to human ortholog mapping
   * General MSigDb
     * "c2.all.v6.1.symbols.gmt" MSigDB C2 Curated Gene Sets [**MSigDB C2**](http://software.broadinstitute.org/gsea/msigdb/genesets.jsp?collection=C2)
     * "c2.cp.kegg.v6.1.symbols" MSigDB C2 KEGG Gene Subsets [**KEGG**](http://software.broadinstitute.org/gsea/msigdb/genesets.jsp?collection=CP:KEGG)