Commit 1838a861 authored by Gervaise Henry
Merge branch 'master' of git.biohpc.swmed.edu:StrandLab/sc-TissueMapper_Pr into test.LineageSubClust_branch
parents 438822c4 c743dd80
analysis/
.vscode/
*.err
*.out
*.Rhistory
......
README.md 100644 → 100755
Determining cellular heterogeneity in the human prostate with single-cell RNA sequencing
========================================================================================
* Contact: **Gervaise H. Henry**
* Email: [gervaise.henry@utsouthwestern.edu](mailto:gervaise.henry@utsouthwestern.edu)
* Institution: UT Southwestern Medical Center
* Department: Urology
* Lab: Strand Lab
* PI: Douglas W. Strand, PhD
* PI Email: [douglas.strand@utsouthwestern.edu](mailto:douglas.strand@utsouthwestern.edu)
Data Analysis
-------------
* **Requirements**:
  * 10x CellRanger analyzed data:
    * "filtered\_gene\_bc\_matrices\_mex" folder
    * csv file cellranger aggr used for aggregation
  * demultiplex csv file to define subsets of samples
* **Pipeline** (r.scripts):
  * sc_Demultiplex (import 10x Cell Ranger data into R and subset samples based on user input)
  * sc_SeuratScore.CellCycle (identify cell cycle state)
  * sc_QC (filter cells; scale and remove variation associated with UMI count, % mitochondrial genes, and S and G2M phase scores; identify the most variable genes and run PCA on them)
  * sc_Cluster (perform tSNE and cluster using a graph-based approach)
  * sc_PC.Score.Stress (PCA analysis of stress gene signature; projection onto PC1 used as "Stress Score"; stressed cells and clusters removed and the remaining cells re-clustered)
  * sc_QuSAGE.Lineage (clusters correlated to prostate population RNA-Seq DEGs of Epithelia and Fibromuscular Stroma to assign those identities to those clusters)
  * sc_QuSAGE.Epi (Epithelial clusters correlated to prostate population RNA-Seq DEGs of Basal, Luminal, and "Other" Epithelia to assign those identities to those clusters)
  * sc_QuSAGE.St (Stromal clusters correlated to external DEGs of Endothelial Cells, Smooth Muscle Cells, and Fibroblasts to assign those identities to those clusters)
  * sc_PC.Score.NE (PCA analysis of Neuroendocrine Epithelia protein marker signature; projection onto PC1 used as "NE Score"; highest scoring cells identified as Neuroendocrine Cells)
  * sc_QuSAGE.LGEA.Epi (Epithelial clusters correlated to top 20 DEGs from scRNA-seq of human lung Basal Cells, Alveolar Type 2 Cells, and Club/Goblet Cells to determine similarity to lung epithelial cell types)
  * sc_QuSAGE.LGEA.St (Stromal clusters correlated to human orthologs of mouse lung stromal DEGs from scRNA-seq to determine similarity to:)
    * Proliferative Fibroblasts
    * Myofibroblast/Smooth Muscle-Like Cells
    * Matrix Fibroblasts
    * Endothelial Cells
    * Myeloid/Immune Cells
  * sc_DEG (generate DEG lists between important cell types and predict surface and nuclear markers from them)
  * sc_Table (produce tables of population differences between samples)
* **Genesets**:
  * "regev\_lab\_cell\_cycle\_genes.txt" G2M and S phase genes from [*Genome Res. 2015 Dec; 25(12): 1860–1872*](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4665007/)
  * "DEG\_C2.CGP.M10970.txt" MSigDB C2 Chemical and Genetic Perturbations M10970 [**CHUANG\_OXIDATIVE\_STRESS\_RESPONSE\_UP**](http://software.broadinstitute.org/gsea/msigdb/cards/CHUANG_OXIDATIVE_STRESS_RESPONSE_UP.html)
  * "DEG\_Epi_2FC.txt" DWS generated DEGs of Epithelia from FACS population (bulk) RNA-sequencing
  * "DEG\_FMSt_2FC.txt" DWS generated DEGs of Fibromuscular Stroma from FACS population (bulk) RNA-sequencing
  * "DEG\_BE_2FC.txt" DWS generated DEGs of Basal Epithelia from FACS population (bulk) RNA-sequencing
  * "DEG\_LE_2FC.txt" DWS generated DEGs of Luminal Epithelia from FACS population (bulk) RNA-sequencing
  * "DEG\_OE_2FC.txt" DWS generated DEGs of "Other" Epithelia from FACS population (bulk) RNA-sequencing
  * "DEG\_C5.BP.M11704.txt" MSigDB C5 GO Biological Processes M11704 [**GO\_ENDOTHELIAL\_CELL\_DIFFERENTIATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_ENDOTHELIAL_CELL_DIFFERENTIATION.html)
  * "DEG\_C5.BP.M10794.txt" MSigDB C5 GO Biological Processes M10794 [**GO\_SMOOTH\_MUSCLE\_CELL\_DIFFERENTIATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_SMOOTH_MUSCLE_CELL_DIFFERENTIATION.html)
  * "DEG\_C5.BP.M13024.txt" MSigDB C5 GO Biological Processes M13024 [**GO\_REGULATION\_OF\_FIBROBLAST\_PROLIFERATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_REGULATION_OF_FIBROBLAST_PROLIFERATION.html)
  * "EurUrol.2005.NE.txt" Neuroendocrine markers from Table 1 of [*Eur Urol. 2005 Feb;47(2):147-55*](https://www.ncbi.nlm.nih.gov/pubmed/15661408)
  * "Basal cells-signature-genes.csv" scRNA-Sequencing Lung Map generated top 20 DEGs for human lung Basal Cells
  * "Normal AT2 cells-signature-genes.csv" scRNA-Sequencing Lung Map generated top 20 DEGs for human lung Alveolar Type 2 Cells
  * "Club\_Goblet cells-signature-genes.csv" scRNA-Sequencing Lung Map generated top 20 DEGs for human lung Club/Goblet Cells
  * "journal.pcbi.1004575.s026.XLSX" scRNA-Sequencing Lung Map generated DEGs for E16.5 mouse lung stromal subtypes (genes converted to human orthologs with Ensembl):
    * Proliferative Fibroblasts
    * Myofibroblast/Smooth Muscle-like Cells
    * Pericytes, Matrix Fibroblasts
    * Endothelial Cells
    * Myeloid/Immune Cells
  * ["DWS.scStress.txt"](genesets/DWS.scStress.txt) DWS generated DEGs of Stressed Cells from scRNA-Sequencing
  * ["DWS.scNE.txt"](genesets/DWS.scNE.txt) DWS generated DEGs of Neuroendocrine Cells from scRNA-Sequencing
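
The "Stress Score" and "NE Score" steps in the pipeline both describe running PCA on a gene signature and using each cell's projection onto PC1 as its score. The following is a minimal base-R sketch of that idea, not the code in sc_PC.Score.Stress or sc_PC.Score.NE. The simulated expression matrix, the gene and cell names, the sign-orientation step, and the 80th-percentile cutoff are all assumptions made for illustration; the README does not state how the scripts pick a threshold.

```r
# Hypothetical data: 50 cells, 5 signature genes, the last 10 cells "stressed".
set.seed(1)
n.cells <- 50
signature <- paste0("gene", 1:5)
activity <- c(rep(0, 40), rep(3, 10))

# Cells in rows, signature genes in columns.
expr <- sapply(signature, function(g) activity + rnorm(n.cells, sd = 0.5))
rownames(expr) <- paste0("cell", seq_len(n.cells))

# PCA restricted to the signature genes; the PC1 projection is the score.
pca <- prcomp(expr, center = TRUE, scale. = TRUE)
score <- pca$x[, "PC1"]

# The sign of a principal component is arbitrary, so orient the score so that
# higher values mean higher average signature expression.
if (cor(score, rowMeans(expr)) < 0) score <- -score

# Assumed cutoff: flag cells above the 80th percentile of the score.
flagged <- names(score)[score > quantile(score, 0.8)]
```

Orienting the score matters in practice because `prcomp` may return PC1 with either sign, which would otherwise flip which cells count as high scoring.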
......@@ -6,7 +6,7 @@ library(readr)
#retrive command line options
option_list=list(
make_option("--p",action="store",default="Pr",type='character',help="Project Name"),
make_option("--p",action="store",default="ALL",type='character',help="Project Name"),
make_option("--d",action="store",default=4,type='integer',help="Demultiplex Group Number"),
make_option("--mc",action="store",default=3,type='integer',help="Minimum Cells"),
make_option("--mg",action="store",default=200,type='integer',help="Minimum Genes")
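
The `--mc` (Minimum Cells) and `--mg` (Minimum Genes) options are later passed to `CreateSeuratObject` as `min.cells` and `min.genes`. As a rough illustration of what thresholds of this kind do, the base-R sketch below filters a small count matrix: cells expressing too few genes are dropped, then genes detected in too few cells are dropped. The function `filter_counts`, the toy matrix, and the choice of order and of `>=` comparisons are assumptions for this example; Seurat's own implementation may differ in those details.

```r
# Hypothetical helper: genes in rows, cells in columns.
filter_counts <- function(counts, min.cells, min.genes) {
  # Drop cells with fewer than min.genes detected genes.
  genes.per.cell <- colSums(counts > 0)
  counts <- counts[, genes.per.cell >= min.genes, drop = FALSE]
  # Drop genes detected in fewer than min.cells of the remaining cells.
  cells.per.gene <- rowSums(counts > 0)
  counts[cells.per.gene >= min.cells, , drop = FALSE]
}

# Toy count matrix (4 genes x 4 cells).
counts <- matrix(
  c(1, 2, 0, 3,
    0, 1, 0, 1,
    5, 0, 0, 0,
    1, 1, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), paste0("cell", 1:4))
)

filtered <- filter_counts(counts, min.cells = 2, min.genes = 2)
```

With these toy values, `cell3` expresses only one gene and is removed, after which `geneC` is detected in a single remaining cell and is removed as well.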
......@@ -24,7 +24,7 @@ cell.codes <- as.data.frame(sc10x@raw.data@Dimnames[[2]])
colnames(cell.codes) <- "barcodes"
rownames(cell.codes) <- cell.codes$barcodes
cell.codes$lib.codes <- as.factor(gsub(pattern=".+-",replacement="",cell.codes$barcodes))
cell.codes$samples <- as.vector(sc10x.aggr$library_id[cell.codes$lib.codes])
cell.codes$samples <- sc10x.aggr$library_id[match(cell.codes$lib.codes,as.numeric(rownames(sc10x.aggr)))]
sc10x <- CreateSeuratObject(raw.data=sc10x.data,meta.data=cell.codes["samples"],min.cells=opt$mc,min.genes=opt$mg,project=Project.Name)
rm(sc10x.data)
rm(cell.codes)
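
One likely motivation for replacing the `cell.codes$samples` line is that indexing a vector with a factor uses the factor's integer level codes, and the levels of a factor built from strings sort alphabetically ("1", "10", "2", ...). With ten or more libraries, the factor code no longer equals the library number. The sketch below reproduces both versions on made-up data; the `aggr` data frame, the barcodes, and the sample names are hypothetical and stand in for `sc10x.aggr` and the real barcodes.

```r
# Hypothetical aggregation table: row i describes library i.
aggr <- data.frame(library_id = paste0("S", 1:12), stringsAsFactors = FALSE)

# Hypothetical barcodes with the library number as the suffix.
barcodes <- c("AAAC-1", "AAAG-2", "AACT-10", "AAGT-12")
lib.codes <- as.factor(gsub(pattern = ".+-", replacement = "", barcodes))

# Old approach: the factor is used as an index, so its level codes
# (alphabetical order of "1", "10", "12", "2") select the rows.
by.factor <- as.vector(aggr$library_id[lib.codes])

# New approach: look up each library number among the row numbers.
by.match <- aggr$library_id[match(lib.codes, as.numeric(rownames(aggr)))]

print(data.frame(barcodes, by.factor, by.match))
```

In this example only the `match` version assigns `S2`, `S10`, and `S12` to the barcodes ending in 2, 10, and 12; the factor-indexed version agrees with it only for library 1.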