Commit 2fcf9963 authored by Gervaise H. Henry

Update README.md

parent e1852ac3
Generalized Code for the Analysis of scRNA-seq Data
===================================================
Strand Lab analysis of single-cell RNA sequencing
=================================================
* Contact: **Gervaise H. Henry**
* <a href="https://orcid.org/0000-0001-7772-9578" target="orcid.widget" rel="noopener noreferrer" style="vertical-align:top;"><img src="https://orcid.org/sites/default/files/images/orcid_16x16.png" style="width:1em;margin-right:.5em;" alt="ORCID iD icon">orcid.org/0000-0001-7772-9578</a>
@@ -18,93 +18,33 @@ Data Analysis
* /analysis/DATA/
* **ProjectName**-demultiplex.csv
* 10x/
* aggregation_csv.csv
* GRCh38/
* "barcodes.tsv"
* "genes.tsv"
* "matrix.mtx"
* **SampleName**/
* filtered_feature_bc_matrix/
* `barcodes.tsv.gz`
* `features.tsv.gz`
* `matrix.mtx.gz`
* 10x CellRanger analyzed data:
* "filtered\_gene\_bc\_matrices\_mex" folder
* csv file used by cellranger aggr for aggregation
* "filtered\_feature\_bc\_matrix" folder
* demultiplex csv file to define subsets of samples
* R (v3.4.1) packages:
* methods (v3.4.1)
* optparse (v1.4.4)
* Seurat (v2.3.1)
* readr (v1.1.1)
* fBasics (v3042.89)
* pastecs (v1.3.21)
* qusage (v2.10.0)
* RColorBrewer (v1.1-2)
* monocle (v2.6.3)
* dplyr (v0.7.6)
* viridis (v0.5.1)
* *and all dependencies*
* **Pipeline:**
* Link cellranger count/aggr output to analysis
* Create demultiplex file to add custom sample groups
* Load R packages
* Create analysis folders
* Load analysis parameters (from default or overwrite from command line)
* Load cellranger data into R/Seurat
* Label cells by cell cycle state using a Seurat-based method
* QC and filter cells/genes
* *If analyzing samples from multiple patients:* Align experiments using canonical correlation analysis (CCA)
* *If analyzing samples from one patient:* Perform principal component analysis (PCA) using the most highly variable genes (HVG) for downstream clustering, etc.
* Perform initial "over" clustering
* Identify "highly stressed" cells using custom PCA based analysis, remove stressed clusters/cells, and re-cluster
* Correlate cluster gene expression using Quantitative Set Analysis for Gene Expression (QuSAGE) on lineage genesets for identification (epithelia and stroma)
* Subset epithelia from stroma for additional analysis
* Re-cluster cell types separately
* Correlate cluster gene expression using QuSAGE on epithelial subtype genesets for identification (basal, luminal and "other")
* Correlate cluster gene expression using QuSAGE on stromal subtype genesets for identification (fibroblasts, smooth muscle, endothelia and leukocyte)
* *Optional:* Correlate cluster gene expression using QuSAGE on additional genesets for analysis
* Merge epithelial and stromal cells
* Identify neuroendocrine cells from epithelial cells using custom PCA based analysis
* Tabulate population cell numbers
* Generate differentially expressed genelists (DEGs) of populations
* **Genesets:**
* Cell cycle:
* "regev\_lab\_cell\_cycle\_genes.txt" G2M and S phase genes from [*Genome Res. 2015 Dec; 25(12): 1860–1872*](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4665007/)
* Stress:
* "DEG\_C2.CGP.M10970.txt" MSigDB C2 Chemical and Genetic Perturbations M10970 [**CHUANG\_OXIDATIVE\_STRESS\_RESPONSE\_UP**](http://software.broadinstitute.org/gsea/msigdb/cards/CHUANG_OXIDATIVE_STRESS_RESPONSE_UP.html)
* "genes.deg.Stress.csv" DWS generated DEGs of stressed cells from scRNA-Seq of 3 patient aggregate
* Lineage:
* "DEG\_Epi_5FC.txt" DWS generated DEGs of epithelia from FACS population (bulk) RNA-sequencing
* "DEG\_FMSt_5FC.txt" DWS generated DEGs of fibromuscular stroma from FACS population (bulk) RNA-sequencing
* Epithelia:
* "DEG\_BE_5FC.txt" DWS generated DEGs of basal epithelia from FACS population (bulk) RNA-sequencing
* "DEG\_LE_5FC.txt" DWS generated DEGs of luminal epithelia from FACS population (bulk) RNA-sequencing
* "DEG\_OE_5FC.txt" DWS generated DEGs of "other" epithelia from FACS population (bulk) RNA-sequencing
* "genes.deg.BE.csv" DWS generated DEGs of basal epithelial cells from scRNA-Seq of 3 patient aggregate
* "genes.deg.LE.csv" DWS generated DEGs of luminal epithelial cells from scRNA-Seq of 3 patient aggregate
* "genes.deg.OE1.csv" DWS generated DEGs of "other" epithelia cluster 1 cells from scRNA-Seq of 3 patient aggregate
* "genes.deg.OE2.csv" DWS generated DEGs of "other" epithelia cluster 2 cells from scRNA-Seq of 3 patient aggregate
* Stroma:
* "DEG\_C5.BP.M11704.txt" MSigDB C5 GO Biological Processes M11704 [**GO\_ENDOTHELIAL\_CELL\_DIFFERENTIATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_ENDOTHELIAL_CELL_DIFFERENTIATION.html)
* "DEG\_C5.BP.M10794.txt" MSigDB C5 GO Biological Processes M10794 [**GO\_SMOOTH\_MUSCLE\_CELL\_DIFFERENTIATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_SMOOTH_MUSCLE_CELL_DIFFERENTIATION.html)
* "DEG\_C5.BP.M13024.txt" MSigDB C5 GO Biological Processes M13024 [**GO\_REGULATION\_OF\_FIBROBLAST\_PROLIFERATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_REGULATION_OF_FIBROBLAST_PROLIFERATION.html)
* "DEG\_C5.BP.M10124.txt" MSigDB C5 GO Biological Processes M10124 [**GO\_LEUKOCYTE\_ACTIVATION**](http://software.broadinstitute.org/gsea/msigdb/cards/GO_LEUKOCYTE_ACTIVATION.html)
* "genes.deg.Fib.csv" DWS generated DEGs of fibroblast cells from scRNA-Seq of 3 patient aggregate
* "genes.deg.SM.csv" DWS generated DEGs of smooth muscle cells from scRNA-Seq of 3 patient aggregate
* "genes.deg.Endo.csv" DWS generated DEGs of endothelial cells from scRNA-Seq of 3 patient aggregate
* "genes.deg.Leu.csv" DWS generated DEGs of leukocyte cells from scRNA-Seq of 3 patient aggregate
* Neuroendocrine:
* "EurUrol.2005.NE.txt" Neuroendocrine markers from Table 1 of [*Eur Urol. 2005 Feb;47(2):147-55*](https://www.ncbi.nlm.nih.gov/pubmed/15661408)
* "genes.deg.NE.csv" DWS generated DEGs of neuroendocrine epithelial cells from scRNA-Seq of 3 patient aggregate
* Lung epithelia from [Lung Gene Expression Analysis (LGEA) Web Portal](https://research.cchmc.org/pbge/lunggens/mainportal.html):
* "Basal cells-signature-genes.csv" scRNA-Sequencing LGEA generated top 20 DEGs for [human lung Basal Cells](https://research.cchmc.org/pbge/lunggens/lungDisease/celltype_IPF.html?cid=3)
* "Normal AT2 cells-signature-genes.csv" scRNA-Sequencing LGEA generated top 20 DEGs for [human lung Alveolar Type 2 Cells](https://research.cchmc.org/pbge/lunggens/lungDisease/celltype_IPF.html?cid=1)
* "Club\_Goblet cells-signature-genes.csv" scRNA-Sequencing LGEA generated top 20 DEGs for [human lung Club/Goblet Cells](https://research.cchmc.org/pbge/lunggens/lungDisease/celltype_IPF.html?cid=4)
* Lung epithelia from [*Nature 2018 Aug;560(7718):319*](https://doi.org/10.1038/s41586-018-0393-7):
* "SupTab3\_Consensus\_Sigs.csv" scRNA-Sequencing DEGs of mouse basal, club, ciliated, tuft, neuroendocrine, ionocyte cells (Supplementary Tables, SupTab3\_Consensus\_Sigs)
* "SupTab6\_Krt13\_Hillock.csv" scRNA-Sequencing DEGs of mouse Krt13+ hillock cells (Supplementary Tables, SupTab6\_Krt13\_Hillock)
* "Ensemble.mus-hum.txt" Ensembl export of mouse (GRCm38.p6) to human ortholog mapping
* General MSigDB:
* "c2.all.v6.1.symbols.gmt" MSigDB C2 Curated Gene Sets [**MSigDB C2**](http://software.broadinstitute.org/gsea/msigdb/genesets.jsp?collection=C2)
* "c2.cp.kegg.v6.1.symbols" MSigDB C2 KEGG Gene Subsets [**KEGG**](http://software.broadinstitute.org/gsea/msigdb/genesets.jsp?collection=CP:KEGG)
* "c5.all.v6.1.symbols.gmt" MSigDB C5 Gene Ontology Gene Sets [**MSigDB C5**](http://software.broadinstitute.org/gsea/msigdb/genesets.jsp?collection=C5)
* "c5.bp.v6.1.symbols.gmt" MSigDB C5 Gene Ontology Biological Processes Gene Subsets [**GO BP**](http://software.broadinstitute.org/gsea/msigdb/genesets.jsp?collection=BP)
* R packages:
* methods
* optparse
* Seurat
* readr
* readxl
* fBasics
* pastecs
* qusage
* RColorBrewer
* monocle
* dplyr
* viridis
* gridExtra
* SingleR
* sctransform
* autothresholdr
* ggplot2
* cowplot
* scales
* ComplexHeatmap
* *and all dependencies*
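
The MSigDB files listed under the genesets (the `.gmt` files) use the tab-separated GMT layout: one gene set per line, with the set name, a description field, and then the gene symbols. The pipeline itself runs in R, so the following Python snippet is only a minimal sketch of that file layout and is not part of the Strand Lab code. The function name `read_gmt` and the demo set names are hypothetical, chosen for illustration.

```python
import os
import tempfile


def read_gmt(path):
    """Parse a GMT file into a dict of {gene_set_name: [gene symbols]}."""
    gene_sets = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) < 3:
                continue  # skip blank lines and lines without any genes
            name, _description, *genes = fields
            # Drop empty fields left by trailing tabs.
            gene_sets[name] = [gene for gene in genes if gene]
    return gene_sets


if __name__ == "__main__":
    # Two made-up gene sets, written to a temporary file for the demo.
    demo = (
        "HYPOTHETICAL_SET_A\tna\tKRT5\tKRT15\tTP63\n"
        "HYPOTHETICAL_SET_B\tna\tKRT8\tKRT18\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "demo.gmt")
        with open(demo_path, "w", encoding="utf-8") as handle:
            handle.write(demo)
        for set_name, genes in read_gmt(demo_path).items():
            print(set_name, len(genes), genes)
```

Pointing `read_gmt` at a file such as "c2.all.v6.1.symbols.gmt" would give one dictionary entry per gene set, which is a quick way to check that a geneset file is intact before handing it to the R pipeline.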