Commit 545cc57e authored by Gervaise H. Henry

Merge branch 'Seurat3.0' into 'develop'

Seurat3.0

See merge request !2
parents dab5ccad 801bdaa5
genesets/scDWSpr.Rda filter=lfs diff=lfs merge=lfs -text
genesets/scDWShuPr.rda filter=lfs diff=lfs merge=lfs -text
genesets/Pd.shallow.Rda filter=lfs diff=lfs merge=lfs -text
genesets/sc10x.epi.id.rda filter=lfs diff=lfs merge=lfs -text
analysis/*
!.gitkeep
.vscode/
WR/
analysis/*
analysis/DATA/*
!analysis/DATA/
*.err
*.out
*.Rhistory
*.Rda
*.RData
*~
# Windows thumbnail cache files
Thumbs.db
ehthumbs.db
ehthumbs_vista.db
# Dump file
*.stackdump
# Folder config file
[Dd]esktop.ini
# Recycle Bin used on file shares
$RECYCLE.BIN/
# Windows Installer files
*.cab
*.msi
*.msix
*.msm
*.msp
# Windows shortcuts
*.lnk
*.tmp
# Word temporary
~$*.doc*
# Word Auto Backup File
Backup of *.doc*
# Excel temporary
~$*.xls*
# Excel Backup File
*.xlk
# PowerPoint temporary
~$*.ppt*
# Visio autosave temporary files
*.~vsd*
*~
# temporary files which can be created if a process still has a handle open of a deleted file
.fuse_hidden*
# KDE directory preferences
.directory
# Linux trash folder which might appear on any partition or disk
.Trash-*
# .nfs files are created when an open file is removed but is still being accessed
.nfs*
.vscode/*
!.vscode/settings.json
!.vscode/tasks.json
!.vscode/launch.json
!.vscode/extensions.json
# History files
.Rhistory
.Rapp.history
# Session Data files
.RData
# Example code in package build process
*-Ex.R
# Output files from R CMD build
/*.tar.gz
# Output files from R CMD check
/*.Rcheck/
# RStudio files
.Rproj.user/
# produced vignettes
vignettes/*.html
vignettes/*.pdf
# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
.httr-oauth
# knitr and R markdown default cache directories
/*_cache/
/cache/
# Temporary files created by R markdown
*.utf8.md
*.knit.md
# Shiny token, see https://shiny.rstudio.com/articles/shinyapps.html
rsconnect/
temp_png.png
Rplots.pdf
stress/*
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
.python-version
# celery beat schedule file
celerybeat-schedule
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
Determining cellular heterogeneity in the human prostate with single-cell RNA sequencing
========================================================================================
Generalized Code for the Analysis of scRNA-seq Data
===================================================
* Contact: **Gervaise H. Henry**
* <a href="https://orcid.org/0000-0001-7772-9578" target="orcid.widget" rel="noopener noreferrer" style="vertical-align:top;"><img src="https://orcid.org/sites/default/files/images/orcid_16x16.png" style="width:1em;margin-right:.5em;" alt="ORCID iD icon">orcid.org/0000-0001-7772-9578</a>
......@@ -10,18 +10,13 @@ Determining cellular heterogeneity in the human prostate with single-cell RNA se
* PI: Douglas W. Strand, PhD
* <a href="https://orcid.org/0000-0002-0746-927X" target="orcid.widget" rel="noopener noreferrer" style="vertical-align:top;"><img src="https://orcid.org/sites/default/files/images/orcid_16x16.png" style="width:1em;margin-right:.5em;" alt="ORCID iD icon">orcid.org/0000-0002-0746-927X</a>
* PI Email: [douglas.strand@utsouthwestern.edu](mailto:douglas.strand@utsouthwestern.edu)
* **ANALYZED DATA FOR QUERYING AT: [StrandLab.net](http://strandlab.net/analysis.php)**
* **Raw data at: [GEO](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE120716) & [GenitoUrinary Development Molecular Anatomy Project (GUDMAP)](https://doi.org/10.25548/W-R8CM)**
* **Publication at:**
* Cell Reports: PENDING
* [BioRxiv](https://www.biorxiv.org/content/early/2018/10/15/439935)
Data Analysis
-------------
* **Requirements:**
* /analysis/DATA/
* Pd-demultiplex.csv
* **ProjectName**-demultiplex.csv
* 10x/
* aggregation_csv.csv
* GRCh38/
......@@ -45,17 +40,7 @@ Data Analysis
* dplyr (v0.7.6)
* viridis (v0.5.1)
* *and all dependencies*
* **HOW TO RUN**
* 1 Run on 3 patient aggregate
* run bash script [sc\_TissueMapper\-Pd.sh](https://git.biohpc.swmed.edu/StrandLab/sc-TissueMapper_Pr/blob/master/bash.scripts/sc_TissueMapper-Pd.sh)
* 2 Run on 1st patient FACS samples
* run bash script [sc\_TissueMapper\-D17\_FACS.sh](https://git.biohpc.swmed.edu/StrandLab/sc-TissueMapper_Pr/blob/master/bash.scripts/sc_TissueMapper-D17_FACS.sh)
* 3 Run on 2nd patient FACS samples
* run bash script [sc\_TissueMapper\-D27\_FACS.sh](https://git.biohpc.swmed.edu/StrandLab/sc-TissueMapper_Pr/blob/master/bash.scripts/sc_TissueMapper-D27_FACS.sh)
* 4 Run on several downsamples from 1 sample from 1st patient
* run bash script [sc\_TissueMapper\-DS\_D17.sh](https://git.biohpc.swmed.edu/StrandLab/sc-TissueMapper_Pr/blob/master/bash.scripts/sc_TissueMapper-DS_D17.sh)
* 5 Aggregate and compare the downsamples from step 4
* run r script [sc\_TissueMapper\_RUN.DS\_D17.aggr.R](https://git.biohpc.swmed.edu/StrandLab/sc-TissueMapper_Pr/blob/master/r.scripts/sc_TissueMapper_RUN.DS_D17.aggr.R)
* **Pipeline:**
* Link cellranger count/aggr output to analysis
* Create demultiplex file to add custom sample groups
......
#!/bin/bash
#SBATCH --job-name sc10x.dispatch
#SBATCH -p 256GB,256GBv1,384GB
#SBATCH -N 1
#SBATCH -t 7-0:0:0
#SBATCH -o job_%j.out
#SBATCH -e job_%j.out
#SBATCH --mail-type ALL
#SBATCH --mail-user gervaise.henry@utsouthwestern.edu
git pull origin Seurat3.0
rm ../analysis/*.rda
rm ../analysis/*.RData
rm -r ../analysis/cor/
rm -r ../analysis/qc/
rm -r ../analysis/score_id/
rm -r ../analysis/shiny/
rm -r ../analysis/vis/
module load python/3.6.4-anaconda
source activate umap
module load R/3.5.1-gccmkl
module load hdf5_18/1.8.17
Rscript ../r.scripts/sc-TissueMapper_RUN.R --p "$1" --s "$2"
module unload R/3.5.1-gccmkl
module load R/3.6.1-gccmkl
if [[ "$1" == "PdPgb" ]] || [[ "$1" == "PdPb" ]]
then
Rscript ../r.scripts/SingleR.R --p "$1" --s "$2" --o "$3"
elif [[ "$3" == "epi" ]] || [[ "$3" == "fmst" ]]
then
Rscript ../r.scripts/huPr_muPr.R --p "$1" --r "$3"
fi
module unload R/3.5.1-gccmkl
module load R/3.6.1-gccmkl
if [[ "$1" == "PdPgb" ]]
then
Rscript ../r.scripts/diy_PdPgb.R
elif [[ "$3" == "epi" ]]
then
Rscript ../r.scripts/diy_muPrUr_Epi.R
fi
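The first branch of the dispatch script above can be read as a small decision table. The following is a minimal sketch, not code from the repository: the helper name `pick_id_script` and the argument values `all` and `muPrUr` are assumptions made for illustration. Only the `PdPgb`/`PdPb` and `epi`/`fmst` comparisons and the two script names come from the script itself.

```shell
#!/bin/bash
# Hypothetical helper (not part of the repository): prints the name of the
# identification script the dispatch logic would run, given the project
# (first positional argument of the dispatch script) and the subset (third).
pick_id_script() {
    project="$1"
    subset="$2"
    if [ "$project" = "PdPgb" ] || [ "$project" = "PdPb" ]; then
        # Human projects go through SingleR
        echo "SingleR.R"
    elif [ "$subset" = "epi" ] || [ "$subset" = "fmst" ]; then
        # Other projects are compared against the human prostate reference
        echo "huPr_muPr.R"
    else
        # Neither condition matches: no identification step runs
        echo "none"
    fi
}

# "all" and "muPrUr" are placeholder values chosen for this example
pick_id_script PdPgb all
pick_id_script muPrUr epi
```

Because the project check comes first, a `PdPgb` or `PdPb` run never reaches the `epi`/`fmst` branch, whatever the third argument is.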
#!/bin/bash
#SBATCH --job-name R_FullAnalysis.D27FACS
#SBATCH -p 256GB,256GBv1,384GB
#SBATCH --job-name sc10x.id
#SBATCH -p 128GB,256GB,256GBv1,384GB
#SBATCH -N 1
#SBATCH -t 7-0:0:0
#SBATCH -o job_%j.out
......@@ -8,8 +8,9 @@
#SBATCH --mail-type ALL
#SBATCH --mail-user gervaise.henry@utsouthwestern.edu
module load R/3.4.1-gccmkl
sh ./sc_LinkData.sh D27_FACS
Rscript ../r.scripts/sc-TissueMapper_RUN.D27_FACS.R
module load python/3.6.4-anaconda
source activate umap
module load R/3.6.1-gccmkl
module load hdf5_18/1.8.17
Rscript ../r.scripts/SingleR.R --p "$1" --s "$2" --o "$3"
#!/bin/bash
#SBATCH --job-name R_FullAnalysis.D17FACS
#SBATCH -p 256GB,256GBv1,384GB
#SBATCH --job-name sc10x.id_orth
#SBATCH -p 128GB,256GB,256GBv1,384GB
#SBATCH -N 1
#SBATCH -t 7-0:0:0
#SBATCH -o job_%j.out
......@@ -8,8 +8,9 @@
#SBATCH --mail-type ALL
#SBATCH --mail-user gervaise.henry@utsouthwestern.edu
module load R/3.4.1-gccmkl
sh ./sc_LinkData.sh D17_FACS
Rscript ../r.scripts/sc-TissueMapper_RUN.D17_FACS.R
module load python/3.6.4-anaconda
source activate umap
module load R/3.6.1-gccmkl
module load hdf5_18/1.8.17
Rscript ../r.scripts/huPr_muPr.R --p "$1" --r "$3"
#!/bin/bash
#SBATCH --job-name R_FullAnalysis.Pd
#SBATCH --job-name sc10x.raw
#SBATCH -p 256GB,256GBv1,384GB
#SBATCH -N 1
#SBATCH -t 7-0:0:0
......@@ -8,9 +8,9 @@
#SBATCH --mail-type ALL
#SBATCH --mail-user gervaise.henry@utsouthwestern.edu
module load R/3.4.1-gccmkl
sh ./sc_LinkData.sh Pd
Rscript ../r.scripts/sc-TissueMapper_RUN.Pd.R
Rscript ../r.scripts/sc-TissueMapper_RUN.Pd.StressCompare.R
module load python/3.6.4-anaconda
source activate umap
module load R/3.5.1-gccmkl
module load hdf5_18/1.8.17
Rscript ../r.scripts/sc-TissueMapper_RUN.R --p "$1" --s "$2"
#!/bin/bash
mkdir ../analysis
mkdir ../analysis/DATA
ln -s /work/urology/ghenry/RNA-Seq/SingleCell/PIPELINE/DATA/"$1"/10x/filtered_gene_bc_matrices_mex/GRCh38/* ../analysis/DATA/
ln -s /work/urology/ghenry/RNA-Seq/SingleCell/PIPELINE/DATA/ANALYSIS/"$1"/sc10x.Pr.ALL.cluster.NOStress.IDepi+st+ne.Merge.Rda ../analysis/DATA/sc10x.data.Rda
mkdir ../analysis/CI.TestData.downsample
mkdir ../analysis/CI.TestData.downsample/10x
mkdir ../analysis/CI.TestData.downsample/10x/filtered_gene_bc_matrices_mex
mkdir ../analysis/CI.TestData.downsample/10x/filtered_gene_bc_matrices_mex/GRCh38
cp /work/urology/ghenry/RNA-Seq/SingleCell/PIPELINE/DATA/"$1"/"$1"-demultiplex.csv ../analysis/CI.TestData.downsample/"$1"-demultiplex.csv
cp /work/urology/ghenry/RNA-Seq/SingleCell/PIPELINE/DATA/"$1"/10x/"$1"-aggr.csv ../analysis/CI.TestData.downsample/10x/"$1"-aggr.csv
\ No newline at end of file
#!/bin/bash
mkdir ../analysis
mkdir ../analysis/DATA
ln -s /work/urology/ghenry/RNA-Seq/SingleCell/PIPELINE/DATA/"$1"/* ../analysis/DATA/
\ No newline at end of file
mkdir ../analysis/DATA/10x
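# Link the cellranger output of every sample listed in the demultiplex file
for i in $(cat ../analysis/DATA/"${1}"-demultiplex.csv); do
if [[ ${i} == *Samples* ]]
then
# Skip the header row
continue
else
# The sample name is the first comma-separated field
sample=$(echo "${i}" | cut -f1 -d ',')
mkdir ../analysis/DATA/10x/"${sample}"
ln -s /work/urology/ghenry/RNA-Seq/SingleCell/PIPELINE/DATA/"${sample}"/outs/* ../analysis/DATA/10x/"${sample}"
fi
done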
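The loop above relies on two properties of the demultiplex file: the header row contains the word `Samples`, and the sample name is the first comma-separated field of every other row. The sketch below shows that parsing step in isolation. The second column name (`Group`), the row values, and the function name `list_samples` are assumptions made for illustration. The real file layout beyond the first field is not shown in this commit.

```shell
#!/bin/bash
# Hypothetical demultiplex file: only the "Samples" header keyword and the
# position of the sample name (first field) are taken from the script above.
demux=$(mktemp)
printf 'Samples,Group\nD17,patient1\nD27,patient2\n' > "$demux"

# Print the sample name of every non-header, non-empty row.
# Reading line by line avoids word-splitting rows that contain spaces,
# and the "|| [ -n ... ]" guard keeps a last line without a trailing newline.
list_samples() {
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            *Samples*|"") continue ;;
        esac
        # Strip everything from the first comma onward
        echo "${line%%,*}"
    done < "$1"
}

list_samples "$demux"
rm -f "$demux"
```

Each printed name would then correspond to one `../analysis/DATA/10x/<sample>` directory created by the linking script.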
#!/bin/bash
#SBATCH --job-name R_FullAnalysis.DS
#SBATCH -p 256GB,256GBv1,384GB
#SBATCH -N 1
#SBATCH -t 7-0:0:0
#SBATCH -o job_%j.out
#SBATCH -e job_%j.out
#SBATCH --mail-type ALL
#SBATCH --mail-user gervaise.henry@utsouthwestern.edu
module load R/3.4.1-gccmkl
Rscript ../r.scripts/sc-TissueMapper_RUN.DS_D17.R
cell, gene, p-value
Basal cells,KRT14,0
Basal cells,SERPINB5,0
Basal cells,BNC1,0
Basal cells,G0S2,0
Basal cells,SNAI2,0
Basal cells,ANXA8L2,0
Basal cells,VSNL1,0
Basal cells,KRT6A,0
Basal cells,ACKR3,0
Basal cells,KRT6C,0
Basal cells,IL1R2,0
Basal cells,RPL5P11,0
Basal cells,IGFBP6,0
Basal cells,ANXA8,0
Basal cells,PCDH7,0
Basal cells,SOX7,0
Basal cells,TP63,0
Basal cells,FST,0
Basal cells,SOX15,0
Basal cells,COL17A1,0
cell, gene, p-value
Club/Goblet cells,MUC16,0
Club/Goblet cells,BPIFB1,0
Club/Goblet cells,SCGB3A1,0
Club/Goblet cells,SAA2,0
Club/Goblet cells,DHRS9,0
Club/Goblet cells,SAA1,0
Club/Goblet cells,LGALS9B,0
Club/Goblet cells,SLC9A3R2,0
Club/Goblet cells,LGALS9,0
Club/Goblet cells,RP5-940J5.9,0
Club/Goblet cells,FUT2,0
Club/Goblet cells,LINC01207,0
Club/Goblet cells,CFB,0
Club/Goblet cells,SPNS2,0
Club/Goblet cells,KYNU,0
Club/Goblet cells,CRABP2,0
Club/Goblet cells,EPS8L1,0
Club/Goblet cells,WFDC2,0
Club/Goblet cells,IGFBP3,0
Club/Goblet cells,S100P,0
AAAS
AARS2
ABCC10
AC025335.1
ACAD9
ACBD3
ACBD4
ACOT8
ACTR1B
ACVR1
ADAT2
ADH5
ADM
AGO1
AIP
AKAP13
ALDH1B1
ALG10
ALG10B
ALKBH4
AMER1
ANAPC1
ANAPC4
ANKH
ANKIB1
AP1G2
APLF
APOPT1
APTX
ARF1
ARGLU1
ARHGAP44
ARHGEF3
ARMC6
ARMCX3
ARPC5L
ASH2L
ASUN
ATAD3A
ATP5G1
ATP5J2
ATP6V0D1
ATPAF1
ATRIP
AURKC
B3GALT6
B3GNT2
B4GALT1
B4GALT2
B4GALT5
B4GALT7
BAMBI
BCL2L11
BCL3
BCL9L
BCS1L
BMF
BORCS6
BRAF
BRAT1
BRD2
BRD3
BRIX1
BRPF3
C10orf2
C12orf45
C12orf75
C15orf40
C1orf198
C22orf46
C9orf142
C9orf64
CAB39L
CAD
CADM4
CAPN12
CAPN5
CARD10
CARHSP1
CARS
CASP10
CASP3
CASP4
CBWD2
CBX4
CCDC117
CCDC137
CCDC14
CCDC25
CCDC43
CCDC50
CCNE1
CD3EAP
CD83
CDAN1
CDC123
CDC14A
CDK1
CDK14
CDKN2AIP
CDPF1
CELSR1
CENPL
CEP104
CEP170B
CEP295
CEP41
CERS6
CES2
CFAP45
CH507-9B2.5
CHCHD4
CHCHD5
CHEK2
CHIC2
CHMP4B
CILP2
CITED4
CLIP4
CMYA5
CNIH3
CNN3
CNTNAP3
COA5
COQ10A
COX17
CPEB4
CPPED1
CPT2
CREBRF
CRYZ
CSRP1
CSTF3