Commit 6eb0e21e authored by Venkat Malladi

Merge branch 'dev' into 'master'

Release 1.1.3

See merge request BICF/Astrocyte/chipseq_analysis!73
parents e6b416e5 e4c84105
Branches master
@@ -27,7 +27,7 @@ bash_tests:
astrocyte:
stage: astrocyte
script:
- - module load astrocyte/0.2.0
+ - module load astrocyte/0.3.1
- module unload nextflow
- cd ..
- astrocyte_cli validate chipseq_analysis
@@ -2,6 +2,13 @@
All notable changes to this project will be documented in this file.
+ ## [publish_1.1.3 ] - 2020-08-16
+ ### Updated
+ - Updated astrocyte to 0.3.1
+ ### Fixed
+ - Fixed missing gene names in annotation
## [publish_1.1.2 ] - 2020-06-22
- Add pipeline tracking
@@ -10,8 +10,8 @@
|[![coverage report](https://git.biohpc.swmed.edu/BICF/Astrocyte/chipseq_analysis/badges/master/coverage.svg)](https://git.biohpc.swmed.edu/BICF/Astrocyte/chipseq_analysis/commits/master)|[![coverage report](https://git.biohpc.swmed.edu/BICF/Astrocyte/chipseq_analysis/badges/dev/coverage.svg)](https://git.biohpc.swmed.edu/BICF/Astrocyte/chipseq_analysis/commits/dev)|
[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A50.31.0-brightgreen)](https://www.nextflow.io/)
- [![Astrocyte](https://img.shields.io/badge/astrocyte-%E2%89%A50.2.0-blue)](https://astrocyte-test.biohpc.swmed.edu/static/docs/index.html)
- [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.2648845.svg)](https://doi.org/10.5281/zenodo.2648845)
+ [![Astrocyte](https://img.shields.io/badge/astrocyte-%E2%89%A50.3.1-blue)](https://astrocyte-test.biohpc.swmed.edu/static/docs/index.html)
+ [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.2648844.svg)](https://doi.org/10.5281/zenodo.2648844)
## Introduction
@@ -52,7 +52,7 @@
* Ewels P., Magnusson M., Lundin S. and Käller M. 2016. MultiQC: Summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32(19): 3047–3048. doi:[10.1093/bioinformatics/btw354](https://dx.doi.org/10.1093/bioinformatics/btw354)
17. **BICF ChIP-seq Analysis Workflow**:
- * Spencer D. Barnes, Holly Ruess, Jeremy A. Mathews, Beibei Chen, and Venkat S. Malladi. 2020. BICF ChIP-seq Analysis Workflow (publish_1.1.2). Zenodo. doi:[10.5281/zenodo.3903277](https://doi.org/10.5281/zenodo.3903277)
+ * Spencer D. Barnes, Holly Ruess, Jeremy A. Mathews, Beibei Chen, and Venkat S. Malladi. 2020. BICF ChIP-seq Analysis Workflow (publish_1.1.3). Zenodo. doi:[10.5281/zenodo.3986942](https://doi.org/10.5281/zenodo.3986942)
18. **Nextflow**:
* Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., and Notredame, C. 2017. Nextflow enables reproducible computational workflows. Nature biotechnology, 35(4), 316.
@@ -43,13 +43,14 @@ names(files) <- design$Condition
peaks <- lapply(files, readPeakFile, as = "GRanges", header = FALSE)
peakAnnoList <- lapply(peaks, annotatePeak, TxDb=txdb, tssRegion=c(-3000, 3000), verbose=FALSE)
- column_names <- c("chr", "start", "end", "width", "strand_1", "name", "score", "strand", "signalValue",
+ column_names <- c("geneId","chr", "start", "end", "width", "strand_1", "name", "score", "strand", "signalValue",
"pValue", "qValue", "peak", "annotation", "geneChr", "geneStart", "geneEnd",
- "geneLength" ,"geneStrand", "geneId", "transcriptId", "distanceToTSS", "symbol")
+ "geneLength" ,"geneStrand", "transcriptId", "distanceToTSS", "symbol")
for(index in c(1:length(peakAnnoList))) {
filename <- paste(names(peaks)[index], ".chipseeker_annotation.tsv", sep="")
df <- as.data.frame(peakAnnoList[[index]])
df$geneId <- sapply(strsplit(as.character(df$geneId), split = "\\."), "[[", 1)
df_final <- merge(df, sym, by.x="geneId", by.y="ensembl", all.x=T)
colnames(df_final) <- column_names
write.table(df_final[ , !(names(df_final) %in% c('strand_1'))], filename, sep="\t" ,quote=F, row.names=F)
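
The R code in this hunk keeps only the part of each `geneId` before the first `.` and then left-merges the peaks against the symbol table, so every peak row is kept whether or not a symbol is found. A minimal Python sketch of the same idea follows, using only the standard library. The function names `strip_version` and `attach_symbols` and the sample IDs are hypothetical and are not part of the pipeline.

```python
def strip_version(gene_id: str) -> str:
    """Return the part of an ID before the first '.', e.g. 'G1.5' -> 'G1'."""
    return gene_id.split(".", 1)[0]


def attach_symbols(rows, symbols):
    """Left join: keep every row; 'symbol' is None when the ID has no match."""
    joined = []
    for row in rows:
        gene_id = strip_version(row["geneId"])
        joined.append({**row, "geneId": gene_id, "symbol": symbols.get(gene_id)})
    return joined


# Hypothetical sample data, for illustration only.
symbols = {"ENSG00000000001": "GENE_A"}
rows = [
    {"geneId": "ENSG00000000001.7", "chr": "chr1"},
    {"geneId": "ENSG00000000002.3", "chr": "chr2"},
]
result = attach_symbols(rows, symbols)
```

In this sketch the first row picks up `GENE_A` only because the `.7` suffix is removed before the lookup; the second row is kept with no symbol, mirroring `all.x=T` in the R merge.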
@@ -25,6 +25,10 @@ def test_annotation_singleend():
annotation_file = test_output_path + 'ENCSR238SGC.chipseeker_annotation.tsv'
assert os.path.exists(annotation_file)
assert utils.count_lines(annotation_file) >= 149284
+ df = pd.read_csv(annotation_file, sep = "\t", header = 0)
+ print(df.head())
+ #assert df['symbol'].notna().all()
+ assert not(df['symbol'].isnull().values.any())
@pytest.mark.pairedend
@@ -42,3 +46,7 @@ def test_annotation_pairedend():
annotation_file = test_output_path + 'ENCSR729LGA.chipseeker_annotation.tsv'
assert os.path.exists(annotation_file)
assert utils.count_lines(annotation_file) >= 25367
+ df = pd.read_csv(annotation_file, sep = "\t", header = 0)
+ print(df.head())
+ #assert df['symbol'].notna().all()
+ assert not(df['symbol'].isnull().values.any())
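
Both tests now assert that no value in the `symbol` column of the annotation TSV is null. The sketch below shows an equivalent check in isolation. It deliberately swaps pandas for Python's standard `csv` module so that it runs without third-party packages. The helper name `missing_symbol_rows`, the `MISSING` set, and the in-memory sample are assumptions made for illustration. The sketch treats an empty field or the string `NA` as missing, since R's `write.table` writes `NA` by default and pandas reads it back as null.

```python
import csv
import io

# Values treated as missing in this sketch (an assumption; pandas recognises more).
MISSING = {"", "NA"}


def missing_symbol_rows(handle, column="symbol"):
    """Return 1-based data-row numbers whose `column` value is missing."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        number
        for number, row in enumerate(reader, start=1)
        if (row.get(column) or "") in MISSING
    ]


# Hypothetical two-row annotation table, for illustration only.
sample = io.StringIO("geneId\tchr\tsymbol\nG1\tchr1\tGENE_A\nG2\tchr2\tNA\n")
bad_rows = missing_symbol_rows(sample)
```

Returning the offending row numbers rather than a single boolean makes a failing assertion easier to diagnose than `isnull().values.any()`, at the cost of a slightly longer helper.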