Resolve "Chunk bam for parallel tin calculation"

Jonathan Gesell requested to merge 45-parallelizeTin into develop

Please fill in the appropriate checklist below (delete those which are not relevant). These are the most common things requested on pull requests.

PR checklist

  • This comment contains a description of changes (with reason)
  • If you've fixed a bug or added code that should be tested, add tests!
  • Documentation in docs is updated
  • Replace dag.png with the most recent CI pipeline integrated_pe artifact
  • CHANGELOG.md is updated
  • README.md is updated
  • LICENSE.md is updated with new contributors

Chunked the BAM file for faster TIN calculation. Chunking takes place in the dedup phase, and all chunks are concatenated at the end of inferMetadata. Note: this removes all non-canonical contigs from the calculations. However, after several tests, it appears this was already happening, because tin.py does not calculate anything not present in the BED file.
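
For reviewers, here is a rough sketch of the chunk-then-concatenate idea described above. It is not the code in this merge request. The contig list, the output naming scheme, and the helper names `chunk_commands` and `concat_tables` are all hypothetical, and the sketch assumes the per-chunk results are tab-delimited tables sharing one header line. It only builds `samtools view` region commands (which require a coordinate-sorted, indexed BAM) and does not run them.

```python
from typing import Iterable, List

# Hypothetical canonical contig names; the real list depends on the reference genome.
CANONICAL_CONTIGS = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def chunk_commands(bam: str, contigs: Iterable[str] = CANONICAL_CONTIGS) -> List[List[str]]:
    """Build one `samtools view` command per contig.

    Each command writes the reads from a single contig to its own BAM file.
    The output naming scheme is an assumption made for illustration.
    """
    prefix = bam[:-4] if bam.endswith(".bam") else bam
    return [
        ["samtools", "view", "-b", "-o", f"{prefix}.{contig}.bam", bam, contig]
        for contig in contigs
    ]


def concat_tables(tables: Iterable[str]) -> str:
    """Concatenate tab-delimited per-chunk tables, keeping only the first header."""
    header = None
    rows: List[str] = []
    for text in tables:
        lines = text.strip().splitlines()
        if not lines:
            continue  # skip chunks that produced no output
        if header is None:
            header = lines[0]
        rows.extend(lines[1:])
    if header is None:
        return ""
    return "\n".join([header] + rows) + "\n"
```

Each command list could be passed to `subprocess.run` or emitted by a parallel pipeline process, one per contig. Contigs outside the list are never extracted, which matches the note above about non-canonical contigs being dropped.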

/cc @ghenry @venkat.malladi

Edited by Gervaise Henry
