metagenomic additions #60

Merged
merged 51 commits into from
May 11, 2020


dpark01 (Member) commented May 7, 2020

  • adds a kraken2 classification task and workflow
  • adds kraken2 to the demux_metag workflow
  • adds a kraken2_build task and workflow
  • adds a blastx task that takes contigs as input and emits a Krona plot via ktImportBLAST
  • adds a classify_multi workflow, which is essentially demux_metag without the demux step
  • adds a local "tinytest" for kraken2 classification that is small enough to run in a Travis VM, with a db small enough to fit in the GitHub repo
  • removes the default value minscoretofilter=60 from align_and_count (the spike-in counting task) because it caused out-of-memory failures on large BAM inputs (the custom Python code builds a large in-memory map keyed on read IDs)
  • adds a default input value for samplename in assemble_refbased to simplify the one-BAM-input scenario
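
To illustrate the out-of-memory concern behind the minscoretofilter change, here is a minimal Python sketch. It is not the actual code in align_and_count: the function names, the `(read_id, score)` record shape, and the rule that every alignment of a read must meet the threshold are all assumptions made for illustration. It only shows why a filter that tracks state per read ID needs memory proportional to the number of distinct reads, while plain counting can stream.

```python
def ids_passing_min_score(records, min_score):
    """Return read IDs whose alignments all score at least min_score.

    records: iterable of (read_id, score) tuples, a hypothetical stand-in
    for alignments read from a BAM file.
    """
    # One dictionary entry per distinct read ID. This map is the part that
    # grows with input size and can exhaust memory on very large inputs.
    lowest = {}
    for read_id, score in records:
        prev = lowest.get(read_id)
        lowest[read_id] = score if prev is None else min(prev, score)
    return {rid for rid, score in lowest.items() if score >= min_score}


def count_without_filter(records):
    """Count alignments in one streaming pass, holding no per-read state."""
    return sum(1 for _ in records)
```

With no default filter threshold, a caller that does not need score filtering can take the streaming path and avoid building the map at all.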

dpark01 added 30 commits May 5, 2020 12:57
…krakenuniq build process to hit a more optimal speed, expose optional params for kraken2
… (zstd default), initial trial attempts at a dnanexus-like heartbeat monitor
…ny db"

(the tarball is missing the taxdb.. need to fix that)
This reverts commit 5b8c3ad.
…nce it causes out of memory issues for large bam inputs
dpark01 added 21 commits May 7, 2020 22:28
…update classify_multi workflow to use kraken2 based human read depletion, add dnanexus CI test for classify_multi
dpark01 (Member, Author) commented May 11, 2020

The code looks good and is behaving as it should. Classification performance is currently poor, but we can tune the db and parameters later: this PR does not yet switch demux_plus or other preexisting production workflows to kraken2.

dpark01 merged commit 18386c6 into master May 11, 2020
dpark01 deleted the dp-metagenomics branch May 11, 2020 16:43