Recommended parameters for metagenome assembly and a related question #30
Comments
Dear Xiaowen, thanks for your interest! For mdbg on metagenomes (or in fact isolates too), there are several possible execution modes:
For 1., our paper experiments were made with [...]. We don't have a way to adjust parameters based on the number of species or the genome size. I suggest you just run with one of the two ways above (1. or 2.) and see if the results look reasonable. For mdbg in metagenomics, a reasonable result is that the per-species coverage is high but contiguity is lower than with hifiasm-meta. In any case, please make sure to use the https://github.com/ekimb/rust-mdbg/blob/master/utils/magic_simplify_meta script and not the usual [...].
Please let us know if you have any issues. Best,
Hi Rayan, thank you so much for the suggestions and for mentioning [...]. I will leave the issue open for now in case I need more advice from you. I will come back and close it by next week if I don't run into anything. Thank you! Best,
Thanks a lot for the help, the assembly runs were smooth. I have one additional question, though it is not related to the issue's title: have you tried BUSCO (eukaryotes) or CheckM (microbial) for evaluation? Could you offer some advice if so? I tried CheckM1 and it seems to be confused by insertions.
Hi Xiaowen, great to hear. We haven't run extensive evaluations using CheckM on our rust-mdbg metagenomes. Based on feedback from a collaborator, though, it makes sense that rough, unpolished metagenome assemblies, such as the ones produced by rust-mdbg, would have a poor CheckM score: indels provoke frameshifts, which hurt the sensitivity of the gene detection method and thus lower the gene completeness score. The gene is in fact likely there in the assembly, but it goes undetected because those assembly assessment methods need high base quality. One possible workaround would be to run a polishing tool such as racon on the assembly, but this is just a hypothesis. Rayan
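To illustrate the frameshift argument above, here is a minimal Python sketch. It is a toy example only: the 18-base "gene", the `translate` helper, and the `CODON_TABLE` name are hypothetical and have nothing to do with the internals of rust-mdbg, CheckM, or racon. It compares a single-base substitution, which changes one amino acid, with a single-base insertion, which changes every downstream codon and loses the stop codon.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES),
        AMINO_ACIDS,
    )
}


def translate(dna: str) -> str:
    """Translate from position 0 in frame, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy gene: ATG GCT GAA ACC GGT TAA
gene = "ATGGCTGAAACCGGTTAA"

# Substitution at one position (GAA -> GAT): only one residue changes.
substituted = gene[:8] + "T" + gene[9:]

# Insertion of one base after the second codon: the frame shifts.
inserted = gene[:6] + "A" + gene[6:]

print("original    :", translate(gene))
print("substitution:", translate(substituted))
print("insertion   :", translate(inserted))
```

In this toy case the substitution leaves four of the five residues intact, while the insertion rewrites everything after the second codon. A gene caller working on such a sequence could plausibly miss or truncate the gene even though its bases are present in the assembly, which is the effect the hypothesis above describes.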
Awesome, thank you for the suggestions.
Hi,
I want to try mdBG on real metagenome samples. I wonder if you could suggest a parameter combo to use (or combos to try out). And should I do the multi-k mode?
For the real samples, I could crudely guess the number of species in the library, and perhaps an exaggerated total genome size from it as well. I'm not sure if these could be useful.
Another question is: could mdBG output contig coverage estimates?
Thank you!