End of contig extended by single read past telomere sequence #390

Open
kcamnairb opened this issue May 27, 2021 · 1 comment
@kcamnairb

Hi, I used Flye to assemble a fungal genome from PacBio Sequel subreads (over 100X coverage) into chromosome-level contigs. Over half of the contigs contain the expected telomere sequence near their ends, but several contigs extend beyond the telomere, and that extension is supported by only one read. I copied two examples below. The track above the reads shows the telomere sequence match. The sequence after the telomere looks like some kind of artifact. Is there a setting to require a minimum coverage, or do you have any ideas about what might be causing this?

Thanks,
Brian

image

image

@mikolmogorov
Owner

Hi Brian,

Definitely looks like a Flye artifact. Currently, the algorithm requires support from multiple reads within a contig, but contig ends may be extended using the single longest read, which seems to be what happened in this case.
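To find which contig ends are affected, one option is to measure how much sequence follows the last telomere repeat run. The sketch below is only an illustration: the repeat unit `TTAGGG`, the `min_copies` threshold, and the function names are assumptions, not part of Flye, and the repeat should be replaced with the one expected for the organism.

```python
import re

# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def telomere_overhang(contig, repeat="TTAGGG", min_copies=3):
    """Count bases that follow the last telomere repeat run in `contig`.

    `repeat` is an assumed repeat unit; pass the one expected for your genome.
    A run must contain at least `min_copies` tandem copies to count.
    Returns None if no such run is found, 0 if the contig ends in the run.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(repeat), min_copies))
    last_end = None
    for match in pattern.finditer(contig.upper()):
        last_end = match.end()
    if last_end is None:
        return None
    return len(contig) - last_end
```

This checks the right-hand end only. For the left-hand end, the same function can be applied to `reverse_complement(contig)`. A large overhang on a contig with a clear telomere run is a candidate for the single-read extension described above, and could be trimmed after inspection.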

I am currently working on polishing improvements that should potentially help fix this. In the meantime, there are a few things to try as an ad hoc solution. First, you can manually remove the offending reads from the read set and reassemble (you can restart from the contigger stage). Alternatively, if you extract FASTA sequences from assembly_graph.gfa, they will contain the edges before they are expanded using the longest reads, so they should not have this artifact (but may be a bit shorter).
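The second workaround, pulling edge sequences out of the graph file, can be sketched as follows. This assumes the GFA stores each edge sequence inline on a segment (`S`) line, as the GFA format allows; the function name and the output path are hypothetical.

```python
def gfa_segments_to_fasta(gfa_path, fasta_path, line_width=60):
    """Write the sequence of every GFA segment (S line) to a FASTA file.

    Segments whose sequence field is '*' (sequence not stored) are skipped.
    Returns the number of records written.
    """
    written = 0
    with open(gfa_path) as gfa, open(fasta_path, "w") as fasta:
        for line in gfa:
            # Segment lines look like: S <name> <sequence> [optional tags]
            if not line.startswith("S\t"):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3 or fields[2] == "*":
                continue
            name, seq = fields[1], fields[2]
            fasta.write(">%s\n" % name)
            # Wrap the sequence so that lines stay at most line_width long.
            for start in range(0, len(seq), line_width):
                fasta.write(seq[start:start + line_width] + "\n")
            written += 1
    return written
```

For example, `gfa_segments_to_fasta("assembly_graph.gfa", "graph_edges.fasta")` would write one FASTA record per graph edge. A chromosome may be split across several edges in the graph, so the result is not a drop-in replacement for the final contigs.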

Hope this helps,
Mikhail
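For the first workaround above, removing the offending reads before reassembling, a minimal sketch is shown here. It assumes the reads are in an uncompressed FASTA file and that the read ID is the first whitespace-delimited token of each header; the function name and file paths are hypothetical.

```python
def drop_reads(fasta_in, fasta_out, bad_ids):
    """Copy a FASTA file, leaving out records whose ID is in `bad_ids`.

    The ID is taken as the first whitespace-delimited token after '>'.
    Returns a (kept, dropped) tuple of record counts.
    """
    bad = set(bad_ids)
    kept = dropped = 0
    keep = False  # whether the record currently being read is copied
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        for line in src:
            if line.startswith(">"):
                tokens = line[1:].split()
                read_id = tokens[0] if tokens else ""
                keep = read_id not in bad
                if keep:
                    kept += 1
                else:
                    dropped += 1
            if keep:
                dst.write(line)
    return kept, dropped
```

The IDs to drop would come from inspecting the alignments at the affected contig ends, for instance the single read that spans past the telomere in each of the two examples.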
