reduce redundancy in direct RNA long-read only assembly? #71

Open
mjudd8 opened this issue May 2, 2024 · 1 comment

mjudd8 commented May 2, 2024

Hello,

I used rnabloom to construct a transcriptome from direct RNA data with the following:

rnabloom -long all_reads.fastq -stranded -t 25 -outdir dir/ -u true

The rnabloom.transcripts.fa assembly file seems to have a lot of redundant transcripts with very small variations - is there a way to generate a rnabloom.transcripts.nr.fa file with just the long read data?

Thanks!

@kmnip kmnip self-assigned this May 3, 2024
@kmnip kmnip added the question Further information is requested label May 3, 2024
Collaborator

kmnip commented May 3, 2024

Some settings can be changed to reduce the redundancy of the assembly, e.g.

-indel 100 -tip 100 -p 0.6

The defaults for long reads are -indel 50 -tip 50 -p 0.7.
Another source of redundancy came from a bug in minimap2, which did not output some overlaps correctly.
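
For illustration, the suggested settings could be appended to the command from the original question. This is only a sketch: it assumes the three options can simply be added to the existing invocation and that the file and directory names from the question are unchanged. The variable names `extra` and `cmd` are hypothetical helpers, not part of rnabloom.

```shell
# Suggested settings from the reply above (long-read defaults: -indel 50 -tip 50 -p 0.7)
extra="-indel 100 -tip 100 -p 0.6"

# Command from the original question with the suggested settings appended
# (assumption: the options combine freely with the existing ones)
cmd="rnabloom -long all_reads.fastq -stranded -t 25 -outdir dir/ -u true $extra"

# Print the full command so it can be checked before running
echo "$cmd"

# Run it only if rnabloom is installed and the input file is present
if command -v rnabloom >/dev/null 2>&1 && [ -f all_reads.fastq ]; then
  $cmd
fi
```

Comparing the number of sequences in rnabloom.transcripts.fa before and after the change (for example with `grep -c '^>'`) gives a rough sense of whether redundancy went down.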
