Remove extra column from blast output #34

Merged
merged 1 commit into from
Feb 5, 2025

Conversation

Ge94
Member

@Ge94 Ge94 commented Feb 4, 2025

This is the change needed for seqkit to successfully filter out contaminated contigs. The issue is described in [https://www.ebi.ac.uk/panda/jira/browse/EMG-7172] and affects both human and host decontamination post-assembly. In practice, this means that we are currently not decontaminating at all post-assembly.

Jenny is doing a small benchmark of how many of her assemblies are affected by this.
Out of ~1200 assemblies, 238 are contaminated by host. She is currently checking how many are contaminated by human.

My understanding is that Sonia already checked, and the blast score isn't used anywhere in our pipelines. If that is NOT the case, this is what the draft PR is for; otherwise, we can already implement it in the pipeline.
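
For illustration only (this is not the code changed in this PR), here is a minimal Python sketch of the idea. It assumes the blast output is tab-separated with the contig ID in the first column and the score in a second column. That layout, the function name `contig_ids`, and the sample hits are all hypothetical. The point is that a pattern file for `seqkit grep -f` should contain one bare ID per line, so any extra column has to be dropped before filtering.

```python
def contig_ids(blast_lines):
    """Return unique contig IDs (first tab-separated column), in order of first appearance.

    Assumption: each hit line looks like "<contig_id>\t<score>"; the real
    column layout in the pipeline may differ.
    """
    seen = set()
    ids = []
    for line in blast_lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines
        if not line or line.startswith("#"):
            continue
        # Keep only the first column; everything after the first tab is dropped
        contig = line.split("\t", 1)[0]
        if contig not in seen:
            seen.add(contig)
            ids.append(contig)
    return ids


# Hypothetical hits: contig ID plus an extra score column
hits = ["contig_1\t250.0\n", "contig_7\t98.5\n", "contig_1\t120.0\n"]

# One ID per line, suitable as a pattern file
print("\n".join(contig_ids(hits)))
```

Deduplicating the IDs is optional for filtering, but it keeps the pattern file small when a contig has several hits.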

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • README.md is updated (including new tool citations and authors/contributors).

@mberacochea
Member

Great, thanks @Ge94. Do you have a test assembly file I could use to verify this?

@mberacochea mberacochea marked this pull request as ready for review February 5, 2025 09:59

@ochkalova ochkalova left a comment

Thanks Germana 🙏🏻

@Ge94 Ge94 merged commit 6f34678 into main Feb 5, 2025
3 checks passed
@Ge94 Ge94 deleted the bugfix/filter_decontamination_with_seqkit branch February 5, 2025 10:12
Contributor

@KateSakharova KateSakharova left a comment

Let's see if something will break

@jmattock5
Contributor

Good news, none of my chicken or human gut assemblies had human+phiX contamination 🥳

@jmattock5 jmattock5 removed their request for review February 6, 2025 09:23

5 participants