I am wondering if maybe the k-mers found were also largely shared with other species in g__Haemophilus? That would, I think, explain the result - LCA "pulls" the classification back to the lowest common ancestor that matches the hashes. Since you gave me enough info to reproduce, I'll take a look!
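To make the "pulls back" idea concrete, here is a minimal pure-Python sketch of how a lowest common ancestor is computed over lineages. It is not sourmash's actual implementation: the hash values, the abbreviated lineages, and the second species name are all made up for illustration, and the real `lca classify` logic may weigh hashes differently.

```python
def lowest_common_ancestor(lineages):
    """Return the longest prefix shared by every lineage (a tuple of ranks)."""
    if not lineages:
        return ()
    shared = []
    # zip stops at the shortest lineage, so a genus-level entry caps the result
    for names_at_rank in zip(*lineages):
        if len(set(names_at_rank)) != 1:
            break
        shared.append(names_at_rank[0])
    return tuple(shared)


# Abbreviated, illustrative lineages (intermediate ranks omitted).
GENUS = ("d__Bacteria", "f__Pasteurellaceae", "g__Haemophilus")
H_INFLUENZAE = GENUS + ("s__Haemophilus influenzae",)
H_OTHER = GENUS + ("s__Haemophilus hypothetical",)  # made-up second species

# Hypothetical mapping: hash value -> lineages of the genomes containing it.
hash_to_lineages = {
    101: [H_INFLUENZAE],           # species-specific k-mer
    102: [H_INFLUENZAE, H_OTHER],  # shared across the genus
    103: [H_INFLUENZAE, H_OTHER],  # shared across the genus
}

# First reduce each hash to its own LCA, then take the LCA across all hashes.
per_hash = [lowest_common_ancestor(lins) for lins in hash_to_lineages.values()]
overall = lowest_common_ancestor(per_hash)
print(overall[-1])  # g__Haemophilus
```

In this toy case a single species-specific hash is not enough: as soon as some hashes are shared with another species in the genus, the combined result stops at g__Haemophilus and the species rank is left empty.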
Dear sourmash team,
I have been using your great tool for years now and have stumbled upon some strange behavior.
Issue: Missing species description when using `sourmash lca classify` on a Haemophilus influenzae genome.
Example fasta: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=1355929925&rettype=fasta
(I tried a few other H. influenzae strains and saw the same missing-species issue. However, other species work fine, such as E. coli, S. aureus, etc.)
Database used: https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/gtdb-rs214/gtdb-rs214-k31.lca.json.gz
(same issue on an older DB)
sourmash version: 4.8.14
Results:
Side info:
sourmash results when using `sourmash gather`: