NCBITaxa() error #469
Comments
Problem seems to be declaring spname as NOCASE:
You can edit line 802 as follows until a new PR comes along with a fix: Or line 785: I would guess that the second option is better.
Removing
Worked for me too, thanks! Will leave this issue open for developers.
Hi there! I am running into the same problem. Thanks a lot!
It depends on your setup, but the pkg file is:
It's the last file in the error trace. In the first post of this issue, that would be:
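If the path is not obvious from the trace, Python can report where a module lives. This is a minimal sketch: `module_file` is a hypothetical helper name, and the dotted module name is inferred from the path shown in the traceback of the first post.

```python
import importlib.util


def module_file(module_name):
    """Return the path of the file that defines module_name, or None if not found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when a parent package (here: ete3) is not installed.
        return None
    return spec.origin if spec is not None else None


# Module name inferred from .../site-packages/ete3/ncbi_taxonomy/ncbiquery.py
print(module_file("ete3.ncbi_taxonomy.ncbiquery"))
```

The printed path is the file to edit for the manual workaround; `None` means ete3 is not importable from the interpreter you ran this with.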
Thanks for the super quick reply.
Oops, maybe I should have read this thread before opening PR #471.
There was an additional problem with duplicate synonyms that differ only in quoting (not just casing), i.e.:
in the syn.tab file being produced.
For those who are using ete3 in a pipeline and are afraid to change the module file, I have a runtime patch here:

    from ete3 import NCBITaxa
    try:
        import ast
        import inspect
        import sys
        print("Patching NCBITaxa's base methods. For reason, see https://github.com/etetoolkit/ete/issues/469.\n")
        code_to_patch = """db.execute("INSERT INTO synonym (taxid, spname) VALUES (?, ?);", (taxid, spname))"""
        patched_code = """db.execute("INSERT OR REPLACE INTO synonym (taxid, spname) VALUES (?, ?);", (taxid, spname))"""
        ncbiquery = sys.modules[NCBITaxa.__module__]
        lines_code = [x.replace(code_to_patch, patched_code)
                      for x in inspect.getsourcelines(ncbiquery.upload_data)[0]]
        # Insert info message to see if patch is really applied
        lines_code.insert(1, "    print('\\nIf this message shown, then the patch is successful!')\n")
        # Insert external import and constants since only this function is patched and recompiled
        lines_code.insert(1, "    import os, sqlite3, sys\n")
        lines_code.insert(1, "    DB_VERSION = 2\n")
        lines_code = "".join(lines_code)
        # Compile and apply the patch
        ast_tree = ast.parse(lines_code)
        patched_function = compile(ast_tree, "<string>", mode="exec")
        mod_dummy = {}
        exec(patched_function, mod_dummy)
        ncbiquery.upload_data = mod_dummy["upload_data"]
    except Exception:
        print("Patching failed, current taxonomy data downloaded from FTP may be failed to update with ETE3!")
    finally:
        print("Patch finished.")

Importing NCBITaxa and then adding this code should fix the problem when updating the database, by replacing the faulty statement with the corrected one at runtime. This method is quite tricky and dangerous, but since it touches only this one statement I think it is OK to use here as an emergency patch. Tested on my machine and nothing went wrong.
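To see what the patched statement does when two synonyms collide, here is a toy run outside ete3. It is only a sketch: the schema is an assumption modeled on the error message and the NOCASE declaration discussed above (not copied from ncbiquery.py), `load_synonyms` is a hypothetical helper, and the taxids and names are made up.

```python
import sqlite3

# Assumed schema: spname compares case-insensitively, (spname, taxid) is unique.
SCHEMA = """
CREATE TABLE synonym (
    taxid INT,
    spname VARCHAR(50) COLLATE NOCASE,
    PRIMARY KEY (spname, taxid)
)
"""
# The statement used by the runtime patch.
PATCHED_INSERT = "INSERT OR REPLACE INTO synonym (taxid, spname) VALUES (?, ?);"


def load_synonyms(rows):
    """Insert (taxid, spname) rows with the patched statement; return the table."""
    db = sqlite3.connect(":memory:")
    try:
        db.execute(SCHEMA)
        for taxid, spname in rows:
            db.execute(PATCHED_INSERT, (taxid, spname))
        query = "SELECT taxid, spname FROM synonym ORDER BY taxid"
        return db.execute(query).fetchall()
    finally:
        db.close()


rows = [(1, "Example name"), (1, "example name"), (2, "Other name")]
print(load_synonyms(rows))
```

With `OR REPLACE`, the later of two colliding rows overwrites the earlier one instead of raising, so the spelling that survives is whichever came last in the input.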
Thanks everyone for reporting, debugging and providing workarounds. I cannot remove the COLLATE NOCASE from the db definition, because it would affect name searches (currently case-insensitive by default), so I added a few lines of code to the parsing function to make sure we ignore synonym duplicates. Hope it helps! Reopen if any further issues are found.
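The idea of ignoring synonym duplicates before they reach the database can be sketched as below. This is not the actual commit: `unique_synonyms` is a hypothetical helper, and it assumes duplicates are defined the way SQLite's NOCASE collation compares strings, which folds only the ASCII letters A-Z.

```python
import string

# NOCASE folds only ASCII upper case, so mirror that instead of using str.lower().
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def unique_synonyms(pairs):
    """Yield (taxid, spname) pairs, skipping case-insensitive repeats."""
    seen = set()
    for taxid, spname in pairs:
        key = (taxid, spname.translate(_ASCII_FOLD))
        if key in seen:
            continue  # duplicate under NOCASE; keep the first spelling only
        seen.add(key)
        yield taxid, spname


pairs = [(1, "Example name"), (1, "example name"), (2, "example name")]
print(list(unique_synonyms(pairs)))
```

Unlike the `INSERT OR REPLACE` workaround, this keeps the first spelling seen and leaves the table definition untouched, so case-insensitive name searches keep working.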
Apologies for the duplicate post over on the Google Group, but I thought it might be better to post this as an issue. After working reliably for some time, NCBITaxa() threw the following error. A fresh install, a new computer, etc. didn't solve the problem.

    Inserting synonyms: 30000
    Traceback (most recent call last):
      File "", line 1, in <module>
      File "/home/jsbowman/.local/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 110, in __init__
        self.update_taxonomy_database(taxdump_file)
      File "/home/jsbowman/.local/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 129, in update_taxonomy_database
        update_db(self.dbfile)
      File "/home/jsbowman/.local/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 760, in update_db
        upload_data(dbfile)
      File "/home/jsbowman/.local/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 802, in upload_data
        db.execute("INSERT INTO synonym (taxid, spname) VALUES (?, ?);", (taxid, spname))
    sqlite3.IntegrityError: UNIQUE constraint failed: synonym.spname, synonym.taxid
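
The same kind of failure can be reproduced outside ete3 with a toy table. This is a minimal sketch: the schema is an assumption modeled on the error message and on the NOCASE declaration mentioned in the comments (not copied from ncbiquery.py), `reproduce_error` is a hypothetical helper, and the taxid and names are made up.

```python
import sqlite3

# Assumed schema: spname compares case-insensitively, (spname, taxid) is unique.
SCHEMA = """
CREATE TABLE synonym (
    taxid INT,
    spname VARCHAR(50) COLLATE NOCASE,
    PRIMARY KEY (spname, taxid)
)
"""
# Same statement as in the traceback above.
INSERT = "INSERT INTO synonym (taxid, spname) VALUES (?, ?);"


def reproduce_error():
    """Insert two synonyms that differ only in case; return the error text."""
    db = sqlite3.connect(":memory:")
    try:
        db.execute(SCHEMA)
        db.execute(INSERT, (1, "Example name"))
        db.execute(INSERT, (1, "example name"))  # same key under NOCASE
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        db.close()
    return None


print(reproduce_error())
```

Two names for the same taxid that differ only in letter case count as the same key under NOCASE, so the second plain `INSERT` is rejected.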