serialise fails to load blast dbase .. can't find entries ... dictionary value error issue #392
Comments
Hi! It may be helpful for you to explain how you ran BLAST. |
I split the mikado prepared.fasta into chunks for running on a job array with slurm, with each cmd being as such:
blastx -max_target_seqs 5 -num_threads 10 -query mikado_prepared."${SLURM_ARRAY_TASK_ID}".fasta -outfmt 5 -db ../xtrop_xlaevis_nparkeri_lcatesbeianus_protein.faa -evalue 0.000001 2> blast.${SLURM_ARRAY_TASK_ID}.log | sed '/^$/d' | gzip -c - > mikado.${SLURM_ARRAY_TASK_ID}.blast.xml.gz
"${SLURM_ARRAY_TASK_ID}" is just the number of the subfile, e.g. mikado_prepared.1.fasta, mikado_prepared.2.fasta, etc.
I then zcat all the array outputs into one blast.xml file, then gzip that file
fwiw, this is exactly how i've run blastx previously for use with mikado
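One thing that may be worth checking with the merge step above, offered as a minimal sketch rather than a diagnosis: each blastx run with -outfmt 5 writes a complete XML document with its own root element, so plain concatenation of the decompressed chunks gives a file with several roots. Python's standard xml.etree.ElementTree rejects such input. The chunk contents below are invented placeholders, not real BLAST output.

```python
import xml.etree.ElementTree as ET

# Two tiny stand-ins for per-chunk BLAST XML outputs (hypothetical content).
chunk_1 = "<BlastOutput><Iteration>1</Iteration></BlastOutput>\n"
chunk_2 = "<BlastOutput><Iteration>2</Iteration></BlastOutput>\n"


def parse_error(text):
    """Return the ParseError message for text, or None if it parses."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None


# Each chunk on its own is a well-formed document.
single = parse_error(chunk_1)
# Naive concatenation yields two root elements in one file.
merged = parse_error(chunk_1 + chunk_2)
print(single, "|", merged)
```

Whether a given tool tolerates multi-document XML depends on its parser, so this only shows what a strict XML parser does with concatenated input.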
|
How about makeblastdb? It might need to have -parse_seqids. BTW, you can do cat *.gz >> output.gz instead of zcat and gzip. |
Pretty sure I used -parse_seqids, but I only have the stdout log for makeblastdb, not the command itself. I grabbed the makeblastdb cmd from the Mikado readthedocs page, which includes that switch.
|
Dear @adamfreedman

Thank you for reporting this, and thank you to @wyim-pgl for helping out!

I fear @ljyanesm and I might have introduced a bug in the parsing of the reference sequence in the latest release; I know we touched the relevant regular expression. Would you please be able to send us a minimal example here (e.g. some ten sequences from the BLAST database and ten from the

As another note, @ljyanesm and I have recently moved Mikado away from using XML files as the default for BLAST, please see the documentation here: https://mikado.readthedocs.io/en/stable/Usage/Serialise/?highlight=tabular#blast-files

I am in the process of revising the documentation and I will make sure to update the tutorial if it is out of sync with this change. It might very well be that the bug you encountered will affect the tabular format as well. Regardless, we would appreciate it if you could send us a test file so that we can diagnose and solve the issue as soon as possible.

Kind regards, |
here are fasta files of queries and targets for which the former hit the latter with blastx |
Dear @adamfreedman

@ljyanesm and I identified the cause; it was indeed linked to the regular expression. Briefly, Mikado was malfunctioning when using the

We have fixed the code and I am currently implementing the tests. We will be releasing a new version (2.2.3) later today UK time, I hope.

Kind regards, |
…pplying at all stages (query, target, XML loading, tabular loading)
* Fix tests on osx
* Changing the GHA to use the cache for PIP and Conda
* Disabling the full daijin_assemble run on the OSX tests as Portcullis is not (yet) available for it on Conda.
* Properly fix #392, with attending tests
* Fixed the log crash detected on OSX by @ljyanesm

Co-authored-by: ljyanesm <yanes.luis@gmail.com>
Dear @adamfreedman We have fixed this in 69e45a4. I am about to release to PyPI and Conda. Kind regards, |
The fix may have added or exposed other bugs.
serialise threw an exception that suggested something wrong with the config file, so i re-ran configure. I assumed that the files created with the previous version of prepare would not create an issue. Upon running serialise again, I got:
Mikado crashed, cause:
junk after document element: line 53663, column 0
Traceback (most recent call last):
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Mikado/__main__.py", line 68, in main
args.func(args)
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Mikado/subprograms/serialise.py", line 384, in serialise
load_blast(mikado_configuration, logger)
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Mikado/subprograms/serialise.py", line 159, in load_blast
part_launcher(filenames)
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Mikado/subprograms/serialise.py", line 87, in xml_launcher
xml_serializer()
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Mikado/serializers/blast_serializer/blast_serialiser.py", line 377, in __call__
self.serialize()
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Mikado/serializers/blast_serializer/blast_serialiser.py", line 349, in serialize
self.__serialise_xmls()
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Mikado/serializers/blast_serializer/blast_serialiser.py", line 358, in __serialise_xmls
_serialise_xmls(self)
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Mikado/serializers/blast_serializer/xml_serialiser.py", line 111, in _serialise_xmls
for record in opened:
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Mikado/parsers/blast_utils.py", line 103, in __next__
return next(iter(self.parser))
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Bio/SearchIO/__init__.py", line 306, in parse
yield from generator
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Bio/SearchIO/BlastIO/blast_xml.py", line 240, in __iter__
yield from self._parse_qresult()
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/site-packages/Bio/SearchIO/BlastIO/blast_xml.py", line 289, in _parse_qresult
for event, qresult_elem in self.xml_iter:
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/xml/etree/ElementTree.py", line 1222, in iterator
yield from pullparser.read_events()
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/xml/etree/ElementTree.py", line 1297, in read_events
raise event
File "/n/home_rc/afreedman/.conda/envs/mikado2.2.3/lib/python3.7/xml/etree/ElementTree.py", line 1269, in feed
self._parser.feed(data)
File "<string>", line None
xml.etree.ElementTree.ParseError: junk after document element: line 53663, column 0
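To find which input file triggers a ParseError like the one above, each XML file can be streamed through the standard library parser on its own. This is only a sketch: check_xml is a hypothetical helper name, it is not part of Mikado, and it tests XML well-formedness only, not whether the content is valid BLAST output.

```python
import gzip
import xml.etree.ElementTree as ET


def check_xml(path):
    """Return None if path parses as one XML document, else the error text."""
    # Assumption: compressed files are recognised by a ".gz" suffix.
    opener = gzip.open if str(path).endswith(".gz") else open
    try:
        with opener(path, "rb") as handle:
            # iterparse streams the file, so large outputs are not loaded whole.
            for _event, elem in ET.iterparse(handle):
                elem.clear()  # release elements that have already been seen
    except ET.ParseError as err:
        return str(err)
    return None
```

Calling check_xml on every per-chunk file and on the merged file should show whether the problem is already present in one chunk or only appears after merging.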
|
and for what it's worth, this was done running on an updated install from conda.
|
Dear @adamfreedman,

Thank you for the update. May I suggest inspecting the XML files passed to serialise though? I strongly suspect that one or more might be truncated.

I am asking this because the traceback indicates that the error was triggered in the BioPython code for parsing XML files, which itself was triggered by what seems an unexpected truncation of the document at line 53663.

Admittedly the Mikado code could handle this better and better inform the user of what has happened, and in which file. This is something we can try to improve on.

In case you indeed need to regenerate the BLAST files, I would like again to point out that the new Mikado versions can load data faster by using the tabular format rather than XML, with custom fields.

Many thanks for your patience and feedback. |
yeah ... it looks like i did something wrong with the job array output concatenation. no record of what i did, but seems i was holding something wrong.
blast has already been done but in the future i'll just output in tabular format, per your suggestion.
thanks,
Adam
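For the tabular route, a sketch of merging per-chunk outputs from a job array, assuming each chunk is a gzipped, tab-separated BLAST file without header lines. The function name and file layout are hypothetical, and the exact -outfmt fields Mikado expects are described in its documentation and not reproduced here. Because tabular output is line-oriented, chunks can simply be appended, unlike whole XML documents. The column count check is only a sanity test that may catch a truncated line.

```python
import gzip


def merge_tabular(chunk_paths, merged_path):
    """Append gzipped tab-separated chunks into one gzipped file.

    Returns the number of data rows written. Raises ValueError when a row
    has a different column count from the first row seen.
    """
    expected = None
    rows = 0
    with gzip.open(merged_path, "wt") as out:
        for path in chunk_paths:
            with gzip.open(path, "rt") as handle:
                for line_no, line in enumerate(handle, start=1):
                    if not line.strip() or line.startswith("#"):
                        continue  # skip blank lines and comment lines
                    n_cols = line.rstrip("\n").count("\t") + 1
                    if expected is None:
                        expected = n_cols
                    elif n_cols != expected:
                        raise ValueError(
                            f"{path}:{line_no}: {n_cols} columns, "
                            f"expected {expected}"
                        )
                    out.write(line if line.endswith("\n") else line + "\n")
                    rows += 1
    return rows
```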
|
Dear @adamfreedman Thank you again for the update. I hope that this time Mikado will run more smoothly. Please let us know if you encounter any other issue. Many thanks, |
running the latest mikado using similar cmds to what i used in 2020 with success ...
the cmd:
mikado serialise --json-conf configuration.yaml --xml blastx/mikado.blastx.xml.cocnat_2021.03.23.xml.gz --orfs transdecoder/mikado_prepared.fasta.transdecoder.bed --blast_targets xtrop_xlaevis_nparkeri_lcatesbeianus_protein.faa
stderr:
Mikado crashed, cause:
ref|XP_018411542.1| not found (Accession: {'_id': None, '_id_alt': [], '_query_id': None, '_description': 'PREDICTED: ras association domain-containing protein 7 [Nanorana parkeri]', '_description_alt': [], '_query_description': '', 'attributes': {}, 'dbxrefs': [], '_items': [HSP(hit_id='ref|XP_018411542.1|', query_id='scallop_TU12746', 1 fragments)], 'blast_id': 'ref|XP_018411542.1|', 'accession': 'XP_018411542', 'seq_len': 431})
Traceback (most recent call last):
File "/n/home_rc/afreedman/.conda/envs/mikado2021/lib/python3.7/site-packages/Mikado/__main__.py", line 68, in main
args.func(args)
File "/n/home_rc/afreedman/.conda/envs/mikado2021/lib/python3.7/site-packages/Mikado/subprograms/serialise.py", line 378, in serialise
load_blast(args, logger)
File "/n/home_rc/afreedman/.conda/envs/mikado2021/lib/python3.7/site-packages/Mikado/subprograms/serialise.py", line 125, in load_blast
part_launcher(filenames)
File "/n/home_rc/afreedman/.conda/envs/mikado2021/lib/python3.7/site-packages/Mikado/subprograms/serialise.py", line 53, in xml_launcher
xml_serializer()
File "/n/home_rc/afreedman/.conda/envs/mikado2021/lib/python3.7/site-packages/Mikado/serializers/blast_serializer/blast_serialiser.py", line 360, in __call__
self.serialize()
File "/n/home_rc/afreedman/.conda/envs/mikado2021/lib/python3.7/site-packages/Mikado/serializers/blast_serializer/blast_serialiser.py", line 342, in serialize
self.__serialise_xmls()
File "/n/home_rc/afreedman/.conda/envs/mikado2021/lib/python3.7/site-packages/Mikado/serializers/blast_serializer/blast_serialiser.py", line 351, in __serialise_xmls
_serialise_xmls(self)
File "/n/home_rc/afreedman/.conda/envs/mikado2021/lib/python3.7/site-packages/Mikado/serializers/blast_serializer/xml_serialiser.py", line 124, in _serialise_xmls
max_target_seqs=self._max_target_seqs, logger=self.logger, off_by_one=off_by_one)
File "/n/home_rc/afreedman/.conda/envs/mikado2021/lib/python3.7/site-packages/Mikado/serializers/blast_serializer/xml_serialiser.py", line 224, in objectify_record
current_target, cache["target"] = _get_target_for_blast(alignment, cache["target"])
File "/n/home_rc/afreedman/.conda/envs/mikado2021/lib/python3.7/site-packages/Mikado/serializers/blast_serializer/xml_utils.py", line 89, in _get_target_for_blast
raise ValueError("{} not found (Accession: {})".format(alignment.id, alignment.__dict__))
ValueError: ref|XP_018411542.1| not found (Accession: {'_id': None, '_id_alt': [], '_query_id': None, '_description': 'PREDICTED: ras association domain-containing protein 7 [Nanorana parkeri]', '_description_alt': [], '_query_description': '', 'attributes': {}, 'dbxrefs': [], '_items': [HSP(hit_id='ref|XP_018411542.1|', query_id='scallop_TU12746', 1 fragments)], 'blast_id': 'ref|XP_018411542.1|', 'accession': 'XP_018411542', 'seq_len': 431})
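A rough way to check by hand whether a hit ID such as ref|XP_018411542.1| can be matched to the target FASTA, sketched under assumptions: the FASTA headers are assumed to start with a bare accession (e.g. >XP_018411542.1 ...), and the helper names are hypothetical. This is not Mikado's lookup code; it only illustrates how a "db|accession|" style ID and a plain FASTA ID can fail to match as exact strings.

```python
import re


def fasta_ids(lines):
    """Collect the first whitespace-delimited token of each FASTA header."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def candidate_ids(hit_id):
    """List forms of a BLAST hit ID that might appear in a FASTA header."""
    forms = [hit_id]
    for field in hit_id.split("|"):
        if field:
            forms.append(field)
            # Also try the field without a trailing version such as ".1".
            forms.append(re.sub(r"\.\d+$", "", field))
    return list(dict.fromkeys(forms))  # drop duplicates, keep order


def find_target(hit_id, known_ids):
    """Return the first candidate form found in known_ids, else None."""
    for form in candidate_ids(hit_id):
        if form in known_ids:
            return form
    return None
```

With a header like the assumed one, find_target("ref|XP_018411542.1|", ids) returns the bare accession while an exact match on the full hit ID would find nothing.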