mikado 20190606_6c8d542 serialise ValueError: Invalid frame specified #181
Comments
Where are you pushing it to?
On Mon, Jun 10, 2019 at 10:49 AM Luca Venturini wrote:
Hi @gemygk,
update: the bug is triggered because Prodigal found a *GTG* start, and
Mikado normally only considers *ATG* as a valid start (although this can
be controlled). It surfaced because I had fixed the start finding on
the *positive* strand but not on the negative one.
I am pushing the fix now.
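To illustrate the kind of strand-aware check described above, here is a minimal sketch in plain Python. It is not Mikado's code: the function names, the 1-based inclusive coordinates, and the `valid_starts` parameter are assumptions made only for this example. The point is that on the negative strand the start codon sits at the high-coordinate end of the ORF and must be reverse-complemented before it is compared.

```python
from typing import Iterable

# Complement table for unambiguous DNA bases (assumption: no IUPAC codes).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def first_codon(seq: str, start: int, end: int, strand: str) -> str:
    """Return the first codon of an ORF.

    Coordinates are assumed to be 1-based and inclusive. On the "-" strand
    the codon is taken from the last three bases and reverse-complemented.
    """
    if strand == "+":
        return seq[start - 1:start + 2].upper()
    if strand == "-":
        return reverse_complement(seq[end - 3:end]).upper()
    raise ValueError(f"Unknown strand: {strand!r}")


def has_valid_start(seq: str, start: int, end: int, strand: str,
                    valid_starts: Iterable[str] = ("ATG",)) -> bool:
    """Check the ORF's first codon against the accepted start codons."""
    return first_codon(seq, start, end, strand) in set(valid_starts)


if __name__ == "__main__":
    # Hypothetical ORF on the "-" strand whose sense sequence is GTGAAATAA.
    minus_gtg = "TTATTTCAC"
    print(has_valid_start(minus_gtg, 1, 9, "-"))
    print(has_valid_start(minus_gtg, 1, 9, "-", valid_starts=("ATG", "GTG")))
```

With only `ATG` accepted, the hypothetical negative-strand ORF above is reported as lacking a valid start; adding `GTG` to `valid_starts` accepts it.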
Why is "the FASTA lines all have different lengths" an issue? Which tool
dies because of that?
On Mon, Jun 10, 2019 at 12:10 PM Luca Venturini wrote:
Hi @cschu,
yes, that was the correct one.
The image is here:
/ei/software/testing/mikado/20190610_94160dd/x86_64/
The test finished successfully, see the log and database:
/ei/workarea/group-ga/Projects/CB-GENANNO-444_Myzus_persicae_clone_O_v2_annotation/Analysis/mikado-20190606_6c8d542/trans_run1/mikado_long_reads
The run crashed, but that's because the Blast database is not formatted
properly (the FASTA lines all have different lengths). The ORF loading was
successful. See e.g. the ORF 38_4, which was crashing Mikado earlier:
68|38|1|1821|38_1|-|13|657|50.1|0|1|645|0
69|38|1|1821|38_2|-|739|906|8.1|1|1|168|0
70|38|1|1821|38_3|-|910|1101|3.9|1|1|192|0
71|38|1|1821|38_4|-|1343|1821|27.7|0|1|479|0
Now it has been correctly transitioned from 1343-1795 (with a GTG start)
to 1343-1821 (ATG start).
I will close down the issue as the problem seems patched.
Samtools faidx, through Pysam. I am using it to index all FASTA files and/or read the FAI index, as that is much, much faster than any of the alternatives (BioPython, PyFaidx), even though BioPython would nominally be more robust. If this is an issue, I can fall back on BioPython and emit a warning if such an error is encountered. I would do this only for serialising the Blastx index, though; in other cases, when I index a file in Mikado I generally need to be able to access the sequence data quickly (e.g. for padding, which is apparently already slow as it is).
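For context, `samtools faidx` expects every sequence line within a record, except the last one, to have the same length. The sketch below is a pure-Python pre-check that lists the records that would violate this; it is only an illustration, the function name is hypothetical, and it is not part of Mikado, Pysam or Samtools.

```python
import io
from typing import Iterable, List, Optional


def uneven_fasta_records(lines: Iterable[str]) -> List[str]:
    """Return the IDs of records whose sequence lines are unevenly wrapped.

    A record passes when all of its sequence lines except the last share
    one length and the last line is not longer than the others.
    """
    bad: List[str] = []
    name: Optional[str] = None
    widths: List[int] = []

    def flush() -> None:
        # Evaluate the record collected so far (if any).
        if name is None or len(widths) < 2:
            return
        body, last = widths[:-1], widths[-1]
        if len(set(body)) > 1 or last > body[0]:
            bad.append(name)

    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            flush()
            parts = line[1:].split()
            name = parts[0] if parts else ""
            widths = []
        elif name is not None:
            widths.append(len(line))
    flush()
    return bad


if __name__ == "__main__":
    # Hypothetical two-record FASTA: "ok" is wrapped evenly, "bad" is not.
    example = io.StringIO(">ok\nACGT\nACGT\nAC\n>bad\nACGT\nAC\nACGT\n")
    print(uneven_fasta_records(example))
```

A check like this could run before indexing, so that a badly wrapped Blast database is reported with the offending record names rather than as a crash.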
* This should address #173 (both configuration file and docs) and #158
* Fix #181 and small bug fix for parsing Mikado annotations.
* Progress for #142 - this should fix the wrong ORF calculation for cases when the CDS was open at the 5' end.
* Fixed previous commit (always for #142)
* #142: corrected and tested the issue with one-off exons, for padding.
* This should fix and test #142 for good.
* Removed spurious warning/error messages
* #142: solved a bug which caused truncated transcripts at the 5' end not to be padded.
* #142: solved a problem which caused a false abort for transcripts on the - strand with changed stop codon.
* #142: fixing previous commit
* Pushing the fix for #182 onto the development branch
* Fix #183
* Fix #183 and previous commit
* #183: now Mikado configure will set a seed when generating the configuration file. The seed will be explicitly mentioned in the log.
* #177: made ORF loading slightly faster with pysam. Also made XML serialisation much faster using SQL sessions and multiprocessing.Pool instead of queues.
* Solved annoying bug that caused Mikado to crash with TAIR GFF3s.
* Solved a small bug in the Gene class
* This commit should fix some of the performance issues found in Mikado compare when testing in the all vs all (issue #166).
* Updated the CHANGELOG.
* Slight improvements to the generic GFLine class and to the to_gff wrapper
* Solved some assorted bugs, from stop_codon parsing in GTF2 (for Augustus) to avoiding a very costly pragma check on MIDX databases.
* Now Mikado util stats will only return one value for the mode, making the table parsable
* Solved some small bugs introduced by changing the mode for mikado util stats
* Dropping automated support for Python3.5. The conda environment cannot be created successfully, too many packages have not been updated in the original repositories.
* Updating the conda environment to reflect that only Python>=3.6 is now accepted
* Various fixes for managing correctly BED12 files.
* Fix for the previous commit breaking TRAVIS
* Switched to PySam for loading and fetching from genome files. Also, improved massively the speed of tests.
* Fixed previous commit
* Fixed travis bug
* Refactoring of check_index for Mikado compare (#166) and fix for #172
* Now Mikado will merge touching (NOT overlapping) exons coming from BED12 files. This should fix an issue with halLiftover
* This commit should fix a bunch of tests for when Mikado is installed with SUDO privileges (#137) potentially also fixing #172.
* Corrected a bug in the printing of transcriptomic BED12 files, corrected a bug in the serialisation of ORFs
* Fixed previous breakage
* Moved the code for checking the index into gene_dict. Also, now GeneDict allows access to positions as well.
* Minor edit to assigner
* Fixing previously broken commit
* Solving a bug which rendered the exclude_utr/protein_coding flags of mikado compare useless.
* Adding the GZI index to the tests directory to avoid permission errors. Addressing #175
* Corrected some testing. Moreover, now Mikado supports the BED12+1 format (ie gffread --bed output)
* Adding a maximum intron length for the default scoring configuration files.
* BROKEN. Proceeding on #142. Now the padding algorithm is aware of where a transcript finishes (intron vs exon). Moreover, we need to change the data structure for padding to a *directional* graph and keep in mind the distance needed to pad a transcript, to solve ambiguous cases in a deterministic (rather than random) way.
* Issue #174: modification to the abstractlocus.py file, to try to solve the issue found by @cschuh.
* #174: this should provide a solution to the issue, which is however only temporary. To be tested.
* #174: making the implicit "for" cycle explicit. Hopefully this should help pinpoint the error better.
* #174: peppered the failing block with try-except statements.
* #174: this should solve it. Now missing external scores in the database will cause Mikado to explicitly fail.
* Fixed #176
* BROKEN. Progress on #142, the code runs, but the tests are broken. **This might be legitimate as we changed the behaviour of the code.**
* Closing #155.
* #174: Now Mikado pick will die informatively if the SQLite3 database has not been found.
* #166: fixed some issues with self-compare
* BROKEN. We have to verify that the padding functions also on the 5' end, but we need to make a new test for that. The test development is in progress.
* The padding now should be tested and correct.
* Fixed previous commit. This should fix #142.
* Development (#178)
* Update Singularity.centos.def Changed python to python3 during %post, otherwise it will use the system python2.7...
* Fixed small bug in external metrics handling
* Update Singularity.centos.def
* Development (#184)
* EI-CoreBioinformatics#142: fixing previous commit * Pushing the fix for EI-CoreBioinformatics#182 onto the development branch * Fix EI-CoreBioinformatics#183 * Fix EI-CoreBioinformatics#183 and previous commit * EI-CoreBioinformatics#183: now Mikado configure will set a seed when generating the configuration file. The seed will be explicitly mentioned in the log. * EI-CoreBioinformatics#177: made ORF loading slightly faster with pysam. Also made XML serialisation much faster using SQL sessions and multiprocessing.Pool instead of queues. * Solved annoying bug that caused Mikado to crash with TAIR GFF3s. * Development (EI-CoreBioinformatics#184) * This should address EI-CoreBioinformatics#173 (both configuration file and docs) and EI-CoreBioinformatics#158 * Fix EI-CoreBioinformatics#181 and small bug fix for parsing Mikado annotations. * Progress for EI-CoreBioinformatics#142 - this should fix the wrong ORF calculation for cases when the CDS was open at the 5' end. * Fixed previous commit (always for EI-CoreBioinformatics#142) * EI-CoreBioinformatics#142: corrected and tested the issue with one-off exons, for padding. * This should fix and test EI-CoreBioinformatics#142 for good. * Removed spurious warning/error messages * EI-CoreBioinformatics#142: solved a bug which caused truncated transcripts at the 5' end not to be padded. * EI-CoreBioinformatics#142: solved a problem which caused a false abort for transcripts on the - strand with changed stop codon. * EI-CoreBioinformatics#142: fixing previous commit * Pushing the fix for EI-CoreBioinformatics#182 onto the development branch * Fix EI-CoreBioinformatics#183 * Fix EI-CoreBioinformatics#183 and previous commit * EI-CoreBioinformatics#183: now Mikado configure will set a seed when generating the configuration file. The seed will be explicitly mentioned in the log. * EI-CoreBioinformatics#177: made ORF loading slightly faster with pysam. 
Also made XML serialisation much faster using SQL sessions and multiprocessing.Pool instead of queues. * Solved annoying bug that caused Mikado to crash with TAIR GFF3s.
Hi @lucventurini,
It looks like there is an issue at the Mikado serialise stage when using the Prodigal GFF. I also got this error with an earlier stable Mikado commit, mikado-20190325_c940de1, which we are using for the wheat accessions.
Please see the error and logs below.
CMD:
WD:
ERROR:
When I look at the line with id '38_4' (see below, taken from prodigal output - mikado_prepared.fasta.prodigal.gff), the phase is 0 (which is valid) and it looks fine to me.
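For anyone wanting to repeat that check, here is a minimal Python sketch of it. It assumes a standard nine-column, tab-separated GFF with an `ID=` entry in the attributes column; the helper names `find_record` and `phase_is_valid` are hypothetical, and the sample line is invented for illustration, not copied from the real Prodigal output.

```python
VALID_PHASES = {"0", "1", "2"}  # the only phases a CDS line may carry


def parse_attributes(field):
    """Turn a GFF attributes string such as 'ID=38_4;partial=00;' into a dict."""
    pairs = (item.split("=", 1) for item in field.strip().split(";") if "=" in item)
    return {key: value for key, value in pairs}


def find_record(lines, record_id):
    """Return the first GFF record whose ID attribute matches, or None."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line
        attributes = parse_attributes(columns[8])
        if attributes.get("ID") == record_id:
            return {
                "seqid": columns[0],
                "type": columns[2],
                "start": int(columns[3]),
                "end": int(columns[4]),
                "strand": columns[6],
                "phase": columns[7],
                "attributes": attributes,
            }
    return None


def phase_is_valid(record):
    """True if the phase column holds 0, 1 or 2."""
    return record["phase"] in VALID_PHASES


# Invented example line (hypothetical values), standing in for the real file.
example = [
    "##gff-version  3\n",
    "transcript_38\tProdigal\tCDS\t10\t309\t25.1\t-\t0\tID=38_4;partial=00;start_type=GTG;\n",
]

record = find_record(example, "38_4")
print(record["strand"], record["phase"], phase_is_valid(record))
print(record["attributes"])
```

On a real file, the `example` list can be swapped for an open file handle. A valid phase only rules out one possible cause, so printing the strand and the remaining attributes alongside it may help narrow things down.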
Can you please look into this?
Thanks,
Gemy