Static random seed #183

lucventurini · 2019-06-17T09:48:37Z

Currently, Mikado does not allow setting a random seed. This is a problem: there is no way to pass a value from outside and have the program stick to it, so Mikado runs can never be truly repeated.

* This should address #173 (both configuration file and docs) and #158
* Fix #181 and small bug fix for parsing Mikado annotations.
* Progress for #142 - this should fix the wrong ORF calculation for cases when the CDS was open at the 5' end.
* Fixed previous commit (always for #142)
* #142: corrected and tested the issue with one-off exons, for padding.
* This should fix and test #142 for good.
* Removed spurious warning/error messages
* #142: solved a bug which caused truncated transcripts at the 5' end not to be padded.
* #142: solved a problem which caused a false abort for transcripts on the - strand with changed stop codon.
* #142: fixing previous commit
* Pushing the fix for #182 onto the development branch
* Fix #183
* Fix #183 and previous commit
* #183: now Mikado configure will set a seed when generating the configuration file. The seed will be explicitly mentioned in the log.
* #177: made ORF loading slightly faster with pysam. Also made XML serialisation much faster using SQL sessions and multiprocessing.Pool instead of queues.
* Solved annoying bug that caused Mikado to crash with TAIR GFF3s.
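The #183 entry above (configure sets a seed and reports it in the log) could look roughly like the following. This is only a sketch, not Mikado's actual code: `resolve_seed`, `build_config`, the `"seed"` key and the 32-bit upper bound are all assumptions made for illustration.

```python
import logging
import random

logger = logging.getLogger("configure_sketch")

# Assumed upper bound for a seed; chosen here only for illustration.
MAX_SEED = 2**32 - 1


def resolve_seed(seed=None):
    """Return the seed to store in the configuration, generating one if needed."""
    if seed is None:
        # SystemRandom draws from the OS entropy source, so the generated
        # value is not affected by any earlier call to random.seed().
        seed = random.SystemRandom().randint(0, MAX_SEED)
    elif not isinstance(seed, int) or not 0 <= seed <= MAX_SEED:
        raise ValueError(f"Invalid seed: {seed!r}")
    # Mention the seed explicitly in the log so the run can be repeated.
    logger.info("Random seed: %d", seed)
    return seed


def build_config(seed=None):
    """Hypothetical stand-in for the generated configuration."""
    return {"seed": resolve_seed(seed)}
```

Writing the resolved value into the configuration file, rather than only using it in memory, is what lets a later run pick up the very same seed.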

@cschuh

* Solved a small bug in the Gene class
* This commit should fix some of the performance issues found in Mikado compare when testing in the all vs all (issue #166).
* Updated the CHANGELOG.
* Slight improvements to the generic GFLine class and to the to_gff wrapper
* Solved some assorted bugs, from stop_codon parsing in GTF2 (for Augustus) to avoiding a very costly pragma check on MIDX databases.
* Now Mikado util stats will only return one value for the mode, making the table parsable
* Solved some small bugs introduced by changing the mode for mikado util stats
* Dropping automated support for Python3.5. The conda environment cannot be created successfully, too many packages have not been updated in the original repositories.
* Updating the conda environment to reflect that only Python>=3.6 is now accepted
* Various fixes for managing correctly BED12 files.
* Fix for the previous commit breaking TRAVIS
* Switched to PySam for loading and fetching from genome files. Also, improved massively the speed of tests.
* Fixed previous commit
* Fixed travis bug
* Refactoring of check_index for Mikado compare (#166) and fix for #172
* Now Mikado will merge touching (NOT overlapping) exons coming from BED12 files. This should fix an issue with halLiftover
* This commit should fix a bunch of tests for when Mikado is installed with SUDO privileges (#137) potentially also fixing #172.
* Corrected a bug in the printing of transcriptomic BED12 files, corrected a bug in the serialisation of ORFs
* Fixed previous breakage
* Moved the code for checking the index into gene_dict. Also, now GeneDict allows access to positions as well.
* Minor edit to assigner
* Fixing previously broken commit
* Solving a bug which rendered the exclude_utr/protein_coding flags of mikado compare useless.
* Adding the GZI index to the tests directory to avoid permission errors. Addressing #175
* Corrected some testing. Moreover, now Mikado supports the BED12+1 format (ie gffread --bed output)
* Adding a maximum intron length for the default scoring configuration files.
* BROKEN. Proceeding on #142. Now the padding algorithm is aware of where a transcript finishes (intron vs exon). Moreover, we need to change the data structure for padding to a *directional* graph and keep in mind the distance needed to pad a transcript, to solve ambiguous cases in a deterministic (rather than random) way.
* Issue #174: modification to the abstractlocus.py file, to try to solve the issue found by @cschuh.
* #174: this should provide a solution to the issue, which is however only temporary. To be tested.
* #174: making the implicit "for" cycle explicit. Hopefully this should help pinpoint the error better.
* #174: peppered the failing block with try-except statements.
* #174: this should solve it. Now missing external scores in the database will cause Mikado to explicitly fail.
* Fixed #176
* BROKEN. Progress on #142, the code runs, but the tests are broken. **This might be legitimate as we changed the behaviour of the code.**
* Closing #155.
* #174: Now Mikado pick will die informatively if the SQLite3 database has not been found.
* #166: fixed some issues with self-compare
* BROKEN. We have to verify that the padding functions also on the 5' end, but we need to make a new test for that. The test development is in progress.
* The padding now should be tested and correct.
* Fixed previous commit. This should fix #142.
* Development (#178)
* Update Singularity.centos.def Changed python to python3 during %post, otherwise it will use the system python2.7...
* Fixed small bug in external metrics handling
* Update Singularity.centos.def
* Development (#184)

lucventurini · 2019-06-18T22:40:46Z

Passing the seed to Mikado pick fails to set it correctly. Reopening to fix this.

* Fix #189
* Fix #186
* #183: added static seed from CLI for pick.
* #186: introduced a maximum intron length parameter for mikado prepare (prepare/max_intron_length), with a default value of 1M bps and a minimum value of 20.
* #186: there was a very serious bug in the evaluation of negative truncated ORFs, which potentially led to a lot of them being called incorrectly at the serialisation stage. Refactored the function responsible for the mishap and added a unit-test which confirmed fixing of the bug.
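A minimal sketch of how a seed given on the command line might take precedence over the one stored in the configuration. The parser, the `--seed` option name, the `"seed"` key and both function names are assumptions for illustration and are not taken from Mikado's code.

```python
import argparse
import random


def parse_args(argv):
    """Hypothetical parser with a single optional seed argument."""
    parser = argparse.ArgumentParser(prog="pick_sketch")
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed; overrides the configuration file.")
    return parser.parse_args(argv)


def effective_seed(args, config):
    """Use the command-line value if given, else fall back to the configuration."""
    # Compare against None explicitly: a seed of 0 is valid but falsy.
    seed = args.seed if args.seed is not None else config.get("seed")
    if seed is None:
        raise ValueError("No seed on the command line or in the configuration")
    random.seed(seed)
    return seed
```

For example, `effective_seed(parse_args(["--seed", "7"]), {"seed": 1})` uses 7, while an empty argument list falls back to the configured value.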

…eseed (a pretty inexpensive operation) each time before drawing. This ensures that we will always have a reproducible result.
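The idea of reseeding right before each draw can be sketched as below. The function name and the choice to sort the candidates are assumptions for illustration; the commit itself is not shown here.

```python
import random


def draw_tie_breaker(candidates, seed):
    """Pick one candidate at random, reseeding immediately before the draw."""
    # Reseeding resets the generator state, so the draw no longer depends on
    # how many random numbers were consumed earlier in the run.
    random.seed(seed)
    # Sorting removes any dependence on the order in which candidates arrive
    # (for instance when they come from a set).
    return random.choice(sorted(candidates))
```

With this pattern, two calls with the same seed and the same candidates return the same element regardless of what happened to the global generator in between.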

Changes:
- Mikado will no longer hang if a subprocess dies; it will exit immediately.
- Ensured that Mikado runs are fully reproducible using a random seed (EI-CoreBioinformatics#183)
- Solved a bug that crashed Mikado prepare in the presence of incorrect transcripts
- Removed the cause for locked interprocess-exchange databases in Mikado pick. Switched to WAL and increased the timeout limit.
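For the last point, switching an SQLite database to WAL and raising the lock timeout can be done with the standard `sqlite3` module roughly as follows. The function name and the 60-second timeout are illustrative assumptions, not the values used in Mikado.

```python
import sqlite3


def open_exchange_db(path, timeout=60.0):
    """Open an SQLite database in WAL mode with a longer lock timeout."""
    # `timeout` is how long a connection waits on a locked database
    # before raising "database is locked".
    conn = sqlite3.connect(path, timeout=timeout)
    # The pragma returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode
```

In WAL mode readers do not block the writer and the writer does not block readers, which is why it tends to help when several processes exchange data through one file.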

…s#183 In this pull request, there are multiple enhancements to the code base. This does **not** close EI-CoreBioinformatics#208.

General improvements:
- The superlocus algorithms have been revamped:
  - the transcript graph definition should now have O(nlogn) complexity, rather than O(n^2). This should substantially reduce runtimes and potentially even memory usage in complex loci.
  - removed the "reduce_method_three" method for complex loci, which was untested and probably hardly ever used. The other two remaining methods have been updated and properly tested.
  - The first reduction method, in particular, should now be faster as well.

For EI-CoreBioinformatics#208:
- Now "=", "_" and "n" will be automatically added to the valid class codes **only when executing a reference-update run**. This should prevent runaway behaviour in normal runs.
- the Locus class will remove redundancies (ie transcripts that are strictly equal) after padding

Still problematic: adding "=", "_" and "n" as valid class codes could theoretically cause grief given the hard limit on the number of transcripts per locus. Briefly, we risk adding only minor variants of the same transcript, collapsing them into a single one, and losing all ASEs. This aspect will require careful thought.

Various:
- Solved a serious bug in `mikado prepare`: in multiprocessing mode, the CDS was not appropriately kept when strip-cds was disabled.
- Solved a bug in the expansion of transcripts (template having a non-terminal exon with the same start coordinate of the last exon of the transcript to expand)
- Python 3.7.4 does **not** produce functional .so files with cythonize. Blacklisting it.
- Added and configured a proper .coveragerc file; fixed .travis.yml and the environment YAML file
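To illustrate how an overlap graph can be built without comparing every pair of transcripts, here is a generic sort-and-sweep sketch. It is not the superlocus code from this pull request; the function name and the input layout (a dict of inclusive `(start, end)` tuples) are assumptions. Sorting costs O(n log n), and the remaining work is proportional to the number of overlaps actually reported.

```python
import heapq


def overlap_edges(intervals):
    """Return the set of overlapping pairs among named (start, end) intervals."""
    edges = set()
    active = []  # min-heap of (end, name) for intervals that may still overlap
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1]):
        # Intervals ending before the current start cannot overlap this
        # interval or any later one, because later starts are never smaller.
        while active and active[0][0] < start:
            heapq.heappop(active)
        # Everything left started earlier (or at the same position) and ends
        # at or after the current start, so it overlaps the current interval.
        for _, other in active:
            edges.add(frozenset((name, other)))
        heapq.heappush(active, (end, name))
    return edges
```

When most transcripts in a locus do not overlap each other, this avoids the quadratic all-versus-all comparison; in the worst case where everything overlaps, the number of edges itself is quadratic.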

lucventurini assigned lucventurini and gemygk Jun 17, 2019

lucventurini added this to the 1.5 milestone Jun 18, 2019

lucventurini closed this as completed Jun 18, 2019

lucventurini reopened this Jun 18, 2019

lucventurini closed this as completed in 9e5e517 Jun 19, 2019

lucventurini added a commit to lucventurini/mikado that referenced this issue Aug 2, 2019

This should fix EI-CoreBioinformatics#205 and EI-CoreBioinformatics#183.

2f61779

lucventurini mentioned this issue Aug 5, 2019

Issue 205 lucventurini/mikado#2

Merged

lucventurini added a commit to lucventurini/mikado that referenced this issue Aug 12, 2019

Increasing coverage of unit tests (EI-CoreBioinformatics#183)

fa3b4a9
