Detection of known CNS errors when failed run #1018
Co-authored-by: Anna Engel <113177776+AljaLEngel@users.noreply.github.com>
This will add extra I/O and CPU requirements. It may only be worth doing when an error is detected, rather than systematically for all out.gz files.
Yes, that is already the case.
Did you test in batch mode to check that the .err files generated by Slurm do not interfere with this? Ideally, those should be cleaned up by the machinery (if not empty).
Good point.
Maybe, indeed.
Much better! Just some comments
- `tox` tests pass. Run the `tox` command inside the repository folder.
- `-test.cfg` examples execute without errors. Inside `examples/`, run `python run_tests.py -b`.
This PR adds a known CNS error detection machinery, triggered when the `tolerance` threshold is not met (basically when the run will stop), which is often the case when the user provides some weird input files.

In the module output directory, `*.cnserr` / `*.cnserr.gz` files are read and searched for known errors.

The new script `src/haddock/gear/known_cns_errors.py` holds the check functions and the list of known errors. It can be updated with newly discovered error types simply by adding the string to search for as the key and the hint message to be returned to the user as the value.
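
As a rough illustration of the key/value layout described above, here is a minimal sketch. It is not the content of `known_cns_errors.py`: the dictionary entries and the function names (`find_known_error`, `read_error_file`, `scan_directory`) are hypothetical and only show how a string-to-hint mapping could be applied to `*.cnserr` and `*.cnserr.gz` files.

```python
import gzip
from pathlib import Path
from typing import Dict, Optional, Tuple

# Hypothetical entries for illustration only; the real search strings and
# hints live in src/haddock/gear/known_cns_errors.py.
KNOWN_ERRORS: Dict[str, str] = {
    "EXAMPLE ERROR STRING": "Hint: check the format of the input file.",
    "ANOTHER EXAMPLE STRING": "Hint: check the restraint definitions.",
}


def find_known_error(text: str) -> Optional[Tuple[str, str]]:
    """Return (search string, hint) for the first known error found in text."""
    for error_string, hint in KNOWN_ERRORS.items():
        if error_string in text:
            return error_string, hint
    return None


def read_error_file(path: Path) -> str:
    """Read a plain or gzip-compressed error file as text."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        return handle.read()


def scan_directory(directory: Path) -> Dict[str, str]:
    """Map each error file containing a known error to its hint message."""
    hints: Dict[str, str] = {}
    # The pattern matches both x.cnserr and x.cnserr.gz.
    for path in sorted(directory.glob("*.cnserr*")):
        found = find_known_error(read_error_file(path))
        if found is not None:
            hints[path.name] = found[1]
    return hints
```

With this layout, supporting a new error type only requires one more entry in the dictionary; the scanning code stays unchanged.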
The `CNSJob` class has been updated to search for those errors at runtime, while STDOUT is still in memory. If an error is detected, it dumps a `.cnserr` file containing the error, which is later compressed.

CNS modules have been modified to send the path to the potential `xxxx.cnserr` file, so we know where to write it.

The PR also includes unit and integration tests to check that the machinery works properly.
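
The runtime check can be pictured with the following sketch. It is not the actual `CNSJob` implementation: `run_and_check`, `compress`, and the dictionary entry are hypothetical names, and writing only the matching lines to the `.cnserr` file is an assumption about what "containing the error" means.

```python
import gzip
import shutil
import subprocess
import sys
from pathlib import Path
from typing import Dict, List, Optional

# Hypothetical entry for illustration only.
KNOWN_ERRORS: Dict[str, str] = {
    "EXAMPLE ERROR STRING": "Hint: check the format of the input file.",
}


def run_and_check(command: List[str], cnserr_path: Path) -> Optional[str]:
    """Run command, scan the captured stdout, and dump matches to cnserr_path.

    Returns the hint of the first known error found, or None. The file is
    only written when an error is detected.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    for error_string, hint in KNOWN_ERRORS.items():
        # stdout is still in memory here, so no output file is re-read.
        lines = [ln for ln in result.stdout.splitlines() if error_string in ln]
        if lines:
            cnserr_path.write_text("\n".join(lines) + "\n")
            return hint
    return None


def compress(path: Path) -> Path:
    """Gzip path to path + '.gz' and remove the uncompressed file."""
    gz_path = path.with_name(path.name + ".gz")
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()
    return gz_path


if __name__ == "__main__":
    # The caller supplies the target path, mirroring how the CNS modules
    # pass the location of the potential .cnserr file.
    fake_job = [sys.executable, "-c", "print('EXAMPLE ERROR STRING')"]
    print(run_and_check(fake_job, Path("job.cnserr")))
```

Passing the target path in from the caller keeps the job itself unaware of the module directory layout, which matches the description of modules sending the path to the job.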
Closes #1012