Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detection of known CNS errors when failed run #1018

Merged
merged 30 commits into from
Sep 19, 2024
Merged

Detection of known CNS errors when failed run #1018

merged 30 commits into from
Sep 19, 2024

Conversation

VGPReys
Copy link
Contributor

@VGPReys VGPReys commented Sep 11, 2024

  • You have sticked to Python. Please talk to us before adding other programming languages to HADDOCK3
  • Your PR is about CNS
  • Your code is well documented: proper docstrings and explanatory comments for those tricky parts
  • You structured the code into small functions as much as possible. You can use classes if there is a (state) purpose
  • Your code follows our coding style
  • You wrote tests for the new code
  • tox tests pass. Run tox command inside the repository folder
  • -test.cfg examples execute without errors. Inside examples/ run python run_tests.py -b
  • PR does not add any dependencies, unless permission granted by the HADDOCK team
  • PR does not break licensing
  • Your PR is about writing documentation for already existing code 🔥
  • Your PR is about writing tests for already existing code :godmode:

This PR add a Known CNS error detection machinery triggered when tolerance threshold is not met (basically when the run will stop), which is often the case when the user will provide some weird input files.
In the module output directory, *.cnserr / *.cnserr.gz files are read and searched for known errors.

The new script src/haddock/gear/known_cns_errors.py holds the check functions and the list of known errors.
Can be simply updated by newly discovered error types by adding the string to search as key and the hint message to be returned to the user as values.

The CNSJob class has been updated to search for those errors at runtime, when STDOUT is still in memory.
If an error is detected, dump a .cnserr file that is later compressed, containing the error.
CNS modules have been modified to send the path to the potential xxxx.cnserr file, so we know where to write it.

Also has uni and integration tests to check that it is properly functional.

Closes #1012

@VGPReys VGPReys requested a review from a team September 11, 2024 14:56
src/haddock/gear/known_cns_errors.py Outdated Show resolved Hide resolved
src/haddock/gear/known_cns_errors.py Outdated Show resolved Hide resolved
src/haddock/gear/known_cns_errors.py Outdated Show resolved Hide resolved
src/haddock/gear/known_cns_errors.py Show resolved Hide resolved
src/haddock/gear/known_cns_errors.py Outdated Show resolved Hide resolved
VGPReys and others added 2 commits September 12, 2024 08:10
Co-authored-by: Anna Engel <113177776+AljaLEngel@users.noreply.github.com>
Co-authored-by: Anna Engel <113177776+AljaLEngel@users.noreply.github.com>
@VGPReys VGPReys marked this pull request as ready for review September 12, 2024 06:11
amjjbonvin
amjjbonvin previously approved these changes Sep 12, 2024
src/haddock/gear/known_cns_errors.py Outdated Show resolved Hide resolved
src/haddock/modules/__init__.py Show resolved Hide resolved
@VGPReys VGPReys self-assigned this Sep 12, 2024
AljaLEngel
AljaLEngel previously approved these changes Sep 12, 2024
@amjjbonvin
Copy link
Member

This will add additional I/O / CPU requirements... May-be only worth to do it if an error is detected and not systematically for all out.gz files

@VGPReys
Copy link
Contributor Author

VGPReys commented Sep 12, 2024

Yes, already the case
Reading the *.out or .out.gz is only triggered when the tolerance threshold is not met and the run is about to fail.

@VGPReys VGPReys requested a review from amjjbonvin September 12, 2024 10:35
src/haddock/core/exceptions.py Show resolved Hide resolved
src/haddock/gear/known_cns_errors.py Show resolved Hide resolved
src/haddock/gear/known_cns_errors.py Outdated Show resolved Hide resolved
src/haddock/modules/__init__.py Outdated Show resolved Hide resolved
AljaLEngel
AljaLEngel previously approved these changes Sep 17, 2024
@VGPReys VGPReys added CNS Improvements in the CNS engine enhancement Enhancing an existing feature of adding a new one workflow All the general parts of HADDOCK3 not related to any module in particular labels Sep 17, 2024
@VGPReys VGPReys requested a review from a team September 17, 2024 09:34
Copy link
Member

@amjjbonvin amjjbonvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test in batch modo to check that the .err files generated by slurm are not interfering with this. Those should be cleaned by the machinery ideally (if not empty)

src/haddock/gear/clean_steps.py Show resolved Hide resolved
@VGPReys
Copy link
Contributor Author

VGPReys commented Sep 17, 2024

Did you test in batch modo to check that the .err files generated by slurm are not interfering with this. Those should be cleaned by the machinery ideally (if not empty)

Good point.
Did not try.
Should I modify the extension of CNS related .err files to .cnserr (or .cns.err) ?

@amjjbonvin
Copy link
Member

Did you test in batch modo to check that the .err files generated by slurm are not interfering with this. Those should be cleaned by the machinery ideally (if not empty)

Good point. Did not try. Should I modify the extension of CNS related .err files to .cnserr (or .cns.err) ?

Maybe indeed .cnserr - then the .err files (if generated) can be cleaned.

@VGPReys VGPReys requested a review from a team September 17, 2024 15:04
@VGPReys VGPReys requested a review from amjjbonvin September 18, 2024 08:32
amjjbonvin
amjjbonvin previously approved these changes Sep 18, 2024
Copy link
Member

@rvhonorato rvhonorato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better! Just some comments

src/haddock/libs/libsubprocess.py Outdated Show resolved Hide resolved
src/haddock/libs/libsubprocess.py Outdated Show resolved Hide resolved
src/haddock/libs/libsubprocess.py Show resolved Hide resolved
src/haddock/gear/known_cns_errors.py Show resolved Hide resolved
src/haddock/libs/libsubprocess.py Outdated Show resolved Hide resolved
src/haddock/modules/__init__.py Outdated Show resolved Hide resolved
@VGPReys VGPReys merged commit 3af39a4 into main Sep 19, 2024
4 checks passed
@VGPReys VGPReys deleted the known-cns-errors branch September 19, 2024 07:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CNS Improvements in the CNS engine enhancement Enhancing an existing feature of adding a new one workflow All the general parts of HADDOCK3 not related to any module in particular
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve CNS-related error messages
4 participants