-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CalcJob
: Fix bug causing exception after restarting daemon
#5886
CalcJob
: Fix bug causing exception after restarting daemon
#5886
Conversation
This is a fix for a critical bug that affects v2.2.0 and up. Ideally I want to have this reviewed and released a.s.a.p. as it affects the latest releases heavily. Anyone got time to review it @chrisjsewell @mbercx @ramirezfranciscof ? |
In `v2.2.0`, a feature was added to attach monitors to a `CalcJob` for which the `Waiting` (see `aiida.engine.processes.calcjob.tasks`) state of the process was modified. The `_command` attribute was moved inside the `data` argument passed to the constructor and was initialized only in the constructor. However, when the process is loaded from a serialized checkpoint, such as after a daemon restart, the constructor is not called but `load_instance_state` is. In this case, the `_command` attribute would not be set and when referenced it would cause an exception. The solution is to add the `_command` attribute to the `auto_persist` decorator as this will be called on `load_instance_state` and will reinitialize these attributes from the persisted checkpoint. A similar problem was lurking with the `_monitors` attribute which was also only initialized in the constructor. Since this has a custom data type `CalcJobMonitors` which may not be serializable by `auto_persist` a property `monitors` is added that will recreate the attribute and populate it with the `CalcJobMonitors` instance with the monitors specified in the inputs.
9ad9042
to
6f161db
Compare
@ramirezfranciscof can you please have a look a.s.a.p. As mentioned in the meeting, this bug is critically breaking the last release and we need to ship a patch a.s.a.p. Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks fine to me, at a glance. But as discussed, you guys worked on this code
Cheers @chrisjsewell . I will hold off till tomorrow afternoon to give @ramirezfranciscof the chance to also have a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All seems good to me! The only thing I was thinking is that I would watch out if we have many test that need to interact with the daemon, this might slow down the testing process, right? But in that case we just need to move them to the nightly testing set.
Thanks for the review @ramirezfranciscof . I agree that these tests take quite a bit of time for a "unit test", but I think this one is really crucial. Making sure that |
Fixes #5882
In
v2.2.0
, a feature was added to attach monitors to aCalcJob
for which theWaiting
(seeaiida.engine.processes.calcjob.tasks
) state of the process was modified. The_command
attribute was moved inside thedata
argument passed to the constructor and was initialized only in the constructor. However, when the process is loaded from a serialized checkpoint, such as after a daemon restart, the constructor is not called butload_instance_state
is. In this case, the_command
attribute would not be set and when referenced it would cause an exception.The solution is to add the
_command
attribute to theauto_persist
decorator as this will be called onload_instance_state
and will reinitialize these attributes from the persisted checkpoint.A similar problem was lurking with the
_monitors
attribute which was also only initialized in the constructor. Since this has a custom data typeCalcJobMonitors
which may not be serializable byauto_persist
a propertymonitors
is added that will recreate the attribute and populate it with theCalcJobMonitors
instance with the monitors specified in the inputs.