Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORM: ProcessNode.is_valid_cache is False for unsealed nodes #6147

Merged
merged 2 commits into from
Oct 14, 2023

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Oct 13, 2023

Fixes #6100

When a ProcessNode is not yet sealed, it has not officially been
terminated. At this point it cannot yet be considered a valid cache
source. However, this condition was not considered by the property
ProcessNode.is_valid_cache.

This bug manifested itself in very rare situations where a race
condition could lead to a process being cached from an unsealed node.
When a node is stored from the cache, it copies all the attribute except
for the sealed key and adds the outputs. The sealing is then left to the
engine which will complete the cached process as if it had run normally.
The problem arises when the cache source was not yet sealed, and so the
outputs had not yet been added. The cached node will then miss the
output nodes.

The `Sealable` mixin is used by the `ProcessNode` which allows it to be
sealed. By having the `seal` method return `self`, which will be the
`ProcessNode` instance, it brings the behavior on par with the `store`
method.
When a `ProcessNode` is not yet sealed, it has not officially been
terminated. At this point it cannot yet be considered a valid cache
source. However, this condition was not considered by the property
`ProcessNode.is_valid_cache`.

This bug manifested itself in very rare situations where a race
condition could lead to a process being cached from an unsealed node.
When a node is stored from the cache, it copies all the attribute except
for the sealed key and adds the outputs. The sealing is then left to the
engine which will complete the cached process as if it had run normally.
The problem arises when the cache source was not yet sealed, and so the
outputs had not yet been added. The cached node will then miss the
output nodes.
@sphuber sphuber requested a review from unkcpz October 13, 2023 07:31
@sphuber sphuber force-pushed the fix/6100/caching-unsealed branch from c99ef8a to 2c2c095 Compare October 13, 2023 07:55
@unkcpz
Copy link
Member

unkcpz commented Oct 13, 2023

Thanks @sphuber, I'll take a look later today.

Copy link
Member

@unkcpz unkcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, it is an elegant solution. Looks all good to me!

@sphuber sphuber merged commit a1f456d into aiidateam:main Oct 14, 2023
17 checks passed
@sphuber sphuber deleted the fix/6100/caching-unsealed branch October 14, 2023 06:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CalcJob has exit_status=0 but missing required output node
2 participants