Engine: fix bug that allowed non-storable inputs to be passed to process #5532
The basic assumption for a `Process` in `aiida-core` is that all of its inputs should be storable in the database as nodes. Under the current link model, this means that they should be instances of the `Data` class or subclasses thereof. There is a notable exception for ports that are explicitly marked as `non_db=True`, in which case the value is not linked as a node, but is either stored as an attribute directly on the process node itself or not stored at all.

This basic rule was never explicitly enforced, which made it possible to define processes that would happily take non-storable inputs. Such an input would not get stored in the database, but would still be available during the process's lifetime through the `inputs` property, allowing it to be used. This will most likely result in unintentional loss of provenance.

The reason is that the default `valid_type` of the top-level inputs namespace of the `Process` class was never set to `Data`. As a result, any type would be accepted for a `Process` and all its subclasses unless the valid type of a port was explicitly overridden. In particular, normal dynamic namespaces would accept even non-storable types just fine. Setting `valid_type=Data` for the input namespace of the `Process` class therefore fixes the problem.
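To illustrate the effect of the default `valid_type`, here is a minimal, self-contained sketch. It is a toy model, not the `aiida-core` implementation: the `Data`, `Int` and `PortNamespace` classes below are hypothetical stand-ins written only for this example, and the assumption that a `non_db` namespace skips the `Data` check is a simplification made for illustration.

```python
class Data:
    """Hypothetical stand-in for a storable node type (not aiida's class)."""

    def __init__(self, value):
        self.value = value


class Int(Data):
    """Hypothetical storable integer, a subclass of the stand-in Data."""


class PortNamespace:
    """Toy dynamic namespace that checks each value against ``valid_type``."""

    def __init__(self, valid_type=None, non_db=False):
        self.valid_type = valid_type
        self.non_db = non_db

    def validate(self, inputs):
        """Return a list of error messages; an empty list means accepted."""
        # Assumption for this sketch: non_db namespaces and namespaces
        # without a valid_type perform no type check at all.
        if self.non_db or self.valid_type is None:
            return []
        errors = []
        for name, value in inputs.items():
            if not isinstance(value, self.valid_type):
                errors.append(
                    f"input '{name}' has type {type(value).__name__}, "
                    f"expected {self.valid_type.__name__}"
                )
        return errors


# One storable input and one plain Python int that cannot be stored as a node.
inputs = {'a': Int(1), 'b': 2}

before_fix = PortNamespace(valid_type=None)            # no default valid_type
after_fix = PortNamespace(valid_type=Data)             # default set to Data
explicit_non_db = PortNamespace(valid_type=Data, non_db=True)

print(before_fix.validate(inputs))       # nothing is rejected
print(after_fix.validate(inputs))        # the plain int is reported
print(explicit_non_db.validate(inputs))  # opted out, nothing is rejected
```

In this sketch the namespace without a `valid_type` silently accepts the plain `int`, which mirrors the provenance loss described above, while the namespace with `valid_type=Data` reports it. A port that genuinely needs a non-storable value has to opt out explicitly.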
Thanks @sphuber, this looks fine to me.
Just for me to understand: this only affected sub-namespaces and ports therein or also input ports defined at the top level?
If the latter, then I could imagine some plugins already relying on this.
Let me know if you want me to approve or rather wait for the review by Marnik.
spec.input_namespace('namespace.nested', dynamic=True)
spec.input('namespace.a', valid_type=int)
spec.input('namespace.c', valid_type=dict)
spec.input_namespace('namespace.nested', dynamic=True, non_db=True)
Hmyeah, interesting that there were even tests that relied on this behavior ;-)
This affected the top-level input namespace so this includes all sub-namespaces and input ports at the top-level. If a plugin indeed relied on specifying an input port without a
I think we can merge this with just a single review. Thinking of it now, Marnik will actually be away for a few weeks.
I agree that this is a pattern we would like developers to get away from; just saying that this may break plugins. That said, a quick search on GitHub for strings like
Just wanted to point out one use case I stumbled upon that is somewhat related to this: Having a dynamic input namespace (e.g. This approach still seems to be allowed, even after this fix, although the
…ess (#5532) Cherry-pick: 5c1eb3f
Fixes #5128