Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process functions: Add serialization for Python base type defaults #5744

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Nov 3, 2022

Fixes #5743

As of 0680209, process functions will automatically add the to_aiida_type serializer for arguments such that Python base types are automatically converted to the corresponding AiiDA data type if not already a Data instance.

The same was not added for defaults however, so if a user defined a calcfunction with a Python base type default, an exception would be raised once the dynamic FunctionProcess would be constructed. The default would be passed to the InputPort constructor as is and so would not validate against the type check.

Here we update the dynamic process spec generation to detect if a default is provided for an argument and if so serialize it to the node equivalent. Note that this is done indirectly through a lambda as using constructed node instances as defaults in process specs can cause issues for example when the spec is exposed into another process spec, the ports are deep copied, including the defaults of the ports and deep copying of data nodes is not supported.

@sphuber sphuber force-pushed the fix/5743/process-function-default-serialization branch from e55aafd to 3ff58f4 Compare November 3, 2022 22:44
@sphuber sphuber requested a review from unkcpz November 4, 2022 08:00
@sphuber
Copy link
Contributor Author

sphuber commented Nov 9, 2022

@unkcpz it would be great if you could give this a look

@superstar54
Copy link
Member

@sphuber @unkcpz I can review this PR. I am interested in this.

@superstar54 superstar54 self-requested a review November 9, 2022 19:13
Copy link
Member

@superstar54 superstar54 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this is done indirectly through a lambda as using constructed node instances as defaults in process specs can cause issues for example when the spec is exposed into another process spec, the ports are deep copied, including the defaults of the ports and deep copying of data nodes is not supported.

Could you add a test in this case?

aiida/engine/processes/functions.py Show resolved Hide resolved
tests/engine/test_process_function.py Outdated Show resolved Hide resolved
tests/engine/test_process_function.py Outdated Show resolved Hide resolved
@sphuber
Copy link
Contributor Author

sphuber commented Nov 9, 2022

Thanks a lot for the review @superstar54 !

Could you add a test in this case?

Do you mean to add a test that verifies that bad things happen when a node instance is provided as a default? If so, I think that is a bit out of scope for this PR and should already be tested elsewhere. I will try and find the related issues/PRs.

Edit: I found the relevant PR: #3466
Essentially this occurred in the context of WorkChains where if you define an unstored node as a default, when the WorkChain is exposed into another one, its input ports (including the defaults) are copied, and for this deepcopy is used, to make them fully independent. However, this led to weird side-effects.

Just as in normal Python it is not recommended to use mutable defaults (for context) we recommend not to use unstored nodes as defaults. That's why we introduced a warning to be emitted when users define them anyway and the recommended solution is to use a lambda instead. That is why in this PR I had to make a special case for defaults specified as lambda, e.g.:

from aiida.engine import calcfunction
from aiida.orm import Int

@calcfunction
def add(a, b = lambda: Int(2)):
    return a + b

Please let me know if I misunderstood your comment though.

As of 0680209, process functions will
automatically add the `to_aiida_type` serializer for arguments such that
Python base types are automatically converted to the corresponding AiiDA
data type if not already a `Data` instance.

The same was not added for defaults however, so if a user defined a
calcfunction with a Python base type default, an exception would be
raised once the dynamic `FunctionProcess` would be constructed. The
default would be passed to the `InputPort` constructor as is and so
would not validate against the type check.

Here we update the dynamic process spec generation to detect if a
default is provided for an argument and if so serialize it to the node
equivalent. Note that this is done indirectly through a lambda as using
constructed node instances as defaults in process specs can cause issues
for example when the spec is exposed into another process spec, the
ports are deep copied, including the defaults of the ports and deep
copying of data nodes is not supported.
@sphuber sphuber force-pushed the fix/5743/process-function-default-serialization branch from 3ff58f4 to 025d91a Compare November 9, 2022 22:47
@superstar54
Copy link
Member

superstar54 commented Nov 10, 2022

@sphuber Thanks, clear to me now!

@superstar54 superstar54 reopened this Nov 10, 2022
@codecov
Copy link

codecov bot commented Nov 10, 2022

Codecov Report

Base: 79.85% // Head: 79.86% // Increases project coverage by +0.01% 🎉

Coverage data is based on head (025d91a) compared to base (ed620c0).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5744      +/-   ##
==========================================
+ Coverage   79.85%   79.86%   +0.01%     
==========================================
  Files         496      496              
  Lines       37015    37019       +4     
==========================================
+ Hits        29556    29561       +5     
+ Misses       7459     7458       -1     
Impacted Files Coverage Δ
aiida/engine/processes/functions.py 93.53% <100.00%> (+0.16%) ⬆️
aiida/transports/plugins/local.py 82.84% <0.00%> (+0.25%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@sphuber
Copy link
Contributor Author

sphuber commented Nov 10, 2022

@sphuber Thanks, clear to me now!

Cheers. If I have addressed your requests satisfactorily, could you please approve the request?

@sphuber sphuber requested a review from superstar54 November 10, 2022 07:43
Copy link
Member

@superstar54 superstar54 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sphuber Thanks! LGTM.

@sphuber sphuber merged commit 920fae7 into aiidateam:main Nov 10, 2022
@sphuber sphuber deleted the fix/5743/process-function-default-serialization branch November 10, 2022 07:50
Copy link
Member

@unkcpz unkcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@superstar54 @sphuber thanks! I am late for the party 😄. I have one question about the to_aiida_type.

Comment on lines +263 to +267
if (
default is not None and default != UNSPECIFIED and not isinstance(default, Data) and
not callable(default)
):
indirect_default = lambda value=default: to_aiida_type(value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think aiida data type can apply to the Data and get the same type, which means we don't need isinstance(default, Data)?
Suppose value = orm.Int(5) then to_aiida_type(value) should still work. If the to_aiida_type is designed to be should only apply to the raw python data type (which I think not, it should also support pass orm.Str to it and return the aiida type), then this isinstance(default, Data) can be put into to_aiida_type and raise an exception.

Also, is that callable(default) would rule out the possible to serialize the python function to AiiDA type if it get supported in the future?

Copy link
Contributor Author

@sphuber sphuber Nov 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think aiida data type can apply to the Data and get the same type, which means we don't need isinstance(default, Data)?

Nope, that raises:

In [1]: from aiida.orm import to_aiida_type

In [2]: to_aiida_type(Int(2))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-d0b2d1ab4290> in <module>
----> 1 to_aiida_type(Int(2))

/usr/lib/python3.9/functools.py in wrapper(*args, **kw)
    886                             '1 positional argument')
    887 
--> 888         return dispatch(args[0].__class__)(*args, **kw)
    889 
    890     funcname = getattr(func, '__name__', 'singledispatch function')

~/code/aiida/env/dev/aiida-core/aiida/orm/nodes/data/base.py in to_aiida_type(value)
     19 def to_aiida_type(value):
     20     """Turns basic Python types (str, int, float, bool) into the corresponding AiiDA types."""
---> 21     raise TypeError(f'Cannot convert value of type {type(value)} to AiiDA type.')
     22 
     23 

TypeError: Cannot convert value of type <class 'aiida.orm.nodes.data.int.Int'> to AiiDA type.

Copy link
Contributor Author

@sphuber sphuber Nov 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is that callable(default) would rule out the possible to serialize the python function to AiiDA type if it get supported in the future?

You mean that an arbitrary callable can get stored as a Data node? I guess you are right, but so far we haven't really had use cases where users want to pass arbitrary callables to process functions.

Now that you mention it, I have actually implemented such a Data plugin in aiida-shell for a different purpose (see this PR sphuber/aiida-shell#19). It allows to store a function that is then called by the parser as a hook, allowing to provide custom parsing on the fly. For now it is used as an input to a CalcJob but I guess it could also be used as an input to a process function. I guess you are right that if we were to add this plugin to aiida-core then people could not rely on the automatic serialization of the default, but they can always do it manually, e.g.:

from aiida_shell.data import PickledData
from aiida.engine import calcfunction

def my_callable():
    print('hello world')

@calcfunction
def test(func: callable = lambda: PickledData(my_callable)):
    func.load()()

test()  # will print 'hello world'

But since this is functionality that doesn't exist in aiida-core yet, I think it is fine to keep as is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is just a nitpick from my side. If the user knows things like serializing the function they won't have a problem finding how it should be done of passing it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for defaults of process function arguments that are Python base types
3 participants