numpy arrays cannot be persisted in WorkChain context #3941
Comments
Found a simple solution: make AiiDALoader a subclass of [...]. But I think it is reasonable to assume that the checkpoints persisted in the database are safe? Of course, a potential attack vector is someone breaking into my database, finding my ongoing processes and making the daemon do strange stuff (only when it restarts, and only if the tampered checkpoint is not overwritten in the meantime).
We were already discussing with @greschd in #3709 simply switching to unsafe loading for checkpoints in order to support more object types. We also discussed the safety of it and, like you said, there are potential attack vectors. Currently one does not even have to break into your database; simply creating a malicious export and having you import it will suffice. We think, however, that this can be addressed by always ignoring checkpoint attributes when importing. This should never happen anyway, because it would mean the process had not yet terminated when it was exported, but just to be safe we drop any that we find.
Oops, sorry, I should have checked out #3709! Closing this issue as it is a duplicate.
The context of a WorkChain is serialized when checkpointed and deserialized on demand, e.g. when the daemon is restarted. At the moment only a few Python objects can be used in the context, mainly those that are JSON-compatible, for example lists/dictionaries of strings and integers. AiiDA data structures can be serialized because there are loaders and dumpers defined in serialize.py.
However, it would be good if numpy arrays could be used. At the moment they are serializable but cannot be deserialized. As a result, any workchain with a numpy array in the context will except when the daemon attempts to deserialize the checkpoint on restart.
This problem can be reproduced with the following snippet:
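To make "JSON-compatible" concrete, here is a minimal sketch of a check that a value survives a JSON round trip unchanged. The helper name is_json_compatible is hypothetical and is not part of AiiDA; the checkpoint machinery itself goes through YAML, so this only approximates which plain Python values are safe to put in the context.

```python
import json


def is_json_compatible(value):
    """Return True if `value` survives a JSON round trip unchanged.

    Hypothetical helper for illustration only; not an AiiDA function.
    """
    try:
        return json.loads(json.dumps(value)) == value
    except (TypeError, ValueError):
        # json.dumps raises TypeError for unsupported types such as bytes.
        return False


# Lists/dictionaries of strings and integers round-trip cleanly.
ok = is_json_compatible({"energies": [1, 2, 3], "label": "relax"})

# Raw bytes cannot be dumped to JSON at all.
not_ok = is_json_compatible({"raw": b"\x00\x01"})
```

Tuples are also rejected by this check because they come back as lists, which is the kind of silent type change worth catching before a checkpoint is written.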
This will raise an exception:
Perhaps the solution is to add additional loaders to serialize.py?
OK, it seems that using yaml.unsafe_load(sarr) works. See also #3675.
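For reference, the dump/load asymmetry can be sketched with plain PyYAML and numpy, outside of AiiDA. This is not the original reproduction snippet: yaml.safe_load is used here as a stand-in for a restricted loader (an assumption; the loader used in serialize.py may behave differently), and the variable names are illustrative.

```python
import numpy as np
import yaml

arr = np.arange(3)

# Dumping succeeds: PyYAML's default Dumper emits Python-specific tags
# describing how to reconstruct the array.
sarr = yaml.dump(arr)

# A restricted loader refuses to construct objects from those tags.
try:
    yaml.safe_load(sarr)
    safe_ok = True
except yaml.YAMLError:
    safe_ok = False

# The unsafe loader executes the reconstruction and returns an ndarray.
# It can run arbitrary callables, so only use it on trusted input.
restored = yaml.unsafe_load(sarr)
```

This is why the safety discussion in the comments matters: unsafe loading is what makes the array come back, and it is also what would let a tampered checkpoint run code.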