-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error when loading checkpoint #3709
Comments
This might have to do with the recent update to another version of |
Running
which I think is correct for the latest The processes were created after the update.. and I think I restarted everything that holds python state (daemon, jupyter kernels), but can't be 100% sure now. I think I'll just leave this issue open for a while in case it happens again or someone else has the same issue. |
Happened again, now I'm fairly certain there weren't any more kernels / daemons alive with the old code version. |
can you dump the response of |
Not sure if I should post the whole thing, but this is the offending part:
|
This is one of my workchains, but it had also happened on a pw relax workchain before. |
Ok, so it seems that is simply has trouble deserializing the |
As far as standard python data structures go: I just tried dumping / loading a
|
Hmmm, I created a unit test to try and hit this problem but for me it is working. Can you check out this branch and try to run |
Might that not just be because you are doing it in a shell or top-level script, causing the module path to be |
Ah, true. |
Yup, test fails for me.. |
Wait, no... I get the same behavior when using a bare SimpleNamespace:
|
Python version is 3.6.8, in case that might matter. |
Can you try to do |
I guess it's this? yaml/pyyaml#364 |
Yeah, that fixes it. Not sure how I ended up with |
That is probably because currently in |
Hmm, when looking at the files on PyPI, |
But yeah, I don't think for loading checkpoints it's really needed to do safe loading. Maybe we can strip checkpoints when importing (if we don't already do that) to completely avoid the "sharing malicious code" vector. |
We don't yet, but I think this would be indeed an option. In the export file it is just shared as a key in the JSON serialized object. The yaml serialized checkpoint is just a string at that point. When importing the nodes, we can just simply omit all |
I've just come across this issue again - @sphuber I have a few questions on how to fix this:
|
I think it is only used in the engine: specifically it is used by the |
Ok, thanks!
If we're concerned about malicious export files, the check on import is definitely necessary (and more important than the export change), because the exporter could always change the code on his side. |
very good point 👍 |
Yeah, just realized I still have that as a hotfix in my installation (because I don't need to worry about the malicious checkpoints issue). @ltalirz would be great if you could make a PR, I think the two things that need doing are:
|
Hi @greschd , I haven't followed the thread here and I haven't touched that part of the codebase before - my suggestion was just to upgrade the pyyaml version to 5.2. I'm happy to make the changes but would need some help - so, should I
|
Simply updating
Yes. If we remove checkpoints on import the explanation about malicious process instances isn't needed. I guess it'd need double-checking that this is the only place where
Haven't touched this either.. maybe @sphuber give some pointers? EDIT: Well, these were my guesses as to where import stripping should happen - no guarantees that it actually is correct 😄
The above should fix the
So I guess we should also go chasing after that. |
A checkpoint is just an attribute stored on the node, with the key |
Hi there, |
@mikibonacci I've used the patch below as a workaround. ⚠WARNING:⚠ The code is unsafe in the sense that when importing an archive from an untrusted source, it can lead to arbitrary code execution. The reason is that the archive can contain "checkpoints" crafted to execute any code. diff --git a/aiida/orm/utils/serialize.py b/aiida/orm/utils/serialize.py
index 23aa68348..70fc09070 100644
--- a/aiida/orm/utils/serialize.py
+++ b/aiida/orm/utils/serialize.py
@@ -176,7 +176,7 @@ class AiiDADumper(yaml.Dumper):
return super().represent_data(data)
-class AiiDALoader(yaml.FullLoader):
+class AiiDALoader(yaml.UnsafeLoader):
"""AiiDA specific yaml loader
.. note:: we subclass the `FullLoader` which is the one that since `pyyaml>=5.1` is the loader that prevents @ltalirz is there any progress on this? |
Sorry @greschd I didn't have time to look into this - please feel free to if you manage. |
Hm... that is incorrect, sorry https://github.com/aiidateam/plumpy/blob/develop/setup.py#L33 I think I saw it somewhere, but obviously not in the setup.py (also conda-forge feedstock is (correctly) locked to ~=5.1.2. Edit: Ok, I saw it in kiwipy |
I don't think the resolution of this is blocked by being able to use the new |
Yeah, unfortunately this isn't super high on my to-do list; but I think it's actually a fairly important issue because end users sometimes bump up against that. |
Not sure about the version compatibilities, but note that this issue is not mainly about the version (i.e., loosening the constraint wouldn't resolve this issue). The main tasks are:
|
Just for the record, this issue has increased in priority, because versions prior to 5.3.1 are currently vulnerable to arbitrary code execution when processing untrusted input values, see: GHSA-6757-jp84-gxfx |
#4730 can in principle be merged. |
To allow e.g. numpy arrays to be serialized to a process checkpoint, the `AiiDALoader` is based on `yaml.UnsafeLoader` instead of `yaml.FullLoader`. Since this could pose a security risk when sharing databases with maliciously crafted checkpoints, the checkpoints are removed upon importing an archive. Fixes aiidateam#3709.
After moving from
1.0.0
to the latestdevelop
, I'm getting errors where workchains except because their checkpoint could not be loaded (see full traceback below).These are new processes, i.e. they did not exist before the code update.
At least once, the issue occurred right after the daemon was restarted. This might just be the trigger for actually having to dump / load the checkpoint, though.
I have tried isolating a minimal failing example, but without luck so far. Will update once I've got an example.
@sphuber, could this be related to the switch to safe YAML loading?
The text was updated successfully, but these errors were encountered: