-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ProcessBuilder
: fix nested namespaces stay ProcessBuilderNamespace
#4983
ProcessBuilder
: fix nested namespaces stay ProcessBuilderNamespace
#4983
Conversation
@giovannipizzi I think this should fix the problem of the original issue and also indirectly fixes the problem you were experiencing. Would be good if you could verify this for your explicit test case. |
Codecov Report
@@ Coverage Diff @@
## develop #4983 +/- ##
===========================================
+ Coverage 80.12% 80.12% +0.01%
===========================================
Files 515 515
Lines 36676 36692 +16
===========================================
+ Hits 29382 29397 +15
- Misses 7294 7295 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Thanks @sphuber for the quick fix! I get this:
I can't debug more now (it's late :-) ) but it seems it's trying to "remove" the code and so validation fails? |
09fe198
to
5ca57ed
Compare
Apologies @giovannipizzi . It should be fixed now. I added a test that replicates your problem and tested the exact same example and I think it does the correct thing now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This indeed works and seems ok!
Only one comment (on the commit message, and on the docstring of _update, where it says that the method functions just as collections.abc.MutableMapping.update
).
This is not true. Taking the example from your test, if you run this in python:
a = {'nested': {'namespace': {'int': 1, 'float': 2.0}}}
a.update({'nested': {'namespace': {'int': 5}}})
now a
will be {'nested': {'namespace': {'int': 5}}}
, and not {'nested': {'namespace': {'int': 5, 'float': 2.0}}}
(what our _update
does).
That is, python will not "merge" recursively but update only at the first nesting level.
I think what python does is OK, and on the other hand our merging is OK because this is what we want to do in most of the cases (or at least most of the cases I've encountered).
Maybe a better name would have been _merge
but I wouldn't change this now (probably).
Rather, I would correct the docstring to say "this is NOT like the update
method of a python dictionary" and explain why with an example (e.g. the one above).
I'm approving anyway for now, in case you want to merge this anyway and move this task to a different issue, but maybe it's better if this is done here as well?
I will fix it here before merging. No need to reapprove since I am just fixing docstring. |
Thinking about it some more though, what you originally reported as a bug then was actually correct behavior. You called update with a nested dictionary and it behaved as |
I see, and I'm ok - the only question is: how do we express this when specifying a namespace? What I really need to do in the common-workflows EOS is this: inputs = {
'structure': structure,
#'scale_factors': orm.List(list=[0.96, 0.98, 1, 1.02, 1.04]),
'generator_inputs': { # code-agnostic inputs for the relaxation
'engines': engines,
'protocol': 'precise',
'relax_type': RelaxType.NONE,
'electronic_type': ElectronicType.METAL,
'spin_type': SpinType.NONE,
},
'sub_process_class': sub_process_cls_name,
'sub_process' : { # optional code-dependent overrides
'base': {
'pw': {
'settings' : orm.Dict(dict= {
'cmdline': ['-nk', '18'],
})
}
}
}
} and I really want to only replace |
It is the generic |
37bd083
to
e5bd506
Compare
@giovannipizzi since we are actually changing the code after first review, I pushed without temporarily unchecking the setting to not invalidate your review. Would be good if you can give a quick look to see if it is still ok. Since the reasons for the fixes are clearer now, I have also split them into two separate commits. One is really fixing a bug and the second is just adding the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, even if I'm wondering if it's a good idea to keep _recursive_merge as a method of the class - since we're using private methods _merge
and _update
that should actually be used by the user, I think finding _recursive_merge
will be quite confusing (actually it might be more intuitive to think the actual method is _recursive_merge
) on which one to use. I'd prefer to move it as a module function so it's not visibile in the class namespace.
If you don't agree just merge, otherwise can you please move out?
One comment on these private methods, that I just realised. Since we just don't want to occupy the namespace, maybe the better way to do it (while at the same time avoiding to use private methods) is to append and underscore, rather than prepending? So This would avoid asking users to use private methods, and should be OK because (IIRC) also appended underscores are not valid for port names. What do you think? If you agree, maybe we move to an issue? (This will have to be introduced with a deprecation pattern - or even without deprecating, we can keep the private method, and just define the new |
One final thing, more related to this PR - when would be the right moment to "release" this? We could do 1.6.4 but it's a new feature. And I think we don't want to have a 1.7? So we need to wait, I guess? |
Hi @sphuber - beside the update_ vs _update, I was thinking even more about this. I guess having the update and merge functionality is good anyway, but we might want to have more flexibility for the common workflows. Just mentioning it here (and maybe we try to merge this in the meantime), but we can follow up in the corresponding issue on the common workflows repo. |
Yeh just to add my two-cents, I really think the whole Anyway, its very unlikely we would reverse course now, but we should definitely be wary about implementing any more in the future. |
Other than my word of warning though, yeh cheers I think the fix is probably as good as you can get |
Quickly responding to @giovannipizzi comments:
|
309b98a
to
1f70265
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks!
(Note that there is no test for the keyword use of _merge
, as Codecov points out - if you have time you could copy-paste the test - in this case feel free to just consider me approving that test addition!
1f70265
to
dc734d9
Compare
…ce` class When a `ProcessBuilder` is constructed from a `PortNamespace` each `PortNamespace` is converted into a `ProcessBuilderNamespace` including the nested `PortNamespace`s. This guarantees that even nested port namespaces have the same autocompletion and automatic validation when assigning values as the top level namespace. There was a bug however that when assigning a dictionary to a namespace of the builder, that namespace would lose its `ProcessBuilderNamespace` type and become a plain dictionary. Subsequent assignments to ports within this namespace would then no longer be validated. The source of the bug was in the `__setattr__` of `ProcessBuilderNamespace` which would simply assign the value that is being passed to the internal `_data` container, losing the `ProcessBuilderNamespace` class if the value is a plain dictionary. The solution is to, in the case where an attribute that is being set corresponds to a `PortNamespace`, to first assign a new `ProcessBuilderNamespace` in the internal `_data` container for that given attribute, and then iterate over the contents of the value dictionary being assigned and call `setattr` for that on the new `ProcessBuilderNamespace`. By calling `setattr` again, the process described above is applied recursively, as it should.
The class already had the `_update` method, but like the normal `update` method of a mapping, this will not recursively merge the contents of nested dictionaries with any existing values. Anything below the top level will simply get overwritten. However, often one wants to merge the contents of the dictionary with the existing namespace structure only overriding leaf nodes. To this end the `_merge` method is added which recursively merges the content of the new dictionary with the existing content.
dc734d9
to
ef310f1
Compare
Fixes #4630
When a
ProcessBuilder
is constructed from aPortNamespace
eachPortNamespace
is converted into aProcessBuilderNamespace
includingthe nested
PortNamespace
s. This guarantees that even nested portnamespaces have the same autocompletion and automatic validation when
assigning values as the top level namespace.
There was a bug however that when assigning a dictionary to a namespace
of the builder, that namespace would lose its
ProcessBuilderNamespace
type and become a plain dictionary. Subsequent assignments to ports
within this namespace would then no longer be validated.
The source of the bug was in the
__setattr__
ofProcessBuilderNamespace
which would simply assign the value that is being passed to the internal
_data
container, losing theProcessBuilderNamespace
class if thevalue is a plain dictionary. The solution is to, in the case where an
attribute that is being set corresponds to a
PortNamespace
, to firstassign a new
ProcessBuilderNamespace
in the internal_data
containerfor that given attribute, and then iterate over the contents of the
value dictionary being assigned and call
setattr
for that on the newProcessBuilderNamespace
. By callingsetattr
again, the processdescribed above is applied recursively, as it should.
This fix also solves another bug that was a direct result of the problem
described above. When passing a dictionary to
_update
, any existingcontent would be fully overwritten instead of merging the existing and
new dictionary, which is what the behavior of update should be, just as
for the normal
update
method of plain Python dictionaries.