Fix bug in `verdi process show` where not all called processes are shown #2962

sphuber · 2019-06-04T13:45:33Z

The utility get_node_info was called by verdi process show to format
a multiline string of the given process node, including all the inputs
and called processes. The function used the LinkManager.nested to
reconstruct a nested representation of the relevant set of neighboring
nodes. This is useful for the inputs and outputs that can in fact be
nested and have unique labels, however, for called links, the link
labels are not necessarily unique and so the dictionary returned by the
nested method could only represent a single called process with the same
link label. To fix this we use a different formatter that simply takes
the list of neighboring nodes, which then does not care about label
uniqueness.

Since the nested method was silently overwriting duplicate keys, which
caused this bug, its behavior is adapted to raise a KeyError in this
situation.

coveralls · 2019-06-04T14:09:03Z

Coverage increased (+0.007%) to 72.914% when pulling d1fa180 on sphuber:fix_2868_verdi_process_show_duplicate_links into 34188d8 on aiidateam:develop.

ltalirz · 2019-06-04T15:05:22Z

To fix this, we have to generate unique labels if duplicates exist.

If link names are not unique, would this not rather indicate that a list should be returned instead of a dictionary? What if someone calls a link CALL_123 or FILE_1?

Of course, this depends a bit where this is intended to be used.

Concerning the method 1-line description:

Return the matched nodes in the original nested form by expanding the collapsed link labels.

This is assuming a lot of knowledge about the context (original form, collapsed link labels, ...). Could this be rewritten in a more intuitive form (perhaps with an additional sentence in the description?)

sphuber · 2019-06-04T15:29:20Z

If link names are not unique, would this not rather indicate that a list should be returned instead of a dictionary? What if someone calls a link CALL_123?

This would be fine, even if there is another link with CALL_123 then the second one will simply be changed to CALL_123 <pk>. This is not a new "problem" of AiiDA. For example when doing node.outputs. and hitting tab, where node is a StructureData that has been used many times as input with the label structure. This happens all the time, but since all the labels are identical, you cannot address any specific one unambiguously. This is why in places where this occurs, the old solution was to suffix the label with a pk.

Here we have a similar situation. The LinkManager is what is returned by calling Node.get_incoming or Node.get_outgoing and provides utility functions to return subsets of the collection of neighboring nodes matched by the initial method. The nested one returns the collection in the form of a nested dictionary, where imploded link labels that represent nested namespaces, are exploded. If you pass an input for example in {'namespace': {'test': Node}}, its link label becomes namespace__test. This exploded dictionary is useful in places where I need that nested structure again instead of the flat link labels. Currently there is two places

To populate the restart process builder from a completed ProcessNode.
To display a nested representation of node links in verdi process show, e.g.:

Property       Value
-------------  ------------------------------------
type           CalcJobNode
pk             294746
uuid           2ad0110f-08ac-4ba5-85e1-0e4fc75839c3
label
description
ctime          2019-06-04 13:05:16.614230+02:00
mtime          2019-06-04 13:35:38.218381+02:00
process state  Finished
exit status    0
computer       [2] daint-gpu

Called by        PK  Type
-----------  ------  -------------
CALL_CALC    294740  WorkChainNode

Inputs      PK      Type
----------  ------  -------------
pseudos
    Nb      58      UpfData
    K       45      UpfData
    Cl      87      UpfData
code        293421  Code
kpoints     294743  KpointsData
parameters  294744  Dict
settings    294745  Dict
structure   305     StructureData

Note that the pseudos are nested, because it is a namespace.

This is assuming a lot of knowledge (original form, collapsed link labels, ...).

Yes, but I don't think I can explain the data structure and design of AiiDA in every docstring.
In this case I would then have to explain port namespaces, how they are mapped onto flat links etc.
At some point one needs to know a bit about the internal design to understand certain methods.
But maybe you have a suggestion how this can be made clear in relatively little text.

ltalirz · 2019-06-04T16:33:07Z

And here a suggestion for the docstring:

"""Construct dictionary of nodes that mirrors the nesting of link namespaces.

While the actual link labels flatten the nested structure of input/output namespaces, this function returns a dictionary of nodes that mirrors the nesting of link namespaces.
This is useful e.g. to group links together that belong to the same namespace.

The utility `get_node_info` was called by `verdi process show` to format a multiline string of the given process node, including all the inputs and called processes. The function used the `LinkManager.nested` to reconstruct a nested representation of the relevant set of neighboring nodes. This is useful for the inputs and outputs that can in fact be nested and have unique labels, however, for called links, the link labels are not necessarily unique and so the dictionary returned by the nested method could only represent a single called process with the same link label. To fix this we use a different formatter that simply takes the list of neighboring nodes, which then does not care about label uniqueness. Since the `nested` method was silently overwriting duplicate keys, which caused this bug, its behavior is adapted to raise a `KeyError` in this situation.

sphuber · 2019-06-04T17:09:34Z

@ltalirz OK I have updated the solution as discussed. Instead of adapting nested it now actually raises a KeyError in case of a conflict. I have also updated the docstring based on your suggestion but reworded. The get_node_info now uses a separate approach for the called processes formatting

ltalirz

thanks @sphuber
looks good!

sphuber · 2019-06-04T18:05:55Z

Thanks @ltalirz for the useful discussion, the eventual solution is far better than the original one

sphuber requested review from ltalirz and giovannipizzi June 4, 2019 13:45

sphuber force-pushed the fix_2868_verdi_process_show_duplicate_links branch from b156501 to 5ba688a Compare June 4, 2019 17:06

sphuber changed the title ~~Generate ad-hoc unique labels in LinkManager.nested~~ Fix bug in verdi process show where not all called processes are shown Jun 4, 2019

ltalirz approved these changes Jun 4, 2019

View reviewed changes

Merge branch 'develop' into fix_2868_verdi_process_show_duplicate_links

d1fa180

sphuber merged commit 5ff132e into aiidateam:develop Jun 4, 2019

sphuber deleted the fix_2868_verdi_process_show_duplicate_links branch June 4, 2019 18:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix bug in `verdi process show` where not all called processes are shown #2962

Fix bug in `verdi process show` where not all called processes are shown #2962

sphuber commented Jun 4, 2019 •

edited

Loading

coveralls commented Jun 4, 2019 •

edited

Loading

ltalirz commented Jun 4, 2019 •

edited

Loading

sphuber commented Jun 4, 2019

ltalirz commented Jun 4, 2019

sphuber commented Jun 4, 2019

ltalirz left a comment

sphuber commented Jun 4, 2019

Fix bug in verdi process show where not all called processes are shown #2962

Fix bug in verdi process show where not all called processes are shown #2962

Conversation

sphuber commented Jun 4, 2019 • edited Loading

coveralls commented Jun 4, 2019 • edited Loading

ltalirz commented Jun 4, 2019 • edited Loading

sphuber commented Jun 4, 2019

ltalirz commented Jun 4, 2019

sphuber commented Jun 4, 2019

ltalirz left a comment

Choose a reason for hiding this comment

sphuber commented Jun 4, 2019

Fix bug in `verdi process show` where not all called processes are shown #2962

Fix bug in `verdi process show` where not all called processes are shown #2962

sphuber commented Jun 4, 2019 •

edited

Loading

coveralls commented Jun 4, 2019 •

edited

Loading

ltalirz commented Jun 4, 2019 •

edited

Loading