Docs: add the "How to explore the provenance graph" section #4491

Merged
148 changes: 148 additions & 0 deletions docs/source/howto/exploring.rst
@@ -0,0 +1,148 @@
.. _how-to:exploring:

***********************************
How to explore the provenance graph
***********************************

.. _how-to:exploring:incoming-outgoing:

Incoming and outgoing links
===========================

The provenance graph in AiiDA is a :ref:`directed graph <topics:provenance:concepts>`.
The vertices of the graph are the *nodes* and the edges that connect them are called *links*.
Since the graph is directed, any node can have *incoming* and *outgoing* links that connect it to neighboring nodes.

To discover the neighbors of a given node, you can use the methods :meth:`~aiida.orm.nodes.node.Node.get_incoming` and :meth:`~aiida.orm.nodes.node.Node.get_outgoing`.
They have exactly the same interface, but return the neighbors connected to the current node by links coming into it, or by links going out of it, respectively.
For example, for a given ``node``, to inspect all the neighboring nodes from which a link is incoming to the ``node``:

.. code-block:: python

    node.get_incoming()

This will return an instance of the :class:`~aiida.orm.utils.links.LinkManager`.
From that manager you can request the results in a specific format.
If you are only interested in the neighboring nodes themselves, you can call the :meth:`~aiida.orm.utils.links.LinkManager.all_nodes` method:

.. code-block:: python

    node.get_incoming().all_nodes()

This will return a list of :class:`~aiida.orm.nodes.node.Node` instances that correspond to the nodes that are neighbors of ``node``, where the link is going towards ``node``.
Calling the :meth:`~aiida.orm.utils.links.LinkManager.all` method of the manager instead will return a list of :class:`~aiida.orm.utils.links.LinkTriple` named tuples.
These tuples contain, in addition to the neighboring node, also the link label and the link type with which they are connected to the origin ``node``.
For example, to list all the neighbors of a node from which a link is incoming:

.. code-block:: python

    for link_triple in node.get_incoming().all():
        print(link_triple.node, link_triple.link_type, link_triple.link_label)

Note that the :class:`~aiida.orm.utils.links.LinkManager` provides many convenience methods to get information from the neighboring nodes, such as :meth:`~aiida.orm.utils.links.LinkManager.all_link_labels` if you only need the list of link labels.
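
Because each entry returned by ``all()`` is a named tuple, the results are easy to post-process with ordinary Python.
The following standalone sketch does not use AiiDA: ``Triple`` is a hypothetical stand-in that only copies the three field names used in the loop above, and the nodes and link types are replaced by plain strings.

```python
from collections import namedtuple

# Hypothetical stand-in with the same three fields as accessed in the loop above.
Triple = namedtuple('Triple', ['node', 'link_type', 'link_label'])

# Plain strings take the place of real nodes and link types.
triples = [
    Triple(node='structure-node', link_type='input_calc', link_label='structure'),
    Triple(node='parameters-node', link_type='input_calc', link_label='parameters'),
]

# Access the fields by name to build a mapping from link label to node.
nodes_by_label = {triple.link_label: triple.node for triple in triples}

# Collect only the link labels, preserving the original order.
labels = [triple.link_label for triple in triples]
```

Accessing the fields by name, rather than by position, keeps the code independent of the order in which the fields are defined.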

The :meth:`~aiida.orm.nodes.node.Node.get_incoming` and :meth:`~aiida.orm.nodes.node.Node.get_outgoing` methods accept various arguments that allow one to filter what neighboring nodes should be matched:

* ``node_class``: accepts a subclass of :class:`~aiida.orm.nodes.node.Node`, only neighboring nodes with a class that matches this will be returned
* ``link_type``: accepts a value of :class:`~aiida.common.links.LinkType`, only neighboring nodes that are linked with this link type will be returned
* ``link_label_filter``: accepts a string expression (with optional wildcards using the syntax of SQL ``LIKE`` patterns, see below), only neighboring nodes that are linked with a link label that matches the pattern will be returned

As an example:

.. code-block:: python

    node.get_incoming(node_class=Data, link_type=LinkType.INPUT_CALC, link_label_filter='output%node_').all_nodes()

will return only neighboring data nodes that are linked to the ``node`` with a link of type ``LinkType.INPUT_CALC`` and where the link label matches the pattern ``'output%node_'``.
Reminder on the syntax of SQL ``LIKE`` patterns: the ``%`` character matches any string of zero or more characters, while the ``_`` character matches exactly one character.
These two special characters can be escaped by prepending a backslash.
Note that a backslash inside a Python string must itself be escaped, so two backslashes are needed: for example, to match exactly the link label ``a_b``, pass ``link_label_filter='a\\_b'``.
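
To build intuition for these pattern rules, the following standalone sketch re-implements ``LIKE`` matching in plain Python by translating the pattern into a regular expression.
It does not use AiiDA and is not how AiiDA evaluates ``link_label_filter``; the helper name ``like_matches`` is hypothetical and exists only for this illustration.

```python
import re


def like_matches(pattern: str, label: str) -> bool:
    """Return True if ``label`` matches the SQL LIKE ``pattern`` (illustration only)."""
    regex_parts = []
    chars = iter(pattern)
    for char in chars:
        if char == '\\':
            # A backslash escapes the next character, which is then matched literally.
            # A trailing backslash with nothing after it is treated as a literal backslash.
            regex_parts.append(re.escape(next(chars, '\\')))
        elif char == '%':
            regex_parts.append('.*')  # any string of zero or more characters
        elif char == '_':
            regex_parts.append('.')  # exactly one character
        else:
            regex_parts.append(re.escape(char))
    # ``fullmatch`` requires the whole label to match, as LIKE does.
    return re.fullmatch(''.join(regex_parts), label, flags=re.DOTALL) is not None
```

With this helper, ``like_matches('a_b', 'axb')`` is ``True`` because the unescaped ``_`` matches any single character, whereas ``like_matches('a\\_b', 'axb')`` is ``False`` because the escaped underscore only matches a literal ``_``.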


.. _how-to:exploring:inputs-outputs:

Inputs and outputs
==================

The :meth:`~aiida.orm.nodes.node.Node.get_incoming` and :meth:`~aiida.orm.nodes.node.Node.get_outgoing` methods, described in the :ref:`previous section <how-to:exploring:incoming-outgoing>`, can be used to access all neighbors from a certain node and provide advanced filtering options.
However, often one doesn't need this expressivity and simply wants to retrieve all neighboring nodes with a syntax that is as succinct as possible.
A prime example is to retrieve the *inputs* or *outputs* of :ref:`a process <topics:processes:concepts>`.
Instead of using :meth:`~aiida.orm.nodes.node.Node.get_incoming` and :meth:`~aiida.orm.nodes.node.Node.get_outgoing`, to get the inputs and outputs of a ``process_node`` one can do:

.. code-block:: python

    inputs = process_node.inputs
    outputs = process_node.outputs

These properties do not return the actual inputs and outputs directly, but instead return an instance of :class:`~aiida.orm.utils.managers.NodeLinksManager`.
The reason is that, through the manager, the inputs or outputs are accessible by their link label (which, for inputs and outputs of processes, is unique) and can be tab-completed.
For example, if the ``process_node`` has an output with the label ``result``, it can be retrieved as:

.. code-block:: python

    process_node.outputs.result

The inputs or outputs can also be accessed through key dereferencing:

.. code-block:: python

    process_node.outputs['result']

If there is no neighboring output with the given link label, a :class:`~aiida.common.exceptions.NotExistentAttributeError` or :class:`~aiida.common.exceptions.NotExistentKeyError` will be raised, respectively.
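
The attribute access, key access and tab completion described above can be pictured with a small stand-in class.
This is a simplified sketch and not the implementation of ``NodeLinksManager``: the class name ``LabelledLinks`` is hypothetical, the "nodes" are plain values, and it raises the built-in ``AttributeError`` and ``KeyError`` instead of the AiiDA exception classes.

```python
class LabelledLinks:
    """Hypothetical stand-in that exposes a mapping of link label to node."""

    def __init__(self, nodes_by_label):
        self._nodes_by_label = dict(nodes_by_label)

    def __dir__(self):
        # Interactive shells rely on ``dir()`` for tab completion,
        # so returning the link labels here makes them completable.
        return sorted(self._nodes_by_label)

    def __getattr__(self, label):
        # Only invoked when normal attribute lookup fails.
        if label.startswith('_'):
            # Never treat private names as link labels (avoids recursion on ``_nodes_by_label``).
            raise AttributeError(label)
        try:
            return self._nodes_by_label[label]
        except KeyError:
            raise AttributeError(f'no neighbor with link label {label!r}') from None

    def __getitem__(self, label):
        # Raises ``KeyError`` if the link label does not exist.
        return self._nodes_by_label[label]


# A plain integer stands in for the output node.
outputs = LabelledLinks({'result': 42})
```

Here ``outputs.result`` and ``outputs['result']`` return the same object, while a missing label raises a different exception type depending on the access style, mirroring the behavior described above.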

.. note::

    The ``inputs`` and ``outputs`` properties are only defined for :class:`~aiida.orm.nodes.process.process.ProcessNode`'s.
    This means that you cannot *chain* these calls, because an input or output of a process node is guaranteed to be a :class:`~aiida.orm.nodes.data.Data` node, which does not have inputs or outputs.

Member

I would put this comment below, after explaining creator, to say that inputs+creator allows to go up in the provenance graph as much as you want, chaining them. Instead, going down, you cannot for the reason you mention

Contributor Author

Not sure where exactly you intend this to be placed. It is true that chaining creator and inputs alternatively works, but that is just one example and not every node has a creator. Instead, I kept this as is and added another note after the explanation of creator that gives the example you propose.



.. _how-to:exploring:creator-caller-called:

Creator, caller and called
==========================

Similar to the ``inputs`` and ``outputs`` properties of process nodes, there are some more properties that make exploring the provenance graph easier:

* :meth:`~aiida.orm.nodes.process.process.ProcessNode.called`: defined for :class:`~aiida.orm.nodes.process.process.ProcessNode`'s and returns the list of process nodes called by this node.

Member

Not relevant for this PR, but I realise now that we could have defined this only for WorkflowNodes, actually, since the list will always be empty for CalculationNodes

Contributor Author

Yeah, and I think the ProcessNode should also have a inputs and outputs implementation. Now I cannot reference it

  If this process node did not call any other processes, this property returns an empty list.
* :meth:`~aiida.orm.nodes.process.process.ProcessNode.caller`: defined for :class:`~aiida.orm.nodes.process.process.ProcessNode`'s and returns the process node that called this node.
  If this node was not called by a process, this property returns ``None``.
* :meth:`~aiida.orm.nodes.data.Data.creator`: defined for :class:`~aiida.orm.nodes.data.Data` nodes and returns the process node that created it.
  If the node was not created by a process, this property returns ``None``.

.. note::

    Using the ``creator`` and ``inputs`` properties, one can easily move *up* the provenance graph.
    For example, starting from some data node that represents the result of a long workflow, one can move up the provenance graph to find an initial input node of interest: ``result.creator.inputs.some_input.creator.inputs.initial_input``.
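
A long chain like the one in the note fails with an ``AttributeError`` on ``None`` as soon as one of the nodes along the way has no creator.
The sketch below wraps the same traversal in a small helper that reports this case explicitly.
The helper name ``trace_upwards`` is hypothetical, and the example runs on ``SimpleNamespace`` stand-ins rather than real AiiDA nodes; it only assumes the ``creator`` and ``inputs`` properties described in this section.

```python
from types import SimpleNamespace


def trace_upwards(data_node, link_labels):
    """Repeatedly follow ``creator`` and then the input with the next link label."""
    current = data_node
    for label in link_labels:
        process = current.creator
        if process is None:
            # ``creator`` is ``None`` for data that was not created by a process.
            raise ValueError(f'cannot follow input {label!r}: the node has no creator')
        current = getattr(process.inputs, label)
    return current


# Stand-in objects (not AiiDA nodes) that mimic a two-step provenance chain.
initial_input = SimpleNamespace(creator=None)
first_process = SimpleNamespace(inputs=SimpleNamespace(initial_input=initial_input))
intermediate = SimpleNamespace(creator=first_process)
second_process = SimpleNamespace(inputs=SimpleNamespace(some_input=intermediate))
result = SimpleNamespace(creator=second_process)

# Equivalent to ``result.creator.inputs.some_input.creator.inputs.initial_input``.
found = trace_upwards(result, ['some_input', 'initial_input'])
```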

.. _how-to:exploring:calcjob-results:

Calculation job results
=======================

:class:`~aiida.orm.nodes.process.calculation.calcjob.CalcJobNode`'s provide the :meth:`~aiida.orm.nodes.process.calculation.calcjob.CalcJobNode.res` property, which gives easy access to the results of the calculation job.
The requirement is that the :class:`~aiida.engine.processes.calcjobs.calcjob.CalcJob` class that produced the node defines a *default output node* in its spec.
This node should be a :class:`~aiida.orm.nodes.data.dict.Dict` output that will always be created.
An example is the :class:`~aiida.calculations.templatereplacer.TemplatereplacerCalculation` plugin, which specifies the ``output_parameters`` output as its default output node.

The :meth:`~aiida.orm.nodes.process.calculation.calcjob.CalcJobNode.res` property gives direct access to all the keys within this dictionary output.
For example, the following:

.. code-block:: python

    list(node.res)

will return a list of all the keys in the output node.
Individual keys can then be accessed through attribute dereferencing:

.. code-block:: python

    node.res.some_key

In an interactive shell, the available keys are also tab-completed.
If you type ``node.res.`` followed by the tab key twice, a list of the available keys is printed.

Member

I would add that, if the default output node link label is output_parameters, then node.res.some_key is equivalent to node.outputs.output_parameters.dict.some_key and stress that we are looking at the keys of a different node (this, together with the fact that verdi node outputcat/outputls act on a different node, does not seem to be super intuitive at first for new users, in my experience)


.. note::

    The :meth:`~aiida.orm.nodes.process.calculation.calcjob.CalcJobNode.res` property is really just a shortcut to quickly and easily access an attribute of the default output node of a calculation job.
    For example, if the default output node link label is ``output_parameters``, then ``node.res.some_key`` is exactly equivalent to ``node.outputs.output_parameters.dict.some_key``.
    That is to say, when using ``res``, one is accessing attributes of one of the output nodes, and not of the calculation job node itself.
1 change: 1 addition & 0 deletions docs/source/howto/index.rst
@@ -12,6 +12,7 @@ How-To Guides
plugin_codes
workflows
data
exploring
share_data
visualising_graphs/visualising_graphs
installation
1 change: 1 addition & 0 deletions docs/source/redirects.txt
@@ -13,4 +13,5 @@ install/troubleshooting.rst intro/troubleshooting.rst
verdi/verdi_user_guide.rst topics/cli.rst
working_with_aiida/index.rst howto/index.rst
working_with_aiida/python_api.rst reference/api/public.rst
working_with_aiida/resultmanager.rst howto/exploring.rst
working_with_aiida/scripting.rst howto/interact.rst
109 changes: 0 additions & 109 deletions docs/source/working_with_aiida/resultmanager.rst

This file was deleted.