📚 DOCS: Add storage architecture section (#5424)
chrisjsewell authored Mar 9, 2022
1 parent 7f18781 commit ffedc8b
Showing 33 changed files with 384 additions and 212 deletions.
2 changes: 1 addition & 1 deletion aiida/storage/psql_dos/models/authinfo.py
@@ -19,7 +19,7 @@


class DbAuthInfo(Base):
"""Database model to keep computer authentication data, per user.
"""Database model to store data for :py:class:`aiida.orm.AuthInfo`, and keep computer authentication data, per user.
Specifications are user-specific and describe how to submit jobs on the computer.
The model also has an ``enabled`` logical switch that indicates whether the device is available for use or not.
5 changes: 4 additions & 1 deletion aiida/storage/psql_dos/models/comment.py
@@ -20,7 +20,10 @@


class DbComment(Base):
"""Database model to store comments, relating to a node."""
"""Database model to store data for :py:class:`aiida.orm.Comment`.
Comments can be attached to nodes by users.
"""

__tablename__ = 'db_dbcomment'

5 changes: 4 additions & 1 deletion aiida/storage/psql_dos/models/computer.py
@@ -19,7 +19,10 @@


class DbComputer(Base):
"""Database model to store computers.
"""Database model to store data for :py:class:`aiida.orm.Computer`.
Computers represent (and contain the information of) the physical hardware resources available.
Nodes can be associated with computers if they are remote codes, remote folders, or processes that had run remotely.
Computers are identified within AiiDA by their ``label`` (and thus it must be unique for each one in the database),
whereas the ``hostname`` is the label that identifies the computer within the network from which one can access it.
4 changes: 3 additions & 1 deletion aiida/storage/psql_dos/models/group.py
@@ -39,7 +39,9 @@ class DbGroupNode(Base):


class DbGroup(Base):
"""Database model to store groups of nodes.
"""Database model to store :py:class:`aiida.orm.Group` data.
A group may contain many different nodes, but also each node can be included in different groups.
Users will typically identify and handle groups by using their ``label``
(which, unlike the ``labels`` in other models, must be unique).
2 changes: 1 addition & 1 deletion aiida/storage/psql_dos/models/log.py
@@ -21,7 +21,7 @@


class DbLog(Base):
"""Database model to store log levels and messages relating to a process node."""
"""Database model to store data for :py:class:`aiida.orm.Log`, corresponding to :py:class:`aiida.orm.ProcessNode`."""
__tablename__ = 'db_dblog'

id = Column(Integer, primary_key=True) # pylint: disable=invalid-name
4 changes: 2 additions & 2 deletions aiida/storage/psql_dos/models/node.py
@@ -21,7 +21,7 @@


class DbNode(Base):
"""Database model to store nodes.
"""Database model to store data for :py:class:`aiida.orm.Node`.
Each node can be categorized according to its ``node_type``,
which indicates what kind of data or process node it is.
@@ -170,7 +170,7 @@ def __str__(self):


class DbLink(Base):
"""Database model to store links between nodes.
"""Database model to store links between :py:class:`aiida.orm.Node` instances.
Each entry in this table contains not only the ``id`` information of the two nodes that are linked,
but also some extra properties of the link itself.
4 changes: 3 additions & 1 deletion aiida/storage/psql_dos/models/user.py
@@ -18,7 +18,9 @@


class DbUser(Base):
"""Database model to store users.
"""Database model to store data for :py:class:`aiida.orm.User`.
Every node that is created has a single user as its author.
The user information consists of the most basic personal contact details.
"""
4 changes: 2 additions & 2 deletions aiida/storage/sqlite_zip/backend.py
@@ -288,8 +288,8 @@ def get_info(self, detailed: bool = False, **kwargs) -> dict:
class ZipfileBackendRepository(_RoBackendRepository):
"""A read-only backend for a zip file.
The zip file should contain repository files with the key format: ``<folder>/<sha256 hash>``,
i.e. files named by the sha256 hash of the file contents, inside a ``<folder>`` directory.
The zip file should contain repository files with the key format: ``repo/<sha256 hash>``,
i.e. files named by the sha256 hash of the file contents, inside a ``repo`` directory.
"""

def __init__(self, path: str | Path):
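
The ``repo/<sha256 hash>`` key format described in the docstring above can be illustrated with a minimal sketch that uses only the standard library. The helper names ``add_object`` and ``read_object`` are hypothetical and are not part of ``ZipfileBackendRepository``; the sketch only shows how content-addressed keys of this shape could be written to and read from a zip file.

```python
import hashlib
import io
import zipfile


def add_object(zf: zipfile.ZipFile, content: bytes, folder: str = "repo") -> str:
    """Write ``content`` under ``<folder>/<sha256 hash>`` and return the hash (hypothetical helper)."""
    key = hashlib.sha256(content).hexdigest()
    zf.writestr(f"{folder}/{key}", content)
    return key


def read_object(zf: zipfile.ZipFile, key: str, folder: str = "repo") -> bytes:
    """Read back the bytes stored under ``<folder>/<key>`` (hypothetical helper)."""
    return zf.read(f"{folder}/{key}")


# Build a small zip file in memory, then reopen it read-only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    key = add_object(zf, b"hello")

with zipfile.ZipFile(buffer, "r") as zf:
    names = zf.namelist()
    content = read_object(zf, key)
```

Because the key is derived from the file contents, identical files map to the same entry name, so a reader only needs the hash to locate an object.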
1 change: 1 addition & 0 deletions aiida/storage/sqlite_zip/models.py
@@ -100,6 +100,7 @@ def create_orm_cls(klass: base.Base) -> SqliteBase:
klass.__name__,
(SqliteBase,),
{
'__doc__': klass.__doc__,
'__tablename__': tbl.name,
'__table__': tbl,
**{col.name if col.name != 'metadata' else '_metadata': col for col in tbl.columns},
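
The reason for passing ``__doc__`` explicitly in the hunk above can be seen in a small standalone sketch. A class created with the three-argument form of ``type()`` does not take over the docstring of the class it mirrors, nor does it inherit one from its base, unless ``__doc__`` is placed in the namespace dictionary. All class and function names below are hypothetical stand-ins and do not use SQLAlchemy.

```python
class OriginalBase:
    """Hypothetical stand-in for the original declarative base."""


class OtherBase:
    """Hypothetical stand-in for the second declarative base."""


class DbUserLike(OriginalBase):
    """Database model to store data for a user."""

    __tablename__ = "db_dbuser"


def clone_class(klass: type, new_base: type) -> type:
    """Create a class with the same name, table name and docstring on another base."""
    return type(
        klass.__name__,
        (new_base,),
        {
            "__doc__": klass.__doc__,  # without this entry the clone has no docstring
            "__tablename__": klass.__tablename__,
        },
    )


Clone = clone_class(DbUserLike, OtherBase)

# For comparison: the same call without ``__doc__`` in the namespace.
NoDoc = type("DbUserLike", (OtherBase,), {"__tablename__": "db_dbuser"})
```

Here ``Clone.__doc__`` matches the original docstring, whereas ``NoDoc.__doc__`` is ``None``, which would leave generated API documentation empty.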
4 changes: 4 additions & 0 deletions docs/source/conf.py
@@ -260,9 +260,13 @@ def setup(app: Sphinx):
'aiida.engine.Process': 'aiida.engine.processes.process.Process',
'aiida.engine.WorkChain': 'aiida.engine.processes.workchains.workchain.WorkChain',
'aiida.engine.WorkChainSpec': 'aiida.engine.processes.workchains.workchain.WorkChainSpec',
'aiida.orm.QueryBuilder': 'aiida.orm.querybuilder.QueryBuilder',
'aiida.orm.ArrayData': 'aiida.orm.nodes.data.array.array.ArrayData',
'aiida.orm.AuthInfo': 'aiida.orm.authinfos.AuthInfo',
'aiida.orm.Computer': 'aiida.orm.computers.Computer',
'aiida.orm.Comment': 'aiida.orm.comments.Comment',
'aiida.orm.Group': 'aiida.orm.groups.Group',
'aiida.orm.Log': 'aiida.orm.logs.Log',
'aiida.orm.Node': 'aiida.orm.nodes.node.Node',
'aiida.orm.User': 'aiida.orm.users.User',
'aiida.orm.CalculationNode': 'aiida.orm.nodes.process.calculation.calculation.CalculationNode',
138 changes: 0 additions & 138 deletions docs/source/internals/database.rst

This file was deleted.

6 changes: 1 addition & 5 deletions docs/source/internals/index.rst
@@ -3,16 +3,12 @@ Internal architecture
=====================

.. toctree::
:maxdepth: 1

database
repository
archive_format
storage/index
plugin_system
engine
rest_api

.. todo::

global_design
orm
32 changes: 0 additions & 32 deletions docs/source/internals/orm.rst

This file was deleted.

72 changes: 72 additions & 0 deletions docs/source/internals/storage/architecture.rst
@@ -0,0 +1,72 @@
.. _internal_architecture:storage:architecture:

General architecture
====================

The storage of data is an important aspect of the AiiDA system.
The design for this subsystem is illustrated below.

.. figure:: static/storage-uml.svg
:width: 80%
:align: center

UML diagram of the storage architecture.

Blue indicates frontend classes, red indicates backend classes, and green indicates singletons.

Separate data is stored per ``Profile``, forming a single provenance graph.
A :py:class:`~aiida.manage.configuration.profile.Profile` instance represents a dictionary that includes the configuration details for accessing the storage for that profile, such as a database URI, etc.
Multiple ``Profile`` instances can be stored in a :py:class:`~aiida.manage.configuration.config.Config` instance, which is stored in the configuration file (``config.json``).

Within a single Python process, a single :py:class:`~aiida.manage.manager.Manager` instance can be loaded, to manage access to a globally loaded ``Profile`` and its :py:class:`~aiida.orm.implementation.storage_backend.StorageBackend` instance.
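
The relationship between the configuration, the loaded profile and its storage can be sketched roughly as follows. This is a simplified illustration and not the actual ``Manager`` implementation: the classes ``SketchManager`` and ``SketchStorage``, the module-level ``get_manager`` function and the configuration keys are all assumptions made for this example.

```python
import json

# Illustrative configuration contents; the key names are assumptions for this sketch.
CONFIG_JSON = json.dumps({
    "default_profile": "main",
    "profiles": {
        "main": {"storage": {"backend": "psql_dos"}},
        "archive": {"storage": {"backend": "sqlite_zip"}},
    },
})


class SketchStorage:
    """Hypothetical stand-in for a storage backend instance."""

    def __init__(self, backend_name: str):
        self.backend_name = backend_name


class SketchManager:
    """Hypothetical manager: holds one loaded profile and its storage."""

    def __init__(self, config: dict):
        self._config = config
        self._profile = None
        self._storage = None

    def load_profile(self, name=None) -> dict:
        """Load the named profile, or the default one if no name is given."""
        name = name or self._config["default_profile"]
        self._profile = {"name": name, **self._config["profiles"][name]}
        self._storage = None  # a newly loaded profile needs a fresh storage instance
        return self._profile

    def get_profile_storage(self) -> SketchStorage:
        """Create the storage for the loaded profile on first use, then reuse it."""
        if self._profile is None:
            raise RuntimeError("no profile loaded")
        if self._storage is None:
            self._storage = SketchStorage(self._profile["storage"]["backend"])
        return self._storage


_MANAGER = None


def get_manager() -> SketchManager:
    """Return the single manager of this Python process, creating it on first use."""
    global _MANAGER
    if _MANAGER is None:
        _MANAGER = SketchManager(json.loads(CONFIG_JSON))
    return _MANAGER


manager = get_manager()
profile = manager.load_profile()
storage = manager.get_profile_storage()
```

Repeated calls to ``get_manager()`` return the same object, and the storage instance is only created once per loaded profile.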

The storage API subsystem is based on an Object Relational Mapper (ORM) and is divided into two main parts: the frontend and the backend.
The frontend is responsible for the user interface, and is agnostic of any particular storage technologies,
and the backend is responsible for implementing interfaces with specific technologies (such as SQL databases).
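
The frontend/backend split can be sketched as a frontend class that only talks to an abstract backend interface. The names ``BackendUserSketch``, ``InMemoryBackendUser`` and ``UserSketch`` are hypothetical and far simpler than the real classes; the in-memory dictionary merely stands in for a specific storage technology.

```python
from abc import ABC, abstractmethod


class BackendUserSketch(ABC):
    """Hypothetical backend interface that every storage technology must implement."""

    @property
    @abstractmethod
    def email(self) -> str:
        """Return the email of the user."""

    @abstractmethod
    def store(self) -> None:
        """Persist the user in the underlying storage."""


class InMemoryBackendUser(BackendUserSketch):
    """Hypothetical backend implementation that stores users in a dictionary."""

    def __init__(self, table: dict, email: str):
        self._table = table
        self._email = email
        self.pk = None

    @property
    def email(self) -> str:
        return self._email

    def store(self) -> None:
        self.pk = len(self._table) + 1
        self._table[self.pk] = self._email


class UserSketch:
    """Hypothetical frontend class: it knows the interface, not the storage technology."""

    def __init__(self, backend_entity: BackendUserSketch):
        self._backend_entity = backend_entity

    @property
    def email(self) -> str:
        return self._backend_entity.email

    def store(self) -> "UserSketch":
        self._backend_entity.store()
        return self


table = {}
user = UserSketch(InMemoryBackendUser(table, "alice@example.com")).store()
```

Swapping ``InMemoryBackendUser`` for another implementation of the same interface would not require any change to ``UserSketch``, which is the point of the separation.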

.. _internal_architecture:storage:architecture:frontend:

Frontend ORM
------------

The frontend ORM comprises a number of :py:class:`~aiida.orm.entities.Collection` and :py:class:`~aiida.orm.entities.Entity` subclasses, each representing access to a single ORM type.

:py:class:`~aiida.orm.User`
Represents the author of a particular entity.
:py:class:`~aiida.orm.Node`
Represents a node in a provenance graph, containing data for a particular process (:py:class:`~aiida.orm.ProcessNode`) or process input/output (:py:class:`~aiida.orm.Data`).
Nodes are connected by links that form an acyclic graph.
Nodes also have a :py:class:`~aiida.repository.repository.Repository` instance, which is used to store binary data of the node (see also :ref:`internal-architecture:repository`).
:py:class:`~aiida.orm.Comment`
Represents a comment on a node, by a particular user.
:py:class:`~aiida.orm.Log`
Represents a log message on a :py:class:`~aiida.orm.ProcessNode`, by a particular user.
:py:class:`~aiida.orm.Group`
Represents a group of nodes.
A single node can be part of multiple groups, and a group can contain multiple nodes (i.e. a many-to-many relationship).
:py:class:`~aiida.orm.Computer`
Represents a compute resource on which a process is executed.
A single computer can be attached to multiple :py:class:`~aiida.orm.ProcessNode` (i.e. a one-to-many relationship).
:py:class:`~aiida.orm.AuthInfo`
Represents authentication information for a particular computer and user.
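
Since a node can belong to several groups and a group can contain several nodes, such a relationship is typically stored through a separate link table. The sketch below illustrates this with the standard-library ``sqlite3`` module; the table and column names are invented for the example and do not reproduce the actual database schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE node (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE grp (id INTEGER PRIMARY KEY, label TEXT UNIQUE NOT NULL);
    -- Link table: one row per (group, node) membership.
    CREATE TABLE grp_node (
        grp_id INTEGER REFERENCES grp(id),
        node_id INTEGER REFERENCES node(id),
        UNIQUE (grp_id, node_id)
    );
    INSERT INTO node (id, label) VALUES (1, 'structure'), (2, 'calculation');
    INSERT INTO grp (id, label) VALUES (1, 'project-a'), (2, 'project-b');
    INSERT INTO grp_node (grp_id, node_id) VALUES (1, 1), (1, 2), (2, 1);
    """
)

# All groups that contain node 1.
groups_of_node_1 = [
    row[0]
    for row in conn.execute(
        "SELECT grp.label FROM grp JOIN grp_node ON grp.id = grp_node.grp_id "
        "WHERE grp_node.node_id = 1 ORDER BY grp.label"
    )
]

# All nodes contained in group 1.
nodes_in_group_1 = [
    row[0]
    for row in conn.execute(
        "SELECT node.label FROM node JOIN grp_node ON node.id = grp_node.node_id "
        "WHERE grp_node.grp_id = 1 ORDER BY node.label"
    )
]
conn.close()
```

The ``UNIQUE`` constraint on the group label mirrors the statement in the ``DbGroup`` docstring that group labels must be unique.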

The :py:class:`~aiida.orm.QueryBuilder` allows for querying of specific entities and their associated data.

Backend Implementations
-----------------------

Backend implementations must implement the classes outlined in :py:mod:`aiida.orm.implementation`.

There are currently two core backend implementations:

- ``psql_dos`` is implemented as the primary storage backend, see :ref:`internal_architecture:storage:psql_dos`.
- ``sqlite_zip`` is implemented as a storage backend for the AiiDA archive, see :ref:`internal_architecture:storage:sqlite_zip`.

Storage maintenance and profile locking
---------------------------------------

The :py:meth:`~aiida.orm.implementation.storage_backend.StorageBackend.maintain` method allows for maintenance operations on the storage (for example, to optimise memory usage), and is called by ``verdi storage maintain``.

During "full" maintenance, to guarantee the safety of its procedures, it may be necessary that the storage is not accessed by other processes.
The :py:class:`~aiida.manage.profile_access.ProfileAccessManager` allows for profile access requests, and locking of profiles during such procedures.
:py:meth:`~aiida.manage.profile_access.ProfileAccessManager.request_access` is called within :py:meth:`~aiida.manage.manager.Manager.get_profile_storage`.
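
The interplay between access requests and locking can be sketched with a simplified, file-based manager. This is an illustration under stated assumptions and not the actual ``ProfileAccessManager``: the class ``SimpleAccessManager``, its exceptions, its file naming and the ``release_access`` method are hypothetical, and the sketch ignores race conditions between concurrent processes.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


class LockedError(Exception):
    """Raised when access is requested while the profile is locked (hypothetical)."""


class InUseError(Exception):
    """Raised when a lock is requested while the profile is in use (hypothetical)."""


class SimpleAccessManager:
    """Hypothetical manager that records users and locks as files in a directory."""

    def __init__(self, directory: Path):
        self._dir = directory
        self._lock_file = directory / "profile.lock"

    def request_access(self, pid=None) -> None:
        """Register a process as a user of the profile, unless the profile is locked."""
        if self._lock_file.exists():
            raise LockedError("the profile is locked")
        pid = os.getpid() if pid is None else pid
        (self._dir / f"{pid}.pid").touch()

    def release_access(self, pid=None) -> None:
        """Remove the record of a process that no longer uses the profile."""
        pid = os.getpid() if pid is None else pid
        (self._dir / f"{pid}.pid").unlink()

    @contextmanager
    def lock(self):
        """Lock the profile for the duration of the block, if nobody is using it."""
        users = sorted(path.stem for path in self._dir.glob("*.pid"))
        if users:
            raise InUseError(f"the profile is in use by processes: {users}")
        self._lock_file.touch()
        try:
            yield
        finally:
            self._lock_file.unlink()


with tempfile.TemporaryDirectory() as tmp:
    access = SimpleAccessManager(Path(tmp))

    # While a process has access, locking is refused.
    access.request_access(pid=1234)
    try:
        with access.lock():
            pass
        lock_refused = False
    except InUseError:
        lock_refused = True
    access.release_access(pid=1234)

    # While the profile is locked, new access requests are refused.
    with access.lock():
        try:
            access.request_access(pid=5678)
            access_refused = False
        except LockedError:
            access_refused = True
```

In this sketch a maintenance routine would wrap its work in ``with access.lock():``, while every process that opens the storage would first call ``request_access``.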
9 changes: 9 additions & 0 deletions docs/source/internals/storage/index.rst
@@ -0,0 +1,9 @@
Storage
=======

.. toctree::

architecture
repository
psql_dos
sqlite_zip