AEP number | 009 |
---|---|
Title | Improved Docker images |
Authors | Jusong Yu (unkcpz) |
Champions | Jusong Yu (unkcpz) |
Type | S - Standard |
Created | 16-July-2023 |
Status | implemented |
AiiDA provides and maintains a Docker image to offer a clean and isolated environment that is easy to setup for users and developers. However, the the current Docker stack has a number of problems and limitations:
- The
aiidateam/aiida-prerequisites
image, on which it is based, is not maintained and updated regularly. (Partly because theaiidalab-docker-stack
replacedaiida-prerequisites
withjupyter/minimal-notebook
as its base image.) - The image is not able to run on
linux/arm64
architecture, therefore not able to run on Apple M1 chip which is now widely is use by the community. - The RabbitMQ and PostgreSQL services are not properly installed and configured inside the container, causing the user to have to start them manually. As a result, it is not straightforward to create a container and start using AiiDA, as evidenced by the length of the documentation.
One of the main reasons for providing a Docker image with aiida-core
was to provide a base image for the aiidalab-docker-stack
to build on.
However, thanks to the fantastic work by @csadorf, the Docker stack of the aiidalab
organization is now built on top of jupyter/minimal-notebook
instead of aiidateam/aiida-core
.
The aiidalab-docker-stack
comes in four variants:
base
– A minimal image that comes with AiiDA pre-installed and an AiiDA profile set up.base-with-services
– Likebase
, but the services required by AiiDA (PostgreSQL and RabbitMQ) are installed in the container and automatically launched on startup.lab
– Likebase
, but uses the AiiDAlab home app as the primary interface (the standard JupyterLab interface is also available).full-stack
– Our most comprehensive image, likelab
, but also comes with services pre-installed and launched.
Due to the switch to jupyter/minimal-notebook
as the base image, the aiida-prerequisites
image has become superfluous.
And since the aiidalab
organization no longer needs aiida-prerequisites
, they no longer have a reason to maintain it.
This poses a problem for the aiidateam/aiida-core
image, which currently uses aiida-prerequisites
as a base, and therefore effectively is also no longer maintained.
However, there is still a need for a maintained Docker image with a basic AiiDA setup, but the images of the aiidalabe-dockers-stack
are not suitable:
- The
lab
andfull-stack
images are mostly designed to provide AiiDAlab, which is not necessary for a simple AiiDA setup. - The
base
andbase-with-services
, in contrast, only provides AiiDA, however, the usejupyter/minimal-notebook
as a base image has multiple downsides:- The
jupyter/minimal-notebook
starts a jupyter notebook as the main service of the container, and maps its corresponding port - A known issue from using jupyter/minimal-notebook as the base image, the system user name default set to
jovyan
and can only be changed by running container asroot
. If the container starts as theroot
user, it is hard to set start-up scripts to configure AiiDA for the system user of the container. In addition,root
will be the default user to login, which increases the complexity to integrate vscode with the container. - It is not possible to easily keep the latest
aiida-core
version. When a new version ofaiida-core
is released, or new fixes are pushed tomain
, users will have to wait foraiidalab-docker-stack
uploads a new build that includes the new features and fixes.
- The
- It can be confusing to have to use an image from the
aiidalab
organization on Docker hub, instead ofaiidateam
. Although the docker hub namespacesaiidateam
andaiidalab
are all managed by us, we still try to not confuse and mix up the images with these two different namespaces. - The services required by AiiDA are not automatically started and stopped in a graceful manner on container startup and shutdown.
The services are configured and started inside the container using
before-notebook.d
andafter-notebook.d
scripts. However, the services are not started and stopped gracefully, where the container will exit with an error if the services are not started properly and the services are not stopped properly when the container is stopped.
-
Archive
aiidateam/aiida-prerequisites
-
Provide Docker images with the
aiida-core
repository, which contain a minimal but functional AiiDA setup that is deployed to theaiidateam
Docker hub namespace. -
Images support both
linux/amd64
andlinux/arm64
, and include basic system tools:vim
,git
,conda
/mamba
for quick development environment setup. -
Docker run command will start the container and provide a ready to use AiiDA environment for users to use.
Two new Docker images will be maintained within the aiida-core
repository and pushed to the aiidateam
docker/ghcr hub namespace:
aiidateam/aiida-core-base
: Image contains justaiida-core
installed, which can be used with independent RabbitMQ & PostgreSQL services, e.g., throughdocker-compose
.aiidateam/aiida-core-with-services
: Image built on top ofaiida-core-base
that includes the automatic and graceful starting and stopping of RabbitMQ & PostgreSQL services.
These images are used as base images for other images and are not intended for use for regular users. The main purposes of these images included but were not limited to:
- As a base image for the aiidalab docker stack that is able to be deployed to the cloud for tutorial or testing, refer to aiidalab-on-azure.
- For quick and clean development environment build and launch.
- As the container for codespace environment for vscode remote development.
- As the base image for further service integration, such as
aiidalab-docker-stack
andqe-input-generator
.
Using ubuntu
as the base image, instead of jupyter/minimal-notebook
, solves the problems of having the useless jupyter notebook service active and the fixing of the system user name.
It is worthwhile to mention that the phusion/baseimage
was used as the base image for aiidateam/aiida-core
(from aiidateam/aiida-prerequisite
) before.
The reason to use phusion/baseimage
was to have an init
system inside the container which can help to start and stop the services gracefully.
Moreover, the phusion/baseimage
was the image that solved the zombie process problem of the ubuntu
base image.
The zombie process problem of ubuntu
base image can also be solved by using s6-overlay
as the system service manager (See section s6-overly for details).
s6-overlay is an easy-to-install (just extract a tarball or two!) set of scripts and utilities allowing you to use existing Docker images while using s6 as a pid 1 for your container and process supervisor for your services. It can:
- be used on top of any Docker image
- provide users with a turnkey s6 installation that will give them a stable pid 1, a fast and orderly init sequence and shutdown sequence, and the power of process supervision and automatically rotated logs.
A runner for an arm64 CI test was registered on buildjet.io
.
To avoid the runner being used unnessary, the arm64 CI test will only be triggered when the trigger comes from the main aiidateam/aiida-core
repo.
The container images are uploaded to both docker hub and ghcr.io
.
ghcr.io
is used also as the test image registry, which means the PR will build the image and upload it to ghcr.io
for testing.
Name of the PR tag will be pr-<PR number>
, the image will be uploaded to ghcr.io
with the tag pr-<PR number>
.
The image will be uploaded to docker hub only when the PR is merged to the main branch or the tag v*
is pushed to the main branch.
The tags of the image of release image will be vX.X.X
and X.X.X
as aiida-core version, along with postgresql-X.X.X
and python-X.X.X
as the version of PostgreSQL and Python.
The main disadvantage of this new docker stack is that s6
requires many files (some even empty) in the .docker
folder, which may be confusing for future maintenance.
The following system requirements will be included in the images:
- vim
- git
- ssh
- conda/mamba
- pytest
- psql (in
aiida-core-with-services
)
The reasoning here is that system requirements that are commonly used should come pre-installed.
This because installing them might not be straight forward as they require root permissions.
Other dependencies, such as Python dependencies, are left up to the user to install as anyway they will have to install specific AiiDA plugins for their purposes.
These dependencies can easily be installed in user space using conda (which comes preinstalled) and pip
.
There is a notorius incompatibility of aiida-core
with RabbitMQ versions newer than 3.8.15
(See the problem of message timeout in queue).
I chose to install the latest version of RabbitMQ nevertheless but counteract it by setting the consumer_timeout
config parameter to 1000 hours to avoid the problem.
The reason is that install the RabbitMQ from binary package has to compatible with the system erlang version which will break the < 3.8.15
version of RabbitMQ.
It is also not possible to install RabbitMQ from conda-forge because the arm64 version of RabbitMQ is not available on conda-forge.
s6-overlay
supports two ways to start services:
oneshot
: for services that only need to be started oncelongrun
: for services that need to be started and kept running.
RabbitMQ is started as a longrun
service with rabbitmq-server
without the -d
option, which means the service will be started and kept running in the background and restarted when the service is stopped.
PostgreSQL is started as a oneshot
service with pg_ctl -D /var/lib/postgresql/data start
which means the service will be started once and stopped when the container is stopped.
This is mostly because I don't know how to start PostgreSQL as a longrun
service.
The AiiDA daemon is started as a oneshot
service with verdi daemon start
and is stopped with verdi daemon stop
.
The reason is that after the container is started with docker run -it <image> bash
, the user will see an interactive bash shell with an AiiDA daemon that is running.
s6-overlay
supports starting services in a specific order.
The dependencies of the services are depicted below (nested services will be started before the parent service):
- aiida-daemon-start (main)
- aiida-prepare
- postgresql
- postgresql-init
- postgresql-prepare (check if postgresql is ready)
- rabbitmq (independent)
- rabbitmq-init