Add instructions on how to avoid accidental airflow upgrade/downgrade (#30813)

Some of our users reported that after extending the image, airflow
suddenly started reporting problems with database versions and migrations
not applied or out-of-sync. This almost always turns out to be a
dependency conflict that leads to an automated downgrade or upgrade of
the installed airflow version. This is obviously undesired (you should
be upgrading airflow consciously rather than accidentally). However,
there is no way to prevent this implicitly: `pip` might decide to upgrade
or downgrade airflow as it sees fit. From `pip`'s point of view, airflow
is just one of the packages and has no special meaning.

The only way to "keep" the airflow version is to specify it together with
the other requirements, pinned to a specific version. This PR updates
our examples to do this and explains why airflow is added there.
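
As a rough illustration of how an unexpected version change could be detected early (this sketch is not part of the PR), a build or startup step might compare the installed version against the pin and fail loudly on a mismatch. The helper name `assert_pinned_version` and its `lookup` parameter are hypothetical; `importlib.metadata.version` is the standard-library call that reports the version of an installed distribution.

```python
from importlib import metadata
from typing import Callable


def assert_pinned_version(
    package: str,
    expected: str,
    lookup: Callable[[str], str] = metadata.version,
) -> None:
    """Raise RuntimeError if the installed version of `package` differs from `expected`."""
    # `lookup` is injectable so the check can be exercised without the package installed.
    installed = lookup(package)
    if installed != expected:
        raise RuntimeError(
            f"{package} is {installed}, expected {expected}: "
            "a dependency conflict may have changed the installed version"
        )
```

Calling `assert_pinned_version("apache-airflow", "<version of your base image>")` inside the extended image would then raise if `pip` had moved airflow. The comparison is an exact string match, so the expected value should be written the way the package metadata reports it.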

There is, of course, another risk: the user may forget to update the
pinned airflow version when they upgrade the base image. However, since
this is an explicit action performed during image extension, it is much
easier to notice and diagnose. We also warn users that they should update
the pinned version when airflow is upgraded.
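
The remaining risk, a stale pin after a base image upgrade, can also be caught mechanically. A minimal sketch, assuming the Dockerfile uses a plain version tag such as `apache/airflow:2.6.0` (tags with extra suffixes are not handled); `find_pin_mismatches` is a hypothetical helper, not something this PR adds:

```python
from __future__ import annotations

import re


def find_pin_mismatches(dockerfile_text: str) -> list[str]:
    """Return pinned apache-airflow versions that differ from the base image tag."""
    base = re.search(r"^FROM apache/airflow:(\S+)", dockerfile_text, flags=re.MULTILINE)
    if base is None:
        # Not an Airflow-based Dockerfile, nothing to compare against.
        return []
    tag = base.group(1)
    # Matches only the core package pin; provider pins such as
    # apache-airflow-providers-docker==... do not contain "apache-airflow==".
    pins = re.findall(r"apache-airflow==(\S+)", dockerfile_text)
    return [pin for pin in pins if pin != tag]
```

An empty result means every pin agrees with the `FROM` line; any returned version is a pin that was left behind.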
potiuk committed Apr 22, 2023
1 parent 29fb38c commit bf6ebe9
Showing 7 changed files with 60 additions and 5 deletions.
4 changes: 3 additions & 1 deletion docker_tests/test_examples_of_prod_image_building.py
@@ -55,9 +55,11 @@ def test_dockerfile_example(dockerfile):
     rel_dockerfile_path = Path(dockerfile).relative_to(DOCKER_EXAMPLES_DIR)
     image_name = str(rel_dockerfile_path).lower().replace("/", "-")
     content = Path(dockerfile).read_text()
+    latest_released_version: str = get_latest_airflow_version_released()
     new_content = re.sub(
-        r"FROM apache/airflow:.*", rf"FROM apache/airflow:{get_latest_airflow_version_released()}", content
+        r"FROM apache/airflow:.*", rf"FROM apache/airflow:{latest_released_version}", content
     )
+    new_content = re.sub(r"apache-airflow==\S*", rf"apache-airflow=={latest_released_version}", new_content)
     try:
         run_command(
             ["docker", "build", ".", "--tag", image_name, "-f", "-"],
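
The two substitutions in this test can be tried in isolation. A small sketch, assuming a hard-coded version string in place of `get_latest_airflow_version_released()`; `retarget_example` and the sample Dockerfile text are made up for illustration:

```python
import re


def retarget_example(content: str, version: str) -> str:
    """Point both the base image tag and the apache-airflow pin at `version`."""
    # First rewrite the FROM line, then every explicit apache-airflow pin.
    content = re.sub(r"FROM apache/airflow:.*", f"FROM apache/airflow:{version}", content)
    return re.sub(r"apache-airflow==\S*", f"apache-airflow=={version}", content)


example = (
    "FROM apache/airflow:2.6.0.dev0\n"
    "RUN pip install --no-cache-dir lxml apache-airflow==2.6.0.dev0\n"
)
# "2.5.3" is a placeholder standing in for the latest released version.
print(retarget_example(example, "2.5.3"))
```

Without the second substitution, the example would be built on the released image while still pinning the development version, which is exactly the kind of mismatch the pin is meant to prevent.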
52 changes: 52 additions & 0 deletions docs/docker-stack/build.rst
@@ -55,6 +55,17 @@ The following example adds ``lxml`` python package from PyPI to the image. When
 ``pip`` you need to use the ``airflow`` user rather than ``root``. Attempts to install ``pip`` packages
 as ``root`` will fail with an appropriate error message.

+.. note::
+    In the example below, we also add the ``apache-airflow`` package to be installed, in the very same
+    version as the image you are extending. This is not strictly necessary, but it is a good practice
+    to always pin ``apache-airflow`` to the version of the base image you are using. This way you can
+    be sure that the version you end up with is the same as the one you are extending. In some cases where
+    your new packages have conflicting dependencies, ``pip`` might decide to downgrade or upgrade
+    ``apache-airflow`` for you. When it is pinned explicitly and your requirements conflict with it,
+    you get an error message with conflict information rather than a surprise
+    downgrade or upgrade of airflow. If you upgrade the airflow base image, you should also update the
+    pinned version to match the new version of airflow.
+
 .. exampleinclude:: docker-examples/extending/add-pypi-packages/Dockerfile
     :language: Dockerfile
     :start-after: [START Dockerfile]
@@ -67,11 +78,26 @@ The following example adds few python packages from ``requirements.txt`` from Py
 Note that similarly when adding individual packages, you need to use the ``airflow`` user rather than
 ``root``. Attempts to install ``pip`` packages as ``root`` will fail with an appropriate error message.

+.. note::
+    In the example below, we also add the ``apache-airflow`` package to be installed, in the very same
+    version as the image you are extending. This is not strictly necessary, but it is a good practice
+    to always pin ``apache-airflow`` to the version of the base image you are using. This way you can
+    be sure that the version you end up with is the same as the one you are extending. In some cases where
+    your new packages have conflicting dependencies, ``pip`` might decide to downgrade or upgrade
+    ``apache-airflow`` for you. When it is pinned explicitly and your requirements conflict with it,
+    you get an error message with conflict information rather than a surprise
+    downgrade or upgrade of airflow. If you upgrade the airflow base image, you should also update the
+    pinned version to match the new version of airflow.
+
+
 .. exampleinclude:: docker-examples/extending/add-requirement-packages/Dockerfile
     :language: Dockerfile
     :start-after: [START Dockerfile]
     :end-before: [END Dockerfile]

+.. exampleinclude:: docker-examples/extending/add-requirement-packages/requirements.txt
+    :language: text
+

 Embedding DAGs
 ..............
@@ -385,11 +411,25 @@ The following example adds few python packages from ``requirements.txt`` from Py
 Note that similarly when adding individual packages, you need to use the ``airflow`` user rather than
 ``root``. Attempts to install ``pip`` packages as ``root`` will fail with an appropriate error message.

+.. note::
+    In the example below, we also add the ``apache-airflow`` package to be installed, in the very same
+    version as the image you are extending. This is not strictly necessary, but it is a good practice
+    to always pin ``apache-airflow`` to the version of the base image you are using. This way you can
+    be sure that the version you end up with is the same as the one you are extending. In some cases where
+    your new packages have conflicting dependencies, ``pip`` might decide to downgrade or upgrade
+    ``apache-airflow`` for you. When it is pinned explicitly and your requirements conflict with it,
+    you get an error message with conflict information rather than a surprise
+    downgrade or upgrade of airflow. If you upgrade the airflow base image, you should also update the
+    pinned version to match the new version of airflow.
+
 .. exampleinclude:: docker-examples/extending/add-requirement-packages/Dockerfile
     :language: Dockerfile
     :start-after: [START Dockerfile]
     :end-before: [END Dockerfile]

+.. exampleinclude:: docker-examples/extending/add-requirement-packages/requirements.txt
+    :language: text
+

 Example when writable directory is needed
 .........................................
@@ -558,6 +598,18 @@ You can use ``docker-context-files`` for the following purposes:
 * you can place ``requirements.txt`` and add any ``pip`` packages you want to install in the
   ``docker-context-file`` folder. Those requirements will be automatically installed during the build.

+.. note::
+    In the example below, we also add the ``apache-airflow`` package to be installed, in the very same
+    version as the image you are extending. This is not strictly necessary, but it is a good practice
+    to always pin ``apache-airflow`` to the version of the base image you are using. This way you can
+    be sure that the version you end up with is the same as the one you are extending. In some cases where
+    your new packages have conflicting dependencies, ``pip`` might decide to downgrade or upgrade
+    ``apache-airflow`` for you. When it is pinned explicitly and your requirements conflict with it,
+    you get an error message with conflict information rather than a surprise
+    downgrade or upgrade of airflow. If you upgrade the airflow base image, you should also update the
+    pinned version to match the new version of airflow.
+
+
 .. exampleinclude:: docker-examples/customizing/own-requirements.sh
     :language: bash
     :start-after: [START build]
@@ -28,6 +28,7 @@ mkdir -p docker-context-files

 cat <<EOF >./docker-context-files/requirements.txt
 beautifulsoup4==4.10.0
+apache-airflow==2.6.0.dev0
 EOF

 export DOCKER_BUILDKIT=1
@@ -25,5 +25,5 @@ RUN apt-get update \
   && rm -rf /var/lib/apt/lists/*
 USER airflow
 ENV JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
-RUN pip install --no-cache-dir apache-airflow-providers-apache-spark==2.1.3
+RUN pip install --no-cache-dir apache-airflow-providers-apache-spark==2.1.3 apache-airflow==2.6.0dev0
 # [END Dockerfile]
@@ -16,5 +16,5 @@
 # This is an example Dockerfile. It is not intended for PRODUCTION use
 # [START Dockerfile]
 FROM apache/airflow:2.6.0.dev0
-RUN pip install --no-cache-dir lxml
+RUN pip install --no-cache-dir lxml apache-airflow==2.6.0.dev0
 # [END Dockerfile]
@@ -17,5 +17,5 @@
 # [START Dockerfile]
 FROM apache/airflow:2.6.0.dev0
 COPY requirements.txt /
-RUN pip install --no-cache-dir -r /requirements.txt
+RUN pip install --no-cache-dir -r /requirements.txt apache-airflow==2.6.0.dev0
 # [END Dockerfile]
@@ -16,5 +16,5 @@
 # This is an example Dockerfile. It is not intended for PRODUCTION use
 # [START Dockerfile]
 FROM apache/airflow:2.6.0.dev0
-RUN pip install --no-cache-dir apache-airflow-providers-docker==2.5.1
+RUN pip install --no-cache-dir apache-airflow-providers-docker==2.5.1 apache-airflow==2.6.0.dev0
 # [END Dockerfile]
