Switch from --user to venv for PROD image and enable uv
This PR introduces a unified way to treat the .local (--user) folder
as both a venv and a `--user` package installation. It fixes a number
of problems that the `--user` installation has caused us in the past,
and does so in a fully backwards-compatible way.

This improves "production" use for end users, local iteration on the
PROD image during tests, and CI.

Improvements for end users:

* users no longer have to use `pip install --user` to install new
  packages, and `--user` is no longer enabled by default via the
  PIP_USER flag.

* users can use uv to install packages when they extend the image
  (but it's not obligatory - pip continues working as it did)

* users can use `uv` to build a custom production image, which saves
  40%-50% of image build time compared to `pip`.

* python -m venv --system-site-packages continues to use the
  packages from the .local installation (and does not use them
  if --system-site-packages is not given) - so we have full
  compatibility with previous images.
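
Whether a venv sees packages from outside itself is recorded by `venv` in the `include-system-site-packages` key of `pyvenv.cfg`. The following is a minimal standard-library sketch of that mechanism only; it does not show how the image wires up .local, and the helper names `create_env` and `system_site_packages_enabled` are illustrative, not part of this PR:

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path, inherit: bool) -> Path:
    """Create a venv, roughly `python -m venv --without-pip [--system-site-packages]`."""
    # with_pip=False keeps the sketch fast and offline.
    venv.EnvBuilder(system_site_packages=inherit, with_pip=False).create(env_dir)
    return env_dir


def system_site_packages_enabled(env_dir: Path) -> bool:
    """Return True if the venv at env_dir was created with --system-site-packages."""
    for line in (env_dir / "pyvenv.cfg").read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inheriting = create_env(Path(tmp) / "inheriting", inherit=True)
        isolated = create_env(Path(tmp) / "isolated", inherit=False)
        print(system_site_packages_enabled(inheriting))
        print(system_site_packages_enabled(isolated))
```

Checking `pyvenv.cfg` like this is one way to confirm which of the two behaviours described above a given environment will have.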

Improvements for development:

* when the image is built from sources (--use-docker-context-files
  is not specified), airflow is installed in --editable mode, which
  means that airflow and all providers are installed locally from
  airflow sources rather than from packages - so both airflow and
  providers are at the latest version inside the prod image.

* when local sources change and you want to run k8s tests locally,
  it is now WAY faster (several minutes) to iterate with your changes
  because you do not have to rebuild the base image - the only thing
  needed is to copy sources to "/opt/airflow" in the PROD image, which
  is where the editable installation is done from. You only need to
  rebuild the image if dependencies change.

* by default `uv` is used for local source builds for k8s tests, so
  even if you have to rebuild the image, it is way faster (60%-80%)
  while iterating on the image.
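
In CI the choice between `uv` and `pip` is driven by the `USE_UV` variable set in the workflow changes in this commit. How the build scripts consume that variable is not shown in this excerpt, so the following is only a hypothetical sketch of such a switch; the function name `installer_command` and the "false" default are assumptions:

```python
from __future__ import annotations

import os
from typing import Mapping


def installer_command(env: Mapping[str, str] = os.environ) -> list[str]:
    """Pick the installer command prefix from USE_UV (assumed default: "false")."""
    use_uv = env.get("USE_UV", "false").strip().lower() == "true"
    # `uv pip install` is uv's pip-compatible interface; `pip install` is the fallback.
    return ["uv", "pip", "install"] if use_uv else ["pip", "install"]


if __name__ == "__main__":
    print(installer_command({"USE_UV": "true"}))
    print(installer_command({"USE_UV": "false"}))
```

Keeping the switch in one place makes it easy to run the same build with either installer, which is what the separate pip job in CI verifies.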

CI/DEV tooling improvements:

* this PR switches to `uv` by default for most prod images we build
  in CI, but it adds a check that the image still builds with `pip`.

* we also switch to a more PEP-standard way of installing packages
  from the local filesystem (package-name @ file:///FILE)
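
The `package-name @ file:///FILE` form is the direct-reference syntax from PEP 508. A minimal sketch of building such a requirement string for a local wheel follows; the function name `direct_reference` and the wheel path in the usage example are hypothetical and not taken from this PR:

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def direct_reference(wheel_path: str) -> str:
    """Build a PEP 508 direct reference ("name @ file:///path") for a local wheel."""
    path = PurePosixPath(wheel_path)
    if not path.is_absolute():
        raise ValueError("a file:// URL needs an absolute path")
    # Wheel file names start with the distribution name, with "-" escaped as "_".
    distribution = path.name.split("-", 1)[0].replace("_", "-")
    return f"{distribution} @ file://{quote(str(path))}"


if __name__ == "__main__":
    # Hypothetical wheel location and version, for illustration only.
    print(direct_reference("/docker-context-files/apache_airflow-2.9.0-py3-none-any.whl"))
```

The resulting string can be passed to an installer in place of a bare file path, which keeps the package name explicit.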

Fixes: #37785
Fixes: #37815

Update contributing-docs/testing/k8s_tests.rst

Co-authored-by: Niko Oliveira <onikolas@amazon.com>

Update docs/docker-stack/build.rst

Co-authored-by: Niko Oliveira <onikolas@amazon.com>

Update scripts/docker/install_airflow.sh

Co-authored-by: Niko Oliveira <onikolas@amazon.com>

Update docs/docker-stack/changelog.rst

Co-authored-by: Niko Oliveira <onikolas@amazon.com>

potiuk and o-nikolas committed Mar 5, 2024
1 parent 2867951 commit dd6753f
Showing 48 changed files with 917 additions and 512 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/build-images.yml
@@ -180,6 +180,7 @@ jobs:
RUNS_ON: "${{ needs.build-info.outputs.runs-on }}"
BACKEND: sqlite
VERSION_SUFFIX_FOR_PYPI: "dev0"
USE_UV: "true"
steps:
- name: Cleanup repo
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
@@ -258,6 +259,7 @@ jobs:
BACKEND: sqlite
VERSION_SUFFIX_FOR_PYPI: "dev0"
INCLUDE_NOT_READY_PROVIDERS: "true"
USE_UV: "true"
steps:
- name: Cleanup repo
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
60 changes: 60 additions & 0 deletions .github/workflows/ci.yml
@@ -282,6 +282,7 @@ jobs:
# Force more parallelism for build even on public images
PARALLELISM: 6
VERSION_SUFFIX_FOR_PYPI: "dev0"
USE_UV: "true"
steps:
- name: Cleanup repo
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
@@ -1863,6 +1864,7 @@ jobs:
BACKEND: sqlite
VERSION_SUFFIX_FOR_PYPI: "dev0"
DEBUG_RESOURCES: ${{needs.build-info.outputs.debug-resources}}
USE_UV: "true"
steps:
- name: Cleanup repo
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
@@ -1898,6 +1900,58 @@ jobs:
PYTHON_VERSIONS: ${{needs.build-info.outputs.all-python-versions-list-as-string}}
DEBUG_RESOURCES: ${{ needs.build-info.outputs.debug-resources }}

build-prod-images-pip:
strategy:
matrix:
python-version: ${{ fromJson(needs.build-info.outputs.python-versions) }}
timeout-minutes: 80
name: ${{needs.build-info.outputs.build-job-description}} PROD image pip (main) ${{matrix.python-version}}
runs-on: ["ubuntu-22.04"]
needs: [build-info, build-ci-images]
env:
DEFAULT_BRANCH: ${{ needs.build-info.outputs.default-branch }}
DEFAULT_CONSTRAINTS_BRANCH: ${{ needs.build-info.outputs.default-constraints-branch }}
RUNS_ON: "${{needs.build-info.outputs.runs-on}}"
BACKEND: sqlite
VERSION_SUFFIX_FOR_PYPI: "dev0"
DEBUG_RESOURCES: ${{needs.build-info.outputs.debug-resources}}
USE_UV: "false"
steps:
- name: Cleanup repo
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
if: >
needs.build-info.outputs.in-workflow-build == 'true' &&
needs.build-info.outputs.default-branch == 'main'
- uses: actions/checkout@v4
with:
ref: ${{ needs.build-info.outputs.targetCommitSha }}
persist-credentials: false
if: >
needs.build-info.outputs.in-workflow-build == 'true' &&
needs.build-info.outputs.default-branch == 'main'
- name: "Install Breeze"
uses: ./.github/actions/breeze
with:
python-version: ${{ env.REPRODUCIBLE_PYTHON_VERSION }}
if: >
needs.build-info.outputs.in-workflow-build == 'true' &&
needs.build-info.outputs.default-branch == 'main'
- name: Build PROD Image pip ${{ matrix.python-version }}:${{env.IMAGE_TAG}}
uses: ./.github/actions/build-prod-images
if: >
needs.build-info.outputs.in-workflow-build == 'true' &&
needs.build-info.outputs.default-branch == 'main'
with:
build-provider-packages: ${{ needs.build-info.outputs.default-branch == 'main' }}
chicken-egg-providers: ${{ needs.build-info.outputs.chicken-egg-providers }}
python-version: ${{ matrix.python-version }}
env:
UPGRADE_TO_NEWER_DEPENDENCIES: ${{ needs.build-info.outputs.upgrade-to-newer-dependencies }}
DOCKER_CACHE: ${{ needs.build-info.outputs.cache-directive }}
PYTHON_VERSIONS: ${{needs.build-info.outputs.all-python-versions-list-as-string}}
DEBUG_RESOURCES: ${{ needs.build-info.outputs.debug-resources }}
IMAGE_TAG: "pip-${{ github.event.pull_request.head.sha || github.sha }}"

build-prod-images-bullseye:
strategy:
matrix:
@@ -1914,6 +1968,7 @@ jobs:
BACKEND: sqlite
VERSION_SUFFIX_FOR_PYPI: "dev0"
DEBUG_RESOURCES: ${{needs.build-info.outputs.debug-resources}}
USE_UV: "true"
steps:
- name: Cleanup repo
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
@@ -1970,6 +2025,7 @@ jobs:
BACKEND: sqlite
VERSION_SUFFIX_FOR_PYPI: "dev0"
DEBUG_RESOURCES: ${{needs.build-info.outputs.debug-resources}}
USE_UV: "true"
steps:
- name: Cleanup repo
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
@@ -2027,6 +2083,7 @@ jobs:
BACKEND: sqlite
VERSION_SUFFIX_FOR_PYPI: "dev0"
DEBUG_RESOURCES: ${{needs.build-info.outputs.debug-resources}}
USE_UV: "true"
steps:
- name: Cleanup repo
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
@@ -2078,6 +2135,7 @@ jobs:
BACKEND: sqlite
VERSION_SUFFIX_FOR_PYPI: "dev0"
DEBUG_RESOURCES: ${{needs.build-info.outputs.debug-resources}}
USE_UV: "true"
steps:
- name: Cleanup repo
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
@@ -2134,6 +2192,7 @@ jobs:
BACKEND: sqlite
VERSION_SUFFIX_FOR_PYPI: "dev0"
DEBUG_RESOURCES: ${{needs.build-info.outputs.debug-resources}}
USE_UV: "true"
steps:
- name: Cleanup repo
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
@@ -2530,6 +2589,7 @@ jobs:
RUNS_ON: "${{needs.build-info.outputs.runs-on}}"
# Force more parallelism for build even on small instances
PARALLELISM: 6
USE_UV: "true"
if: >
needs.build-info.outputs.in-workflow-build == 'true' &&
needs.build-info.outputs.canary-run != 'true'
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -452,7 +452,7 @@ repos:
name: Update extras in documentation
entry: ./scripts/ci/pre_commit/pre_commit_insert_extras.py
language: python
- files: ^setup\.py$|^contributing-docs/12_airflow_dependencies_and_extras.rst$|^INSTALL$|^airflow/providers/.*/provider\.yaml$
+ files: ^contributing-docs/12_airflow_dependencies_and_extras.rst$|^INSTALL$|^airflow/providers/.*/provider\.yaml$|^Dockerfile.*
pass_filenames: false
additional_dependencies: ['rich>=12.4.4', 'tomli']
- id: check-extras-order