
Add Python 3.11 support (#27264)
Add Python 3.11 support

Python 3.11 was released as scheduled on October 25, 2022, and finally,
after all dependencies were upgraded, we can support it.

---------

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
(cherry picked from commit c5597d1)
potiuk authored and Elad Kalif committed Jun 8, 2023
1 parent bb86b46 commit bd7cd41
Showing 54 changed files with 1,603 additions and 1,089 deletions.
8 changes: 4 additions & 4 deletions CI.rst
@@ -59,10 +59,10 @@ Container Registry used as cache
We are using GitHub Container Registry to store the results of the ``Build Images``
workflow which is used in the ``Tests`` workflow.

-Currently in main version of Airflow we run tests in 4 different versions of Python (3.7, 3.8, 3.9, 3.10)
-which means that we have to build 8 images (4 CI ones and 4 PROD ones). Yet we run around 12 jobs
-with each of the CI images. That is a lot of time to just build the environment to run. Therefore
-we are utilising ``pull_request_target`` feature of GitHub Actions.
+Currently in main version of Airflow we run tests in all versions of Python supported,
+which means that we have to build multiple images (one CI and one PROD for each Python version).
+Yet we run many jobs (>15) - for each of the CI images. That is a lot of time to just build the
+environment to run. Therefore we are utilising ``pull_request_target`` feature of GitHub Actions.

This feature allows to run a separate, independent workflow, when the main workflow is run -
this separate workflow is different than the main one, because by default it runs using ``main`` version
12 changes: 6 additions & 6 deletions LOCAL_VIRTUALENV.rst
@@ -51,7 +51,7 @@ Required Software Packages
Use system-level package managers like yum, apt-get for Linux, or
Homebrew for macOS to install required software packages:

-* Python (One of: 3.7, 3.8, 3.9, 3.10)
+* Python (One of: 3.7, 3.8, 3.9, 3.10, 3.11)
* MySQL 5.7+
* libxml

@@ -102,7 +102,7 @@ Creating a Local virtualenv

To use your IDE for Airflow development and testing, you need to configure a virtual
environment. Ideally you should set up virtualenv for all Python versions that Airflow
-supports (3.7, 3.8, 3.9, 3.10).
+supports (3.7, 3.8, 3.9, 3.10, 3.11).

To create and initialize the local virtualenv:

@@ -122,7 +122,7 @@ To create and initialize the local virtualenv:

.. code-block:: bash
-conda create -n airflow python=3.7 # or 3.8, 3.9, 3.10
+conda create -n airflow python=3.7 # or 3.8, 3.9, 3.10, 3.11
conda activate airflow
2. Install Python PIP requirements:
@@ -150,7 +150,7 @@ for different python versions). For development on current main source:

.. code-block:: bash
-# use the same version of python as you are working with, 3.7, 3.8, 3.9, or 3.10
+# use the same version of python as you are working with, 3.7, 3.8, 3.9, 3.10 or 3.11
pip install -e ".[devel,<OTHER EXTRAS>]" \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt"
@@ -163,7 +163,7 @@ You can also install Airflow in non-editable mode:

.. code-block:: bash
-# use the same version of python as you are working with, 3.7, 3.8, 3.9, or 3.10
+# use the same version of python as you are working with, 3.7, 3.8, 3.9, 3.10 or 3.11
pip install ".[devel,<OTHER EXTRAS>]" \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt"
@@ -173,7 +173,7 @@ sources, unless you set ``INSTALL_PROVIDERS_FROM_SOURCES`` environment variable

.. code-block:: bash
-# use the same version of python as you are working with, 3.7, 3.8, 3.9, or 3.10
+# use the same version of python as you are working with, 3.7, 3.8, 3.9, 3.10 or 3.11
INSTALL_PROVIDERS_FROM_SOURCES="true" pip install ".[devel,<OTHER EXTRAS>]" \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt"
135 changes: 17 additions & 118 deletions README.md
@@ -57,7 +57,6 @@ Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The
- [Support for Python and Kubernetes versions](#support-for-python-and-kubernetes-versions)
- [Base OS support for reference Airflow images](#base-os-support-for-reference-airflow-images)
- [Approach to dependencies of Airflow](#approach-to-dependencies-of-airflow)
- [Release process for Providers](#release-process-for-providers)
- [Contributing](#contributing)
- [Who uses Apache Airflow?](#who-uses-apache-airflow)
- [Who Maintains Apache Airflow?](#who-maintains-apache-airflow)
@@ -87,15 +86,15 @@ Airflow is not a streaming solution, but it is often used to process real-time d

Apache Airflow is tested with:

-|                     | Main version (dev)           | Stable version (2.6.1)             |
-|---------------------|------------------------------|------------------------------------|
-| Python              | 3.7, 3.8, 3.9, 3.10          | 3.7, 3.8, 3.9, 3.10                |
-| Platform            | AMD64/ARM64(\*)              | AMD64/ARM64(\*)                    |
-| Kubernetes          | 1.23, 1.24, 1.25, 1.26       | 1.21, 1.22, 1.23, 1.24, 1.25, 1.26 |
-| PostgreSQL          | 11, 12, 13, 14, 15           | 11, 12, 13, 14, 15                 |
-| MySQL               | 5.7, 8                       | 5.7, 8                             |
-| SQLite              | 3.15.0+                      | 3.15.0+                            |
-| MSSQL               | 2017(\*), 2019(\*)           | 2017(\*), 2019(\*)                 |
+|            | Main version (dev)           | Stable version (2.6.1)    |
+|------------|------------------------------|---------------------------|
+| Python     | 3.7, 3.8, 3.9, 3.10, 3.11    | 3.7, 3.8, 3.9, 3.10, 3.11 |
+| Platform   | AMD64/ARM64(\*)              | AMD64/ARM64(\*)           |
+| Kubernetes | 1.23, 1.24, 1.25, 1.26, 1.27 | 1.23, 1.24, 1.25, 1.26    |
+| PostgreSQL | 11, 12, 13, 14, 15           | 11, 12, 13, 14, 15        |
+| MySQL      | 5.7, 8                       | 5.7, 8                    |
+| SQLite     | 3.15.0+                      | 3.15.0+                   |
+| MSSQL      | 2017(\*), 2019(\*)           | 2017(\*), 2019(\*)        |

\* Experimental

@@ -397,117 +396,17 @@ The important dependencies are:

### Approach for dependencies in Airflow Providers and extras

The main part of the Airflow is the Airflow Core, but the power of Airflow also comes from a number of
providers that extend the core functionality and are released separately, even if we keep them (for now)
in the same monorepo for convenience. You can read more about the providers in the
[Providers documentation](https://airflow.apache.org/docs/apache-airflow-providers/index.html). We also
have set of policies implemented for maintaining and releasing community-managed providers as well
as the approach for community vs. 3rd party providers in the [providers](PROVIDERS.rst) document.

Those `extras` and `providers` dependencies are maintained in `provider.yaml` of each provider.

By default, we should not upper-bound dependencies for providers, however each provider's maintainer
-might decide to add additional limits (and justify them with comment)

-## Release process for Providers
-
-### Minimum supported version of Airflow
-
-Providers released by the community (with roughly monthly cadence) have
-limitation of a minimum supported version of Airflow. The minimum version of
-Airflow is the `MINOR` version (2.2, 2.3 etc.) indicating that the providers
-might use features that appeared in this release. The default support timespan
-for the minimum version of Airflow (there could be justified exceptions) is
-that we increase the minimum Airflow version, when 12 months passed since the
-first release for the MINOR version of Airflow.
-
-For example this means that by default we upgrade the minimum version of Airflow supported by providers
-to 2.4.0 in the first Provider's release after 30th of April 2023. The 30th of April 2022 is the date when the
-first `PATCHLEVEL` of 2.3 (2.3.0) has been released.
-
-When we increase the minimum Airflow version, this is not a reason to bump `MAJOR` version of the providers
-(unless there are other breaking changes in the provider). The reason for that is that people who use
-older version of Airflow will not be able to use that provider (so it is not a breaking change for them)
-and for people who are using supported version of Airflow this is not a breaking change on its own - they
-will be able to use the new version without breaking their workflows. When we upgraded min-version to
-2.2+, our approach was different but as of 2.3+ upgrade (November 2022) we only bump `MINOR` version of the
-provider when we increase minimum Airflow version.
-
-### Mixed governance model
-
-Providers are often connected with some stakeholders that are vitally interested in maintaining backwards
-compatibilities in their integrations (for example cloud providers, or specific service providers). But,
-we are also bound with the [Apache Software Foundation release policy](https://www.apache.org/legal/release-policy.html)
-which describes who releases, and how to release the ASF software. The provider's governance model is something we name
-"mixed governance" - where we follow the release policies, while the burden of maintaining and testing
-the cherry-picked versions is on those who commit to perform the cherry-picks and make PRs to older
-branches.
-
-The "mixed governance" (optional, per-provider) means that:
-
-* The Airflow Community and release manager decide when to release those providers.
-  This is fully managed by the community and the usual release-management process following the
-  [Apache Software Foundation release policy](https://www.apache.org/legal/release-policy.html)
-* The contributors (who might or might not be direct stakeholders in the provider) will carry the burden
-  of cherry-picking and testing the older versions of providers.
-* There is no "selection" and acceptance process to determine which version of the provider is released.
-  It is determined by the actions of contributors raising the PR with cherry-picked changes and it follows
-  the usual PR review process where maintainer approves (or not) and merges (or not) such PR. Simply
-  speaking - the completed action of cherry-picking and testing the older version of the provider make
-  it eligible to be released. Unless there is someone who volunteers and perform the cherry-picking and
-  testing, the provider is not released.
-* Branches to raise PR against are created when a contributor commits to perform the cherry-picking
-  (as a comment in PR to cherry-pick for example)
-
-Usually, community effort is focused on the most recent version of each provider. The community approach is
-that we should rather aggressively remove deprecations in "major" versions of the providers - whenever
-there is an opportunity to increase major version of a provider, we attempt to remove all deprecations.
-However, sometimes there is a contributor (who might or might not represent stakeholder),
-willing to make their effort on cherry-picking and testing the non-breaking changes to a selected,
-previous major branch of the provider. This results in releasing at most two versions of a
-provider at a time:
-
-* potentially breaking "latest" major version
-* selected past major version with non-breaking changes applied by the contributor
-
-Cherry-picking such changes follows the same process for releasing Airflow
-patch-level releases for a previous minor Airflow version. Usually such cherry-picking is done when
-there is an important bugfix and the latest version contains breaking changes that are not
-coupled with the bugfix. Releasing them together in the latest version of the provider effectively couples
-them, and therefore they're released separately. The cherry-picked changes have to be merged by the committer following the usual rules of the
-community.
-
-There is no obligation to cherry-pick and release older versions of the providers.
-The community continues to release such older versions of the providers for as long as there is an effort
-of the contributors to perform the cherry-picks and carry-on testing of the older provider version.
-
-The availability of stakeholder that can manage "service-oriented" maintenance and agrees to such a
-responsibility, will also drive our willingness to accept future, new providers to become community managed.
-
-### Suspending releases for providers
-
-In case a provider is found to require old dependencies that are not compatible with upcoming versions of
-the Apache Airflow or with newer dependencies required by other providers, the provider's release
-process can be suspended.
-
-This means:
-
-* The provider's status is set to "suspended"
-* No new releases of the provider will be made until the problem with dependencies is solved
-* Sources of the provider remain in the repository for now (in the future we might add process to remove them)
-* No new changes will be accepted for the provider (other than the ones that fix the dependencies)
-* The provider will be removed from the list of Apache Airflow extras in the next Airflow release
-  (including patch-level release if it is possible/easy to cherry-pick the suspension change)
-* Tests of the provider will not be run on our CI (in main branch)
-* Dependencies of the provider will not be installed in our main branch CI image nor included in constraints
-* We can still decide to apply security fixes to released providers - by adding fixes to the main branch
-  but cherry-picking, testing and releasing them in the patch-level branch of the provider similar to the
-  mixed governance model described above.
-
-The suspension may be triggered by any committer after the following criteria are met:
-
-* The maintainers of dependencies of the provider are notified about the issue and are given a reasonable
-  time to resolve it (at least 1 week)
-* Other options to resolve the issue have been exhausted and there are good reasons for upgrading
-  the old dependencies in question
-* Explanation why we need to suspend the provider is stated in a public discussion in the devlist. Followed
-  by LAZY CONSENSUS or VOTE (with the majority of the voters agreeing that we should suspend the provider)
-
-The suspension will be lifted when the dependencies of the provider are made compatible with the Apache
-Airflow and with other providers.
+might decide to add additional limits (and justify them with comment).

## Contributing

1 change: 1 addition & 0 deletions airflow/__init__.py
@@ -72,6 +72,7 @@
PY38 = sys.version_info >= (3, 8)
PY39 = sys.version_info >= (3, 9)
PY310 = sys.version_info >= (3, 10)
+PY311 = sys.version_info >= (3, 11)

# Things to lazy import in form {local_name: ('target_module', 'target_name')}
__lazy_imports: dict[str, tuple[str, str]] = {
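A flag like `PY311` lets downstream code branch on interpreter capabilities without repeating the `sys.version_info` comparison. A minimal sketch of the pattern; the `tomllib`/`tomli` fallback is an illustration of such a gate, not code from this commit:

```python
import sys

PY311 = sys.version_info >= (3, 11)

# tomllib joined the standard library in Python 3.11; on older
# interpreters the third-party "tomli" backport provides the same API.
if PY311:
    import tomllib
else:
    import tomli as tomllib  # requires `pip install tomli`

with open("pyproject.toml", "rb") as f:  # any TOML file works here
    project = tomllib.load(f)
print(project.get("project", {}).get("name"))
```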
2 changes: 1 addition & 1 deletion airflow/providers/amazon/aws/hooks/dms.py
@@ -207,7 +207,7 @@ def wait_for_task_status(self, replication_task_arn: str, status: DmsTaskWaiterS
            raise TypeError("Status must be an instance of DmsTaskWaiterStatus")

        dms_client = self.get_conn()
-        waiter = dms_client.get_waiter(f"replication_task_{status}")
+        waiter = dms_client.get_waiter(f"replication_task_{status.value}")
        waiter.wait(
            Filters=[
                {
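The `.value` suffix here (and the matching change in `file_task_handler.py` below) works around a Python 3.11 behavior change: `Enum.__format__` for enums with mixed-in types now includes the class name, so interpolating the member directly no longer yields the raw string. A small sketch of the difference; the enum body is an illustrative subset of the real `DmsTaskWaiterStatus`:

```python
from enum import Enum


class DmsTaskWaiterStatus(str, Enum):
    """Illustrative subset of the waiter statuses."""

    DELETED = "deleted"
    READY = "ready"


status = DmsTaskWaiterStatus.READY

# Python <= 3.10: prints "replication_task_ready".
# Python >= 3.11: prints "replication_task_DmsTaskWaiterStatus.READY".
print(f"replication_task_{status}")

# Explicit .value is stable across versions: "replication_task_ready".
print(f"replication_task_{status.value}")
```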
5 changes: 5 additions & 0 deletions airflow/providers/apache/hive/provider.yaml
@@ -62,6 +62,11 @@ dependencies:
- sasl>=0.3.1; python_version>="3.9"
- thrift>=0.9.2

+# Excluded because python-sasl is not yet compatible
+# with 3.11. See https://github.com/cloudera/python-sasl/issues/30
+excluded-python-versions:
+  - "3.11"
+
integrations:
- integration-name: Apache Hive
external-doc-url: https://hive.apache.org/
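`excluded-python-versions` is declarative metadata: the build tooling reads it to skip a provider on interpreters its dependencies do not yet support. A hedged sketch of how such a key could be consumed; the loader below is illustrative, not Breeze's actual implementation:

```python
import sys

import yaml  # assumes PyYAML is installed


def provider_supported(provider_yaml_path: str) -> bool:
    """Return False when the current interpreter is excluded by the provider."""
    with open(provider_yaml_path) as f:
        meta = yaml.safe_load(f)
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    return current not in meta.get("excluded-python-versions", [])


if not provider_supported("airflow/providers/apache/hive/provider.yaml"):
    print("Skipping apache.hive provider on this Python version")
```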
36 changes: 17 additions & 19 deletions airflow/providers/apache/hive/transfers/mysql_to_hive.py
@@ -19,6 +19,7 @@
from __future__ import annotations

from collections import OrderedDict
+from contextlib import closing
from tempfile import NamedTemporaryFile
from typing import TYPE_CHECKING, Sequence

@@ -131,28 +132,25 @@ def type_map(cls, mysql_type: int) -> str:
    def execute(self, context: Context):
        hive = HiveCliHook(hive_cli_conn_id=self.hive_cli_conn_id, auth=self.hive_auth)
        mysql = MySqlHook(mysql_conn_id=self.mysql_conn_id)
-
        self.log.info("Dumping MySQL query results to local file")
-        conn = mysql.get_conn()
-        cursor = conn.cursor()
-        cursor.execute(self.sql)
        with NamedTemporaryFile("wb") as f:
-            csv_writer = csv.writer(
-                f,
-                delimiter=self.delimiter,
-                quoting=self.quoting,
-                quotechar=self.quotechar,
-                escapechar=self.escapechar,
-                encoding="utf-8",
-            )
-            field_dict = OrderedDict()
-            if cursor.description is not None:
-                for field in cursor.description:
-                    field_dict[field[0]] = self.type_map(field[1])
-            csv_writer.writerows(cursor)
+            with closing(mysql.get_conn()) as conn:
+                with closing(conn.cursor()) as cursor:
+                    cursor.execute(self.sql)
+                    csv_writer = csv.writer(
+                        f,
+                        delimiter=self.delimiter,
+                        quoting=self.quoting,
+                        quotechar=self.quotechar if self.quoting != csv.QUOTE_NONE else None,
+                        escapechar=self.escapechar,
+                        encoding="utf-8",
+                    )
+                    field_dict = OrderedDict()
+                    if cursor.description is not None:
+                        for field in cursor.description:
+                            field_dict[field[0]] = self.type_map(field[1])
+                    csv_writer.writerows(cursor)
            f.flush()
-            cursor.close()
-            conn.close()  # type: ignore[misc]
        self.log.info("Loading file into Hive")
        hive.load_file(
            f.name,
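The rewrite above replaces the manual `cursor.close()` / `conn.close()` calls with `contextlib.closing`, which guarantees `close()` runs even when the dump raises mid-way. A standalone sketch of the pattern, using `sqlite3` as a stand-in for any DB-API connection:

```python
import sqlite3
from contextlib import closing

# closing() turns any object with a .close() method into a context
# manager, so connection and cursor are released even on error.
with closing(sqlite3.connect(":memory:")) as conn:
    with closing(conn.cursor()) as cursor:
        cursor.execute("SELECT 1 AS answer")
        print(cursor.fetchall())  # [(1,)]
# Both cursor and connection are closed here, raise or no raise.
```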
2 changes: 1 addition & 1 deletion airflow/utils/log/file_task_handler.py
@@ -201,7 +201,7 @@ def add_triggerer_suffix(full_path, job_id=None):
        triggerer instances.
        """
        full_path = Path(full_path).as_posix()
-        full_path += f".{LogType.TRIGGER}"
+        full_path += f".{LogType.TRIGGER.value}"
        if job_id:
            full_path += f".{job_id}.log"
        return full_path
2 changes: 1 addition & 1 deletion dev/README_RELEASE_AIRFLOW.md
@@ -666,7 +666,7 @@ the older branches, you should set the "skip" field to true.
## Verify production images
```shell script
-for PYTHON in 3.7 3.8 3.9 3.10
+for PYTHON in 3.7 3.8 3.9 3.10 3.11
do
docker pull apache/airflow:${VERSION}-python${PYTHON}
breeze prod-image verify --image-name apache/airflow:${VERSION}-python${PYTHON}
4 changes: 2 additions & 2 deletions dev/breeze/src/airflow_breeze/global_constants.py
@@ -36,7 +36,7 @@
APACHE_AIRFLOW_GITHUB_REPOSITORY = "apache/airflow"

# Checked before putting in build cache
-ALLOWED_PYTHON_MAJOR_MINOR_VERSIONS = ["3.7", "3.8", "3.9", "3.10"]
+ALLOWED_PYTHON_MAJOR_MINOR_VERSIONS = ["3.7", "3.8", "3.9", "3.10", "3.11"]
DEFAULT_PYTHON_MAJOR_MINOR_VERSION = ALLOWED_PYTHON_MAJOR_MINOR_VERSIONS[0]
ALLOWED_ARCHITECTURES = [Architecture.X86_64, Architecture.ARM]
ALLOWED_BACKENDS = ["sqlite", "mysql", "postgres", "mssql"]
@@ -183,7 +183,7 @@ def get_default_platform_machine() -> str:
PYTHONDONTWRITEBYTECODE = True

PRODUCTION_IMAGE = False
-ALL_PYTHON_MAJOR_MINOR_VERSIONS = ["3.7", "3.8", "3.9", "3.10"]
+ALL_PYTHON_MAJOR_MINOR_VERSIONS = ["3.7", "3.8", "3.9", "3.10", "3.11"]
CURRENT_PYTHON_MAJOR_MINOR_VERSIONS = ALL_PYTHON_MAJOR_MINOR_VERSIONS
CURRENT_POSTGRES_VERSIONS = ["11", "12", "13", "14", "15"]
DEFAULT_POSTGRES_VERSION = CURRENT_POSTGRES_VERSIONS[0]
4 changes: 2 additions & 2 deletions dev/breeze/tests/test_cache.py
@@ -36,8 +36,8 @@
    [
        ("backend", "mysql", (True, ["sqlite", "mysql", "postgres", "mssql"]), None),
        ("backend", "xxx", (False, ["sqlite", "mysql", "postgres", "mssql"]), None),
-        ("python_major_minor_version", "3.8", (True, ["3.7", "3.8", "3.9", "3.10"]), None),
-        ("python_major_minor_version", "3.5", (False, ["3.7", "3.8", "3.9", "3.10"]), None),
+        ("python_major_minor_version", "3.8", (True, ["3.7", "3.8", "3.9", "3.10", "3.11"]), None),
+        ("python_major_minor_version", "3.5", (False, ["3.7", "3.8", "3.9", "3.10", "3.11"]), None),
        ("missing", "value", None, AttributeError),
    ],
)
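These parametrized cases appear to exercise a helper that validates a cached parameter value and reports the allowed options on mismatch. A hedged sketch of the shape such a check could take; the function name and return convention are inferred from the test data, not copied from Breeze:

```python
ALLOWED_VALUES = {
    "backend": ["sqlite", "mysql", "postgres", "mssql"],
    "python_major_minor_version": ["3.7", "3.8", "3.9", "3.10", "3.11"],
}


def check_if_values_allowed(param_name: str, value: str) -> tuple[bool, list[str]]:
    """Return (is_allowed, allowed_values); unknown parameters raise AttributeError."""
    if param_name not in ALLOWED_VALUES:
        raise AttributeError(f"Unknown parameter: {param_name}")
    allowed = ALLOWED_VALUES[param_name]
    return value in allowed, allowed
```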
