diff --git a/CI.rst b/CI.rst index 3947f3cc24f4..2844a5c3c20c 100644 --- a/CI.rst +++ b/CI.rst @@ -67,8 +67,8 @@ The following components are part of the CI infrastructure CI run types ============ -The following CI Job runs are currently run for Apache Airflow, and each of the runs have different -purpose and context. +The following CI Job run types are currently run for Apache Airflow (by the ci.yaml and +quarantined.yaml workflows), and each run type has a different purpose and context. Pull request run ---------------- @@ -126,7 +126,17 @@ DockerHub when pushing ``v1-10-stable`` manually. All runs consist of the same jobs, but the jobs behave slightly differently or they are skipped in different run categories. Here is a summary of the run categories with regards of the jobs they are running. Those jobs often have matrix run strategy which runs several different variations of the jobs -(with different Backend type / Python version, type of the tests to run for example) +(with different Backend type / Python version, type of the tests to run for example). The following chapter +describes the workflows that execute for each run. + +Workflows +========= + +CI Build Workflow +----------------- + +This is the regular workflow that performs the standard checks - none of the jobs should fail. +The tests it runs do not include quarantined tests. 
+---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ | Job | Description | Pull Request Run | Direct Push/Merge Run | Scheduled Run | @@ -148,13 +158,13 @@ Those jobs often have matrix run strategy which runs several different variation +---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ | Tests Kubernetes | Run Kubernetes test | Yes (if tests-triggered) | Yes | Yes * | +---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ -| Quarantined tests | Those are tests that are flaky and we need to fix them | Yes (if tests-triggered) | Yes | Yes * | +| Test OpenAPI client gen | Tests if OpenAPIClient continues to generate | Yes | Yes | Yes * | +---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ | Helm tests | Runs tests for the Helm chart | Yes | Yes | Yes * | +---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ -| Constraints | 
Upgrade constraints to latest eagerly pushed ones (only if tests successful) | - | Yes | Yes * | +| Constraints | Upgrade constraints to latest eagerly pushed ones (only if tests successful) | - | Yes | Yes * | +---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ -| Constraints push | Pushes updated constraints (only if tests successful) | - | Yes | - | +| Constraints push | Pushes updated constraints (only if tests successful) | - | Yes | Yes * | +---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ | Push Prod images | Pushes production images to GitHub Private Image Registry to cache the build images for following runs | - | Yes | - | +---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ @@ -162,3 +172,48 @@ Those jobs often have matrix run strategy which runs several different variation +---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ | Tag Repo nightly | Tags the repository with nightly tagIt is a lightweight tag that moves nightly | - | - | Yes. 
Triggers DockerHub build for public registry | +---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ + +Quarantined build workflow +-------------------------- + +This workflow runs only quarantined tests. Failures of individual quarantined tests do not fail the build; the build fails only if +the whole pytest execution fails. Instead, this workflow updates one of the issues where we keep the status +of quarantined tests. Once a test succeeds in NUM_RUNS consecutive runs, it is marked as stable and +can be removed from quarantine. You can read more about quarantine in ``_ + +The issues are only updated if the test is run as direct push or scheduled run and only in the +``apache/airflow`` repository - so that the issues are not updated in forks. + +The issues that get updated differ by branch: + +* master: `Quarantine tests master `_ +* v1-10-stable: `Quarantine tests v1-10-stable `_ +* v1-10-test: `Quarantine tests v1-10-test `_ + ++---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ +| Job | Description | Pull Request Run | Direct Push/Merge Run | Scheduled Run | ++===========================+================================================================================================================+====================================+=================================+======================================================================+ +| Cancel previous workflow | Cancels the previously running workflow run if there is one running | Yes | Yes | Yes * | 
++---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ +| Trigger tests | Checks if tests should be triggered | Yes | Yes | Yes * | ++---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ +| Quarantined tests | Those are tests that are flaky and we need to fix them | Yes (if tests-triggered) | Yes (Updates quarantine issue) | Yes * (updates quarantine issue) | ++---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+ + +Cancel other workflow runs workflow +----------------------------------- + +This workflow runs only on schedule (every 5 minutes). Its only purpose is to cancel other running +``CI Build`` workflows if important jobs failed in those runs. This is to save runners for other runs +in case we know that the build will not succeed anyway without some basic fixes to static checks or +documentation - effectively implementing the missing "fail-fast" (on a job level) in GitHub Actions, +similar to fail-fast in matrix strategy. 
+ +The jobs that are considered as "fail-fast" are: + +* Static checks +* Docs +* Prepare Backport packages +* Helm tests +* Build Prod Image +* Test OpenAPI client gen diff --git a/scripts/ci/docker-compose/base.yml b/scripts/ci/docker-compose/base.yml index 5eac8c465ddc..5058d9c797fd 100644 --- a/scripts/ci/docker-compose/base.yml +++ b/scripts/ci/docker-compose/base.yml @@ -40,6 +40,10 @@ services: - RUN_INTEGRATION_TESTS - ONLY_RUN_LONG_RUNNING_TESTS - ONLY_RUN_QUARANTINED_TESTS + - GITHUB_TOKEN + - GITHUB_REPOSITORY + - ISSUE_ID + - NUM_RUNS - BREEZE - INSTALL_AIRFLOW_VERSION - DB_RESET diff --git a/scripts/ci/in_container/quarantine_issue_header.md b/scripts/ci/in_container/quarantine_issue_header.md new file mode 100644 index 000000000000..d672a4de7bf4 --- /dev/null +++ b/scripts/ci/in_container/quarantine_issue_header.md @@ -0,0 +1,32 @@ + + +# Quarantined issues + +Please do not update status or list of the issues manually. It is automatically updated during +Quarantine workflow, when the workflow executes in the context of Apache Airflow repository. +This happens on schedule (4 times a day) or when a change has been merged or pushed +to the relevant branch. + +You can update "Comment" column in the issue list - the update process will read and preserve this column. 
+ +# Status update +Last status update (UTC): {{ DATE_UTC_NOW }} + +# List of Quarantined issues diff --git a/scripts/ci/in_container/run_ci_tests.sh b/scripts/ci/in_container/run_ci_tests.sh index 0e50c5eb5b97..704577ead582 100755 --- a/scripts/ci/in_container/run_ci_tests.sh +++ b/scripts/ci/in_container/run_ci_tests.sh @@ -33,6 +33,31 @@ if [[ "${RES}" == "0" && ${CI:="false"} == "true" ]]; then bash <(curl -s https://codecov.io/bash) fi +MAIN_GITHUB_REPOSITORY="apache/airflow" + +if [[ ${ONLY_RUN_QUARANTINED_TESTS:=} = "true" ]]; then + if [[ ${GITHUB_REPOSITORY} == "${MAIN_GITHUB_REPOSITORY}" ]]; then + if [[ ${RES} == "1" || ${RES} == "0" ]]; then + echo + echo "Pytest exited with ${RES} result. Updating Quarantine Issue!" + echo + "${IN_CONTAINER_DIR}/update_quarantined_test_status.py" "${RESULT_LOG_FILE}" + else + echo + echo "Pytest exited with ${RES} result. NOT Updating Quarantine Issue!" + echo + fi + else + echo + echo "Github repository '${GITHUB_REPOSITORY}'. NOT Updating Quarantine Issue!" + echo + fi +else + echo + echo "Regular tests. NOT Updating Quarantine Issue!" + echo +fi + if [[ ${CI:=} == "true" ]]; then dump_airflow_logs fi diff --git a/scripts/ci/in_container/update_quarantined_test_status.py b/scripts/ci/in_container/update_quarantined_test_status.py new file mode 100755 index 000000000000..179ff91b37b4 --- /dev/null +++ b/scripts/ci/in_container/update_quarantined_test_status.py @@ -0,0 +1,243 @@ +#!/usr/bin/env python +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. 
You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import os +import re +import sys +from datetime import datetime +from os.path import dirname, join, realpath +from typing import Dict, List, NamedTuple, Optional +from urllib.parse import urlsplit + +import jinja2 +from bs4 import BeautifulSoup +from github3 import login +from jinja2 import StrictUndefined +from tabulate import tabulate + + +class TestResult(NamedTuple): + test_id: str + file: str + name: str + classname: str + line: str + result: bool + + +class TestHistory(NamedTuple): + test_id: str + name: str + url: str + states: List[bool] + comment: str + + +test_results = [] + +user = "" +repo = "" +issue_id = 0 +num_runs = 10 + +url_pattern = re.compile(r'\[([^]]*)]\(([^)]*)\)') + +status_map: Dict[str, bool] = { + ":heavy_check_mark:": True, + ":x:": False, +} + +reverse_status_map: Dict[bool, str] = {status_map[key]: key for key in status_map.keys()} + + +def get_url(result: TestResult) -> str: + return f"[{result.name}](https://github.com/{user}/{repo}/blob/" \ + f"master/{result.file}?test_id={result.test_id}#L{result.line})" + + +def parse_state_history(history_string: str) -> List[bool]: + history_array = history_string.split(' ') + status_array: List[bool] = [] + for value in history_array: + if value: + status_array.append(status_map[value]) + return status_array + + +def parse_test_history(line: str) -> Optional[TestHistory]: + values = line.split("|") + match_url = url_pattern.match(values[1].strip()) + if match_url: + name = match_url.group(1) + url = match_url.group(0) + http_url = match_url.group(2) + parsed_url = urlsplit(http_url) 
+ the_id = parsed_url[3].split("=")[1] + comment = values[5] if len(values) >= 6 else "" + # noinspection PyBroadException + try: + states = parse_state_history(values[3]) + except Exception: + states = [] + return TestHistory( + test_id=the_id, + name=name, + states=states, + url=url, + comment=comment, + ) + return None + + +def parse_body(body: str) -> Dict[str, TestHistory]: + parse = False + test_history_map: Dict[str, TestHistory] = {} + for line in body.splitlines(keepends=False): + if line.startswith("|-"): + parse = True + continue + if parse: + if not line.startswith("|"): + break + # noinspection PyBroadException + try: + status = parse_test_history(line) + except Exception: + continue + if status: + test_history_map[status.test_id] = status + return test_history_map + + +def update_test_history(history: TestHistory, last_status: bool): + print(f"Adding status to test history: {history}, {last_status}") + return TestHistory( + test_id=history.test_id, + name=history.name, + url=history.url, + states=([last_status] + history.states)[0:num_runs], + comment=history.comment, + ) + + +def create_test_history(result: TestResult) -> TestHistory: + print(f"Creating test history {result}") + return TestHistory( + test_id=result.test_id, + name=result.name, + url=get_url(result), + states=[result.result], + comment="" + ) + + +def get_history_status(history: TestHistory): + if len(history.states) < num_runs: + if all(history.states): + return "So far, so good" + return "Flaky" + if all(history.states): + return "Stable" + if all(history.states[0:num_runs - 1]): + return "Just one more" + if all(history.states[0:int(num_runs / 2)]): + return "Almost there" + return "Flaky" + + +def get_table(history_map: Dict[str, TestHistory]) -> str: + headers = ["Test", "Last run", f"Last {num_runs} runs", "Status", "Comment"] + the_table: List[List[str]] = [] + for ordered_key in sorted(history_map.keys()): + history = history_map[ordered_key] + the_table.append([ + 
+ history.url, + "Succeeded" if history.states[0] else "Failed", + " ".join([reverse_status_map[state] for state in history.states]), + get_history_status(history), + history.comment + ]) + return tabulate(the_table, headers, tablefmt="github") + + +if __name__ == '__main__': + if len(sys.argv) < 2: + print("Provide XML JUNIT FILE as first argument") + sys.exit(1) + + with open(sys.argv[1], "r") as f: + text = f.read() + y = BeautifulSoup(text, "html.parser") + res = y.testsuites.testsuite.findAll("testcase") + for test in res: + print("Parsing: " + test['classname'] + "::" + test['name']) + if len(test.contents) > 0 and test.contents[0].name == 'skipped': + print(f"skipping {test['name']}") + continue + test_results.append(TestResult( + test_id=test['classname'] + "::" + test['name'], + file=test['file'], + line=test['line'], + name=test['name'], + classname=test['classname'], + result=len(test.contents) == 0 + )) + + token = os.environ.get("GITHUB_TOKEN") + print(f"Token defined: {bool(token)}") + github_repository = os.environ.get('GITHUB_REPOSITORY') + if not github_repository: + raise Exception("Github Repository must be defined!") + user, repo = github_repository.split("/") + print(f"User: {user}, Repo: {repo}") + issue_id = int(os.environ.get('ISSUE_ID', 0)) + num_runs = int(os.environ.get('NUM_RUNS', 10)) + + if issue_id == 0: + raise Exception("You need to define ISSUE_ID as environment variable") + + gh = login(token=token) + + quarantined_issue = gh.issue(user, repo, issue_id) + print("-----") + print(quarantined_issue.body) + print("-----") + parsed_test_map = parse_body(quarantined_issue.body) + new_test_map: Dict[str, TestHistory] = {} + + for test_result in test_results: + previous_results = parsed_test_map.get(test_result.test_id) + if previous_results: + updated_results = update_test_history( + previous_results, test_result.result) + new_test_map[previous_results.test_id] = updated_results + else: + new_history = create_test_history(test_result) + 
new_test_map[new_history.test_id] = new_history + table = get_table(new_test_map) + print() + print("Result:") + print() + print(table) + print() + with open(join(dirname(realpath(__file__)), "quarantine_issue_header.md"), "r") as f: + header = jinja2.Template(f.read(), autoescape=True, undefined=StrictUndefined).\ + render(DATE_UTC_NOW=datetime.utcnow()) + quarantined_issue.edit(title=None, + body=header + "\n\n" + str(table), + state='open' if len(test_results) > 0 else 'closed') diff --git a/scripts/ci/libraries/_initialization.sh b/scripts/ci/libraries/_initialization.sh index 9d791fbed517..965ad052cfac 100644 --- a/scripts/ci/libraries/_initialization.sh +++ b/scripts/ci/libraries/_initialization.sh @@ -437,6 +437,14 @@ function get_environment_for_builds_on_ci() { export CI_SOURCE_REPO="${CI_TARGET_REPO}" export CI_SOURCE_BRANCH="${CI_TARGET_BRANCH}" fi + elif [[ "${LOCAL_CI_TESTING:=}" == "true" ]]; then + export CI_TARGET_REPO="apache/airflow" + export CI_TARGET_BRANCH="${DEFAULT_BRANCH:="master"}" + export CI_BUILD_ID="0" + export CI_JOB_ID="0" + export CI_EVENT_TYPE="pull_request" + export CI_SOURCE_REPO="apache/airflow" + export CI_SOURCE_BRANCH="${DEFAULT_BRANCH:="master"}" else export CI_SOURCE_REPO="${CI_TARGET_REPO}" export CI_SOURCE_BRANCH="${CI_TARGET_BRANCH}" diff --git a/setup.py b/setup.py index 181b24b6f4d3..2e315cc45fb9 100644 --- a/setup.py +++ b/setup.py @@ -415,6 +415,7 @@ def write_version(filename=os.path.join(*[my_dir, "airflow", "git_version"])): 'flake8-colors', 'flaky', 'freezegun', + 'gitpython', 'ipdb', 'jira', 'mock;python_version<"3.3"', @@ -428,7 +429,7 @@ def write_version(filename=os.path.join(*[my_dir, "airflow", "git_version"])): 'pytest-cov', 'pytest-instafail', 'pytest-rerunfailures', - 'pytest-timeout', + 'pytest-timeouts', 'pytest-xdist', 'pywinrm', 'qds-sdk>=1.9.6',
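Reviewer note: the status classification in `update_quarantined_test_status.py` is easier to follow in isolation. The sketch below is not part of the patch. It restates `get_history_status` and the history-trimming step of `update_test_history` as plain functions, under the assumption that `states[0]` is the most recent run and `True` means the test passed. The names `classify_history` and `push_result` are hypothetical and exist only for this illustration.

```python
from typing import List


def classify_history(states: List[bool], num_runs: int = 10) -> str:
    """Standalone restatement of get_history_status from the patch.

    Assumption: states[0] is the newest run and True means "passed".
    """
    if len(states) < num_runs:
        # Not enough runs recorded yet to call the test stable.
        return "So far, so good" if all(states) else "Flaky"
    if all(states):
        return "Stable"
    if all(states[:num_runs - 1]):
        # Only the oldest recorded run failed.
        return "Just one more"
    if all(states[:num_runs // 2]):
        # The most recent half of the window passed.
        return "Almost there"
    return "Flaky"


def push_result(states: List[bool], last_status: bool, num_runs: int = 10) -> List[bool]:
    """Prepend the newest result and keep at most num_runs entries,
    as update_test_history does in the patch."""
    return ([last_status] + states)[:num_runs]


if __name__ == "__main__":
    history: List[bool] = [False]
    for _ in range(10):
        history = push_result(history, True)
        print(len(history), classify_history(history))
```

With the default window of 10, a test that failed once has to pass ten times in a row before the single failure drops out of the window and the status becomes "Stable".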