
[BEAM-10623] Add workflow to run Beam python tests on Linux/Windows/Mac platforms #12452

Merged · 2 commits · Aug 18, 2020

Conversation

@TobKed (Contributor) commented Aug 3, 2020:

Workflow run with python tests

Before merging

Before merging, the following related secrets must be set up:

  • GCP_TESTING_BUCKET - beam-github-actions-tests
  • GCP_PROJECT_ID - apache-beam-testing
  • GCP_REGION - us-central1

Before merging, the Service Account should have elevated permissions:

  • Storage Admin (roles/storage.admin)
  • Dataflow Admin (roles/dataflow.admin)

This was requested in https://issues.apache.org/jira/browse/INFRA-20675


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make the review process smoother.

Post-Commit Tests Status (on master branch): [table of build status badges for the Go, Java, Python, and XLang SDKs across the Dataflow, Flink, Samza, Spark, and Twister2 runners]

Pre-Commit Tests Status (on master branch): [table of build status badges for Java, Python, Go, and Website, portable and non-portable]

See .test-infra/jenkins/README for the trigger phrase, status, and link of all Jenkins jobs.

@TobKed (Contributor Author) commented Aug 3, 2020:

example run executed on my fork (all jobs):
https://github.com/TobKed/beam/actions/runs/193255595

@aaltay (Member) left a comment:

Looks very reasonable. Asked a few questions. I would also like @tvalentyn to review.


build_python_sdk_source:
name: 'Build Python SDK Source'
if: github.repository_owner == 'apache' && (github.event_name == 'push' || github.event_name == 'schedule')
Member:

Given the size of this run, maybe we can limit this to 'schedule' only. Do you know how long it takes to run this?

Contributor:

Current schedule is once a day as per line 22. How about we run it every 4 hours?

Contributor:

Also, all actions workflows run in parallel, right? So it takes roughly the same time as running a regular precommit? In that case we could consider running it on a PR.

Member:

Does GitHub impose any limits on how many workflows can run in parallel?

Contributor Author:

A full workflow run takes approximately 30 minutes; a check on a PR takes ~15 minutes.
IMHO it is not short, but also not too long.
If you think it is still too long, maybe we could keep only the schedule and push-to-master/release-branch events? WDYT?

Contributor Author:

Does github impose any limits on how many workflow could run in parallel?

Limits for concurrent jobs are between 20 and 180, depending on the plan (docs).
I heard that Apache has an Enterprise plan for GitHub Actions, but I am not sure about it. Do you have more information about it, @gmcdonald?

run: pip install wheel
- name: Build source
working-directory: ./sdks/python
run: python setup.py sdist --formats=zip
Member:

Does sdist require wheel? If the sdist contains binaries, are they compatible with other Python versions?

Contributor Author:

I checked it and sdist does not require wheel. The source distribution does not contain binaries and is not Python- or platform-specific. I simplified both this job and the passing of the source dist to the Dataflow job.

@@ -66,6 +66,7 @@ class InteractiveRunnerTest(unittest.TestCase):
def setUp(self):
ie.new_env()

@unittest.skipIf(sys.platform == "win32", "[BEAM-10627]")
Member:

@kevingg - assigned this jira to you.

Contributor:

Acked. It's a duplicate of BEAM-10514. The reason is that the file system in the test has a temp directory with a very long prefix, causing the file path to exceed the Windows default 260-character limit.

A normal windows temp directory prefix: c:\windows\temp\,
In the test: D:\a\beam\beam\sdks\python\target\.tox\py35-win\tmp\.

The length difference breaks the limit, since Interactive Beam usually uses about 240 characters of path length.
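The arithmetic above can be sketched quickly (a hypothetical illustration using the two prefixes quoted in this thread, not measured constants):

```python
# Illustration of the Windows MAX_PATH issue described above.
MAX_PATH = 260  # classic Windows path-length limit
default_prefix = r'c:\windows\temp'
tox_prefix = r'D:\a\beam\beam\sdks\python\target\.tox\py35-win\tmp'

# Interactive Beam already uses roughly 240 characters of path; the longer
# tox prefix pushes many generated paths past the limit.
extra = len(tox_prefix) - len(default_prefix)
assert 240 + extra > MAX_PATH
```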

Contributor Author:

Thanks @kevingg . Should I put BEAM-10514 here or keep BEAM-10627?

@@ -284,6 +284,7 @@ def create_options(self):
return options


@pytest.mark.skipif(sys.platform == "win32", reason="[BEAM-10625]")
Member:

@tvalentyn - assigned this jira to you.

Contributor:

Took a look and replied on the bug. It may be an issue in GRPC or the networking configuration of the Windows host that Github uses. I do not see this error in the test suite that we run internally (on Py2.7).

@TobKed do you have a recommendation for how Beam devs could recreate the same environment as used by GitHub Actions for the purposes of debugging a test, or run a single test via Actions? It would be good to document this in our Actions users guide.

The goal is to minimize the feedback loop as much as possible.

Feel free to reply on https://issues.apache.org/jira/browse/BEAM-10625 to keep this conversation in one place.

Contributor Author:

Sure, I added a comment about the specific environment in the Jira. I will add/extend the documentation now that #12405 is merged.

@@ -284,6 +284,7 @@ def create_options(self):
return options


Member:

Why are we using pytest here instead of unittest? @tvalentyn

@@ -174,6 +175,7 @@ def test_infer_typehints_schema(self, _, data, schema):
@parameterized.expand([(d["name"], d["data"], d["pyarrow_schema"])
for d in TEST_DATA])
@unittest.skipIf(pa is None, "PyArrow is not installed")
@unittest.skipIf(sys.platform == "win32", "[BEAM-10624]")
Member:

@TheNeuralBit - assigned this jira to you.

Member:

Ack, I'll take a look. It may just take a pyarrow upgrade, which will be unblocked once we drop Python 2.

Contributor:

Pyarrow upgrade can be decoupled from dropping py2, unless we need code changes in Beam:

('pyarrow>=0.15.1,<0.18.0; python_version >= "3.0" or '

Member:

Looks like it's not actually related to pyarrow version. I think we should fix it by making the dtype explicit for the numpy arrays created here:

I have a personal Windows machine I could clone Beam on and verify, but it might be simpler to just test the fix as part of this PR.
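The dtype fix being suggested can be sketched as follows (a hedged illustration, not the exact code from the PR; the array values are hypothetical):

```python
import numpy as np

# The default integer dtype is platform-dependent (int32 on Windows,
# int64 on most Linux/macOS builds), which can change the Arrow schema
# inferred from the array. Passing dtype explicitly makes it stable.
implicit = np.array([1, 2, 3])                    # platform-dependent dtype
explicit = np.array([1, 2, 3], dtype=np.int64)    # identical on every platform
```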

@TobKed (Contributor Author) commented Aug 7, 2020:

@TheNeuralBit could you take a look at 4004309, please?
Is it the solution you mentioned?
I tested it on my local fork (run results here) and it seems to work.

Member:

Yep that's what I was suggesting!

Contributor Author:

Thanks @TheNeuralBit. I cherry-picked the change into the PR.

@aaltay aaltay requested a review from tvalentyn August 3, 2020 23:37
fail-fast: false
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
python: [3.5, 3.6, 3.7, 3.8]
Contributor:

Is it possible to extract the Python versions into a constant at the top of the file, so that we have to modify them in only one place when we add/remove a version?

Contributor Author:

Unfortunately it is not possible right now, at least not in a convenient way, especially since this workflow has two slightly different types of matrices.

@tvalentyn (Contributor) left a comment:

Thanks, @TobKed .

General comments/questions:

  • Is there a way for an author or a reviewer to rerun the action to retry the test, or is the only way to re-push the PR?
  • When we run unit tests, we create xml files with test results (see generated xml file: D:\a\beam\beam\sdks\python\pytest_py36-win.xml entries in console logs). It would be great if we could somehow visualize that to see failing tests/messages easily without having to dig through console logs. Are there any options to do that? This could arrive in future PRs.

# https://github.com/protocolbuffers/protobuf/issues/7765
- os: windows-latest
python: 3.8
steps:
Contributor:

Do actions allow extracting a common subroutine and calling it from other actions? For example, the following steps seem to be repeated several times and could perhaps be defined in one place:

    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Install python
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python }}
      - name: Get build dependencies
        working-directory: ./sdks/python
        run: pip install -r build-requirements.txt
      - name: Install requirements
        working-directory: ./sdks/python
        run: pip install setuptools --upgrade && pip install -e .

Contributor Author:

Right now it is not possible; however, GitHub Actions is evolving so fast that it should be available soon.
Related links about the upcoming feature:
actions/runner#438
actions/runner#554

@@ -296,8 +296,10 @@ def test_sink_transform_int96(self):
path, self.SCHEMA96, num_shards=1, shard_name_template='')

def test_sink_transform(self):
with tempfile.NamedTemporaryFile() as dst:
path = dst.name
fd, path = tempfile.mkstemp()
Contributor:

What is the context of these changes?
(Note: it would be better to have them in a separate commit and tag a JIRA in that commit.)

Will #12150 (comment) work?

Contributor Author:

There was a problem on Windows, since with tempfile.NamedTemporaryFile() as dst provides a tempfile which is already opened. On Windows, any opened file seems to be read-only, which caused a PermissionError. I know this workaround is ugly, but it is the best way I found to do it.
Will a separate commit with a JIRA not be lost after merging?

Contributor:

You can use tempfile.TemporaryDirectory() to generate a fresh temp directory and use any name as a file name within this directory with the test.

Contributor Author:

I updated it with tempfile.TemporaryDirectory(); it looks much better and more intuitive now. Thanks a lot!
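The pattern adopted here can be sketched as a minimal example (the file name is hypothetical). NamedTemporaryFile keeps the file open, and on Windows an already-open file cannot be reopened for writing, raising PermissionError; writing into a fresh temporary directory avoids that entirely:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'output.parquet')  # any name works here
    with open(path, 'w') as sink:
        sink.write('data')
    assert os.path.exists(path)
# The directory and everything inside it is removed on exit.
```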


@@ -87,7 +87,8 @@ def _verify_fn_log_handler(self, num_log_entries):
self.assertEqual(
'%s: %s' % (msg, num_received_log_entries), log_entry.message)
self.assertTrue(
re.match(r'.*/log_handler_test.py:\d+', log_entry.log_location),
re.match(
r'.*(/|\\)log_handler_test.py:\d+', log_entry.log_location),
Contributor:

use os.sep

Contributor Author:

Added fix. Thanks.

@TobKed (Contributor Author) commented Aug 5, 2020:

I used rf'.*{os.sep}log_handler_test.py:\d+' but it caused:

   ....
    raise source.error("bad escape %s" % escape, len(escape))
    sre_constants.error: bad escape \l at position 2

https://github.com/apache/beam/pull/12452/checks?check_run_id=949595653
I reverted the change.

Should I use it somehow differently?

Contributor:

You could use re.escape(). Not a big deal since this is a test, feel free to leave as is or simplify to r'.*log_handler_test.py:\d+'.

Contributor Author:

I had problems with re.escape() as well (it escaped \d too). For the cleanest solution I just simplified it as you suggested.
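For reference, the escaping variant suggested above would escape only the separator rather than the whole pattern, so the r'\d' part survives intact even when os.sep is a backslash (a sketch, not the code that landed):

```python
import os
import re

# Escape only os.sep, not the whole pattern string.
pattern = r'.*' + re.escape(os.sep) + r'log_handler_test\.py:\d+'
location = os.sep.join(['some', 'dir', 'log_handler_test.py:42'])
assert re.match(pattern, location)
```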



fd, path = tempfile.mkstemp()
try:
os.close(fd)
Contributor:

Same comment as above - this approach is not intuitive.

@TobKed (Contributor Author) commented Aug 5, 2020:

WDYT about adding a comment with the JIRA in such places? (as you mentioned creating a JIRA in #12452 (comment))

@tvalentyn (Contributor) commented Aug 8, 2020:

The reference to JIRA in #12452 (comment) was that you could make the test changes in a separate commit, such as:

[BEAM-123456] Fix tests that don't pass on Windows

and have another commit:

[BEAM-10623] Add workflow to run Beam python tests on Linux/Windows/Mac platforms.

In this case the reason behind non-obvious code would be easier to infer from the commit history.





env:
GCP_WHEELS_STAGING_PATH: "gs://${{ secrets.GCP_BUCKET }}/${GITHUB_REF##*/}/${GITHUB_SHA}-${GITHUB_RUN_ID}/"
Contributor:

Is this used anywhere?

Contributor Author:

It is an unnecessary copy-paste leftover. Deleted.

@TobKed (Contributor Author) commented Aug 5, 2020:

@tvalentyn answering your question:

Also, all actions workflow run in parallel, right? so it takes roughly the same time as running a regular precommit? In such case we could consider running it on PR.

Building the SDK source dist is used as a prerequisite for the python_wordcount_dataflow job. Building the source distribution is already validated for PRs in the .github/workflows/build_wheels.yml workflow.

@TobKed TobKed force-pushed the github-actions-test-suite-python branch from 0d41dd7 to b7883ff on August 5, 2020 15:30
@TobKed (Contributor Author) commented Aug 5, 2020:

@tvalentyn thanks for review.

  • Is there a way how an author or a reviewer can rerun the action to retry the test or the only way is to re-push the PR?

There is an event type called workflow_dispatch which enables a button to manually trigger the workflow; I just added it.
https://github.blog/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/
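A manual trigger with an input toggle can be declared roughly like this (an illustrative config fragment; the input name here is hypothetical, not necessarily the one used in the workflow):

```yaml
on:
  workflow_dispatch:
    inputs:
      runDataflowTests:
        description: 'Run Dataflow tests (true/false)'
        required: false
        default: 'false'
```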

  • When we run unit tests, we create xml files with test results (see generated xml file: D:\a\beam\beam\sdks\python\pytest_py36-win.xml entries in console logs). It would be great if we could somehow visualize that to see failing tests/messages easily without having to dig through console logs. Are there any options to do that? This could arrive in future PRs.

I am not sure about visualisation; I have to investigate it in more depth. However, I added uploads of the xml files as artifacts.

@tvalentyn (Contributor):

@tvalentyn answering your question:

Also, all actions workflow run in parallel, right? so it takes roughly the same time as running a regular precommit? In such case we could consider running it on PR.

Building sdk source dist is used as prerequisite for python_wordcount_dataflow job . Building source distribution is already validated for PR in .github/workflows/build_wheels.yml workflow.

Not sure I follow. I meant to say: if GitHub Action test suites take a comparable amount of time to Jenkins precommit suites, we can continue running GitHub Action test suites on each PR, assuming the Actions quota is not a concern. I wanted to know whether different suites from the matrix (Py3.5 on Windows, Py3.6 on MacOS) run in parallel or not.

@TobKed (Contributor Author) commented Aug 10, 2020:

@tvalentyn answering your question:

Also, all actions workflow run in parallel, right? so it takes roughly the same time as running a regular precommit? In such case we could consider running it on PR.

Building sdk source dist is used as prerequisite for python_wordcount_dataflow job . Building source distribution is already validated for PR in .github/workflows/build_wheels.yml workflow.

Not sure I follow. I meant to say: if Github Action test suites run comparative amount of time as Jenkins precommit suites, we can continue running Github Action test suites on each PR, assuming Actions Quota is not a concern. I wanted to know whether different suites from the matrix (Py3.5 on Windows, Py3.6 on MacOS) run in parallel or not.

Yes, they run in parallel. That is not always the case, since parallel jobs are limited to a certain number, but AFAIK Apache has an Enterprise plan, so this number is quite big (github actions docs - usage limits).

@TobKed TobKed force-pushed the github-actions-test-suite-python branch from a600b5c to 8ead399 on August 10, 2020 18:50
@TobKed (Contributor Author) commented Aug 10, 2020:

I rebased on the latest master, added documentation and badges, and applied the TemporaryDirectory context manager for Python 2.7 backward test compatibility.

@TobKed TobKed requested review from aaltay and tvalentyn August 10, 2020 19:40
CI.md Outdated
| Job | Description | Pull Request Run | Direct Push/Merge Run | Scheduled Run | Requires GCP Credentials |
|-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|-----------------------|---------------|--------------------------|
| Check are GCP variables set | Checks are GCP variables are set. Jobs which required them depends on the output of this job. | Yes | Yes | Yes | Yes/No |
Contributor:

Suggestion: Consider using Checks that GCP variables are set or Check GCP variables throughout.

Contributor Author:

I used Check GCP variables and also changed it in the workflow itself. It looks much cleaner and simpler.

CI.md Outdated
* Storage Object Admin (roles/storage.objectAdmin)
| Job | Description | Pull Request Run | Direct Push/Merge Run | Scheduled Run | Requires GCP Credentials |
|----------------------------------|-----------------------------------------------------------------------------------------------------------------------|------------------|-----------------------|---------------|--------------------------|
| Check are GCP variables set | Checks are GCP variables are set. Jobs which required them depends on the output of this job. | Yes | Yes | Yes | Yes/No |
Contributor:

nit: depend

Contributor Author:

Fixed.

CI.md Outdated
Some of the jobs require variables stored as a [GitHub Secrets](https://docs.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets)
to perform operations on Google Cloud Platform. Currently these jobs are limited to Apache repository only.
These variables are:
* `GCP_PROJECT_ID` - ID of the Google Cloud project. Apache/Beam repository has it set to `apache-beam-testing`.
Contributor:

For my education, which repository do we refer to here? Also, I would not hard-code existing values in documentation, since the source of truth may be updated while the documentation may not. You could say For example, apache-beam-testing instead.

Contributor Author:

I agree. I changed it to refer to the values as examples.

CI.md Outdated
@@ -75,8 +75,28 @@ run categories. Here is a summary of the run categories with regards of the jobs
Those jobs often have matrix run strategy which runs several different variations of the jobs
(with different platform type / Python version to run for example)

### Google Cloud Platform Credentials

Some of the jobs require variables stored as a [GitHub Secrets](https://docs.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets)
Contributor:

nit: as [GitHub Secrets]

Contributor Author:

Fixed.

@@ -334,3 +335,19 @@ def open_shards(glob_pattern, mode='rt', encoding='utf-8'):
out_file.write(in_file.read())
concatenated_file_name = out_file.name
return io.open(concatenated_file_name, mode, encoding=encoding)


class TemporaryDirectory(object):
Contributor:

Thanks. I forgot we are still on Py2, but very soon we can start removing Py2 support.

We actually defined this at least twice already.

Feel free to reuse one of them and add a comment # TODO(BEAM-8371): Use tempfile.TemporaryDirectory.

Contributor Author:

Thank you, I was not aware of them. I used the first one, since it returns the path on enter.

@tvalentyn (Contributor) commented Aug 11, 2020:

Thanks, @TobKed, I don't have additional comments. Once all reviewers whose feedback you expect give their LGTM, please squash fixup commits and leave only the commits that you want added to the commit history.

@TobKed TobKed force-pushed the github-actions-test-suite-python branch 3 times, most recently from 6d56be7 to e749602 on August 11, 2020 22:58
@TobKed (Contributor Author) commented Aug 11, 2020:

Thanks, @TobKed , I don't have additional comments. Once all reviewers, whose feedback you expect, give their LGTM, please squash fixup commits, and leave only the commits that you want to be added to the commit history.

I found that I had missed using TemporaryDirectory() in sdks/python/apache_beam/typehints/typecheck_test_py3.py; I fixed it.
I also added an input to the manual trigger, so the Dataflow tests have to be explicitly enabled (they are disabled by default).

Thanks for the reviews, @tvalentyn, and all people involved in this PR. I will create a PR to Apache Infra with a request to set up the proper secrets.

@TobKed (Contributor Author) commented Aug 12, 2020:

Run Python PreCommit

@TobKed (Contributor Author) commented Aug 12, 2020:

Run Python2_PVR_Flink PreCommit

@TobKed (Contributor Author) commented Aug 12, 2020:

Run Python PreCommit

@TobKed (Contributor Author) commented Aug 12, 2020:

Created a single PR to Infra with all required changes for the GitHub secrets: https://issues.apache.org/jira/browse/INFRA-20675

@tvalentyn (Contributor):

I also added input to manual trigger so dataflow tests have to be explicitly enabled (made them disabled by default).
Are Dataflow tests enabled in automated execution or not?

@TobKed (Contributor Author) commented Aug 12, 2020:

I also added input to manual trigger so dataflow tests have to be explicitly enabled (made them disabled by default).
Are Dataflow tests enabled in automated execution or not?

Dataflow tests for automated execution are enabled as they were in previous commits (events: push/merge, scheduled). The behavior changed for the manual trigger: when triggered via the manual button, one may choose whether the Dataflow tests shall run or not.

@TobKed TobKed force-pushed the github-actions-test-suite-python branch from e749602 to ad8f04c on August 17, 2020 18:44
@TobKed (Contributor Author) commented Aug 17, 2020:

I rebased on the latest master and solved conflicts.

@tvalentyn (Contributor):

Thanks, looks like 'Upload test logs' is flaky:

 Upload test logs1s
    pythonLocation: C:\hostedtoolcache\windows\Python\3.5.4\x64
Run actions/upload-artifact@v2
  with:
    name: pytest-windows-latest-3.5
    path: sdks/python/pytest**.xml
    if-no-files-found: warn
  env:
    pythonLocation: C:\hostedtoolcache\windows\Python\3.5.4\x64
With the provided path, there will be 2 file(s) uploaded
##[error]read ECONNRESET
Exponential backoff for retry #1. Waiting for 5006.55256840354 milliseconds before continuing the upload at offset 0
##[error]Unexpected response. Unable to upload chunk to https://pipelines.actions.githubusercontent.com/COKSSHEuHyOLM1pP93G8aBnLpu5tMJfYiveSzILlm0cIfDvSEl/_apis/resources/Containers/4328416?itemPath=pytest-ubuntu-latest-3.8%2Fpytest_py38_no_xdist.xml
##### Begin Diagnostic HTTP information #####
Status Code: 404
Status Message: Not Found
Header Information: {
  "content-length": "29",
  "content-type": "text/plain",
  "strict-transport-security": "max-age=2592000",
  "x-tfs-processid": "98f88b46-bb0d-4ece-af51-fbf9accea451",
  "x-tfs-serviceerror": "The+resource+cannot+be+found.",
  "x-msedge-ref": "Ref A: E85C8E28FC70417687E1D2AAE3FBDD17 Ref B: BN3EDGE0515 Ref C: 2020-08-17T18:55:44Z",
  "date": "Mon, 17 Aug 2020 18:55:53 GMT"
}
###### End Diagnostic HTTP information ######
##[warning]Aborting upload for /home/runner/work/beam/beam/sdks/python/pytest_py38_no_xdist.xml due to failure
  Post Checkout code

Is there a way we can configure additional retries there?

Also, how will a PR author re-run the actions tests on their PR? Is it possible to rerun only failed tests? (I remember I asked this question before, but I forgot what the answer was.)

@tvalentyn (Contributor):

Also, how a PR author will re-run the actions tests on their PR? Is it possible to rerun only failed tests? (I remember I asked this question before, but I forgot what the answer was)

Found your reply above and also found a Rerun all jobs button, which I clicked on one of the failed jobs.

@TobKed TobKed force-pushed the github-actions-test-suite-python branch from ad8f04c to 7b19dc5 on August 18, 2020 15:11
@TobKed (Contributor Author) commented Aug 18, 2020:

I rebased on the latest master and solved conflicts in CI.md.

@tvalentyn regarding the flaky upload: I tested it many times and it never happened before. IMHO it was a temporary problem with GitHub Actions itself.
Related issue actions/upload-artifact#116

@TobKed (Contributor Author) commented Aug 18, 2020:

@tvalentyn @aaltay I see that https://issues.apache.org/jira/browse/INFRA-20675 is done and just awaits confirmation.

@tvalentyn (Contributor):

Ack, thanks. Is this ready to merge?

@TobKed (Contributor Author) commented Aug 18, 2020:

@tvalentyn yes

@tvalentyn tvalentyn merged commit c99390a into apache:master Aug 18, 2020
@TobKed (Contributor Author) commented Aug 19, 2020:

Thank you!
