-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-10623] Add workflow to run Beam python tests on Linux/Windows/Mac platforms #12452
[BEAM-10623] Add workflow to run Beam python tests on Linux/Windows/Mac platforms #12452
Conversation
example run executed on my fork (all jobs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very reasonable. Asked a few questions. I would also like @tvalentyn to review.
.github/workflows/python_tests.yml
Outdated
|
||
build_python_sdk_source: | ||
name: 'Build Python SDK Source' | ||
if: github.repository_owner == 'apache' && (github.event_name == 'push' || github.event_name == 'schedule') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the size of this run, maybe we can limit this to 'schedule' only. Do you know how long it takes to run this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current schedule is once a day as per line 22. How about we run it every 4 hours?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, all actions workflow run in parallel, right? so it takes roughly the same time as running a regular precommit? In such case we could consider running it on PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does github impose any limits on how many workflow could run in parallel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Full workflow run takes approximately 30min. check on PR ~15 minutes.
IMHO It is not short, but also not too long.
If you thinks it is still too long, maybe we could keep schedule and push to master/release branch events only? WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does github impose any limits on how many workflow could run in parallel?
Limits for concurrent jobs are between 20-180 depends on the plan (docs).
I heard that apache has Enterprise plan fot gh-actions but I am not sure about it. Do you have more information about it @gmcdonald ?
.github/workflows/python_tests.yml
Outdated
run: pip install wheel | ||
- name: Build source | ||
working-directory: ./sdks/python | ||
run: python setup.py sdist --formats=zip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does sdist require wheel? If sdist contains binaries, are they compatible with other python version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked it and sdist does not require wheel. Source distribution does not contain binaries and is not python/platform specific. I simplified both this job and passing source dist to dataflow job.
@@ -66,6 +66,7 @@ class InteractiveRunnerTest(unittest.TestCase): | |||
def setUp(self): | |||
ie.new_env() | |||
|
|||
@unittest.skipIf(sys.platform == "win32", "[BEAM-10627]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevingg - assigned this jira to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acked. It's a duplicate of BEAM-10514. The reason is the file system in test has a temp directory with a very long prefix, causing the file path exceeding windows default 260 character limit.
A normal windows temp directory prefix: c:\windows\temp\
,
In the test: D:\a\beam\beam\sdks\python\target\.tox\py35-win\tmp\
.
The length diff breaks the limit since usually Interactive Beam uses about 240 characters in path length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kevingg . Should I put BEAM-10514
here or keep BEAM-10627
?
@@ -284,6 +284,7 @@ def create_options(self): | |||
return options | |||
|
|||
|
|||
@pytest.mark.skipif(sys.platform == "win32", reason="[BEAM-10625]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tvalentyn - - assigned this jira to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a look and replied on the bug. It may be an issue in GRPC or the networking configuration of the Windows host that Github uses. I do not see this error in the test suite that we run internally (on Py2.7).
@TobKed do you have a recommendation how Beam Devs could recreate the same environment as used by GitHub actions for the purposes of debugging a test or run a single test via actions? It would be good to document this in our Actions users guide.
The goal is to minimize the feedback loop as much as possible.
Feel free to reply on https://issues.apache.org/jira/browse/BEAM-10625 to keep this conversation in one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I added comment about specific environment in the Jira. I will add/extend documentation since #12405 is merged.
@@ -284,6 +284,7 @@ def create_options(self): | |||
return options | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we using pytest here insted of unittest? @tvalentyn
@@ -174,6 +175,7 @@ def test_infer_typehints_schema(self, _, data, schema): | |||
@parameterized.expand([(d["name"], d["data"], d["pyarrow_schema"]) | |||
for d in TEST_DATA]) | |||
@unittest.skipIf(pa is None, "PyArrow is not installed") | |||
@unittest.skipIf(sys.platform == "win32", "[BEAM-10624]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TheNeuralBit - - assigned this jira to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack I'll take a look. It may just take a pyarrow upgrade, which will be unblocked once we drop python 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pyarrow upgrade can be decoupled from dropping py2, unless we need code changes in Beam:
Line 157 in 5ded4ae
('pyarrow>=0.15.1,<0.18.0; python_version >= "3.0" or ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it's not actually related to pyarrow version. I think we should fix it by making the dtype explicit for the numpy arrays created here:
("d", np.array([1, 2, 3])), |
I have a personal Windows machine I could clone Beam on and verify, but it might be simpler to just test the fix as part of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TheNeuralBit could take a look on 4004309 , please?
Is it the solution you mentioned?
I tested it on my local fork (run results here) and it seems to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep that's what I was suggesting!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @TheNeuralBit. I cherry-picked change to the PR.
fail-fast: false | ||
matrix: | ||
os: [ubuntu-latest, macos-latest, windows-latest] | ||
python: [3.5, 3.6, 3.7, 3.8] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to extract py versions into a constant on top of the file so that we have to modify them in one place when we add/remove a version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately it is not possible right now. At least not in convenient way, especially that in this workflow there are two two type of slightly different matrixes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @TobKed .
General comments/questions:
- Is there a way how an author or a reviewer can rerun the action to retry the test or the only way is to re-push the PR?
- When we run unit tests, we create xml files with test results (see
generated xml file: D:\a\beam\beam\sdks\python\pytest_py36-win.xml
entries in console logs). It would be great if we could somehow visualize that to see failing tests/messages easily without having to dig through console logs. Are there any options to do that? This could arrive in future PRs.
# https://github.com/protocolbuffers/protobuf/issues/7765 | ||
- os: windows-latest | ||
python: 3.8 | ||
steps: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do actions allow to extract a common subroutine and call it from other actions? For example following steps seem to be repeated several times, and perhaps could be defined in one place:
steps:
- name: Checkout code
uses: actions/checkout@v2
- name: Install python
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python }}
- name: Get build dependencies
working-directory: ./sdks/python
run: pip install -r build-requirements.txt
- name: Install requirements
working-directory: ./sdks/python
run: pip install setuptools --upgrade && pip install -e .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right know it is not possible, however GitHub Actions is evolving so fast it should available soon.
Related links about upcoming related feature:
actions/runner#438
actions/runner#554
@@ -296,8 +296,10 @@ def test_sink_transform_int96(self): | |||
path, self.SCHEMA96, num_shards=1, shard_name_template='') | |||
|
|||
def test_sink_transform(self): | |||
with tempfile.NamedTemporaryFile() as dst: | |||
path = dst.name | |||
fd, path = tempfile.mkstemp() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the context of these changes?
(note: it's would be better to have them in a separate commit and tag a JIRA in that commit).
Will #12150 (comment) work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was problem on windows since with tempfile.NamedTemporaryFile() as dst
provide tempfile
which is already opened. On windows any opened file seems to be read only which coused PermissionError. I know this workaround is ugly but it the best way I found to do it.
Is separate commit with JIRA will not be lost after merging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use tempfile.TemporaryDirectory()
to generate a fresh temp directory and use any name as a file name within this directory with the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated it with tempfile.TemporaryDirectory()
, looks much more better and intuitive now. Thanks a lot!
@@ -284,6 +284,7 @@ def create_options(self): | |||
return options | |||
|
|||
|
|||
@pytest.mark.skipif(sys.platform == "win32", reason="[BEAM-10625]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a look and replied on the bug. It may be an issue in GRPC or the networking configuration of the Windows host that Github uses. I do not see this error in the test suite that we run internally (on Py2.7).
@TobKed do you have a recommendation how Beam Devs could recreate the same environment as used by GitHub actions for the purposes of debugging a test or run a single test via actions? It would be good to document this in our Actions users guide.
The goal is to minimize the feedback loop as much as possible.
Feel free to reply on https://issues.apache.org/jira/browse/BEAM-10625 to keep this conversation in one place.
@@ -87,7 +87,8 @@ def _verify_fn_log_handler(self, num_log_entries): | |||
self.assertEqual( | |||
'%s: %s' % (msg, num_received_log_entries), log_entry.message) | |||
self.assertTrue( | |||
re.match(r'.*/log_handler_test.py:\d+', log_entry.log_location), | |||
re.match( | |||
r'.*(/|\\)log_handler_test.py:\d+', log_entry.log_location), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use os.sep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added fix. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used rf'.*{os.sep}log_handler_test.py:\d+'
but it caused:
....
raise source.error("bad escape %s" % escape, len(escape))
sre_constants.error: bad escape \l at position 2
https://github.com/apache/beam/pull/12452/checks?check_run_id=949595653
I reverted change.
Should I use it somehow differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use re.escape()
. Not a big deal since this is a test, feel free to leave as is or simplify to r'.*log_handler_test.py:\d+'
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had problems with re.escape()
as well (escaped \d
as well). For the most clean solution I just simplified it as you suggested it.
@@ -174,6 +175,7 @@ def test_infer_typehints_schema(self, _, data, schema): | |||
@parameterized.expand([(d["name"], d["data"], d["pyarrow_schema"]) | |||
for d in TEST_DATA]) | |||
@unittest.skipIf(pa is None, "PyArrow is not installed") | |||
@unittest.skipIf(sys.platform == "win32", "[BEAM-10624]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pyarrow upgrade can be decoupled from dropping py2, unless we need code changes in Beam:
Line 157 in 5ded4ae
('pyarrow>=0.15.1,<0.18.0; python_version >= "3.0" or ' |
|
||
fd, path = tempfile.mkstemp() | ||
try: | ||
os.close(fd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above - this approach is not intuitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT about adding comment in such places with JIRA ? (as you mentioned creating JIRA in #12452 (comment))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reference to JIRA in #12452 (comment) was that you could make test changes in a separate commit such as:
[BEAM-123456] Fix tests that don't pass on Windows
and have another commit:
[BEAM-10623] Add workflow to run Beam python tests on Linux/Windows/Mac platforms
.
In this case the reason behind not obvious code would be easier to infer from commit history.
.github/workflows/python_tests.yml
Outdated
|
||
build_python_sdk_source: | ||
name: 'Build Python SDK Source' | ||
if: github.repository_owner == 'apache' && (github.event_name == 'push' || github.event_name == 'schedule') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, all actions workflow run in parallel, right? so it takes roughly the same time as running a regular precommit? In such case we could consider running it on PR.
.github/workflows/python_tests.yml
Outdated
|
||
|
||
env: | ||
GCP_WHEELS_STAGING_PATH: "gs://${{ secrets.GCP_BUCKET }}/${GITHUB_REF##*/}/${GITHUB_SHA}-${GITHUB_RUN_ID}/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unnecessary copy-paste leftover. Deleted.
@tvalentyn answering your question:
Building sdk source dist is used as prerequisite for |
0d41dd7
to
b7883ff
Compare
@tvalentyn thanks for review.
There is an event type called
I am not sure about visualisation, I have to investigate it more depth, however I add uploads of xml files as artifacts. |
Not sure I follow. I meant to say: if Github Action test suites run comparative amount of time as Jenkins precommit suites, we can continue running Github Action test suites on each PR, assuming Actions Quota is not a concern. I wanted to know whether different suites from the matrix (Py3.5 on Windows, Py3.6 on MacOS) run in parallel or not. |
Yes, they run in parallel. It is not always the case since parallel jobs are limited to some amount, but AFAIK since apache has Enteprise plan so this number is a quite big (github actions docs - usage limits) |
a600b5c
to
8ead399
Compare
I rebased on the latest master, added documentation and badges, applied |
CI.md
Outdated
| Job | Description | Pull Request Run | Direct Push/Merge Run | Scheduled Run | Requires GCP Credentials | | ||
|-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|-----------------------|---------------|--------------------------| | ||
| Check are GCP variables set | Checks are GCP variables are set. Jobs which required them depends on the output of this job. | Yes | Yes | Yes | Yes/No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Consider using Checks that GCP variables are set
or Check GCP variables
throughout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used Check GCP variables
and also changed it in workflow itself. Looks much more cleaner and simpler.
CI.md
Outdated
* Storage Object Admin (roles/storage.objectAdmin) | ||
| Job | Description | Pull Request Run | Direct Push/Merge Run | Scheduled Run | Requires GCP Credentials | | ||
|----------------------------------|-----------------------------------------------------------------------------------------------------------------------|------------------|-----------------------|---------------|--------------------------| | ||
| Check are GCP variables set | Checks are GCP variables are set. Jobs which required them depends on the output of this job. | Yes | Yes | Yes | Yes/No | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: depend
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
CI.md
Outdated
Some of the jobs require variables stored as a [GitHub Secrets](https://docs.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets) | ||
to perform operations on Google Cloud Platform. Currently these jobs are limited to Apache repository only. | ||
These variables are: | ||
* `GCP_PROJECT_ID` - ID of the Google Cloud project. Apache/Beam repository has it set to `apache-beam-testing`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my education, which repository we refer to here? Also, I would not hard code existing values in documentation, since the source-of-truth may be updated, and documentation may not. You could say: For example apache-beam-testing
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I changed it referring values as examples.
CI.md
Outdated
@@ -75,8 +75,28 @@ run categories. Here is a summary of the run categories with regards of the jobs | |||
Those jobs often have matrix run strategy which runs several different variations of the jobs | |||
(with different platform type / Python version to run for example) | |||
|
|||
### Google Cloud Platform Credentials | |||
|
|||
Some of the jobs require variables stored as a [GitHub Secrets](https://docs.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: as [GitHub Secrets]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -334,3 +335,19 @@ def open_shards(glob_pattern, mode='rt', encoding='utf-8'): | |||
out_file.write(in_file.read()) | |||
concatenated_file_name = out_file.name | |||
return io.open(concatenated_file_name, mode, encoding=encoding) | |||
|
|||
|
|||
class TemporaryDirectory(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I forgot we are still on Py2, but very soon we can start removing Py2 support.
We actually defined this at least twice already.
class TemporaryDirectory: |
class TempDir(object): |
Feel free to reuse one of them and add a comment # TODO(BEAM-8371): Use tempfile.TemporaryDirectory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I was not aware of them. I used this first one since it returns path on enter.
Thanks, @TobKed , I don't have additional comments. Once all reviewers, whose feedback you expect, give their LGTM, please squash fixup commits, and leave only the commits that you want to be added to the commit history. |
6d56be7
to
e749602
Compare
I found I missed to use Thanks for reviews @tvalentyn and all people involved in this PR. I will create PR to Apache Infra with request for setting up proper secrets. |
Run Python PreCommit |
Run Python2_PVR_Flink PreCommit |
Run Python PreCommit |
Created single PR to infra with all required changes for github secrets: https://issues.apache.org/jira/browse/INFRA-20675 |
|
Dataflow tests for automated execution are enabled as they were in previous commits (events: push/merge, scheduled). Behavior changed for manual trigger. When triggered by manual button they may be able do choose are dataflow tests shall run or not. |
e749602
to
ad8f04c
Compare
I rebased on the latest master and solved conflicts. |
Thanks, looks like 'Upload test logs' is flaky:
Is there a way we can configure additional retries there? Also, how a PR author will re-run the actions tests on their PR? Is it possible to rerun only failed tests? (I remember I asked this question before, but I forgot what the answer was) |
Found your reply above and also found a |
ad8f04c
to
7b19dc5
Compare
I rebased on the latest master and solved conflicts in CI.md. @tvalentyn regarding flaky upload I tested it so many times and it never happened before. IMHO it was a temporary problem with GitHub Actions itself. |
@tvalentyn @aaltay I see that https://issues.apache.org/jira/browse/INFRA-20675 is done and just waits for confirmation. |
Ack, thanks. Is this ready to merge? |
@tvalentyn yes |
Thank you! |
Workflow run with python tests
Before merging
Before merging it is required to setup related secrets:
GCP_TESTING_BUCKET
- beam-github-actions-testsGCP_PROJECT_ID
- apache-beam-testingGCP_REGION
- us-central1Before merging it Service Account should have elevated permissions:
roles/storage.admin
)roles/dataflow.admin
)it was requested in https://issues.apache.org/jira/browse/INFRA-20675
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).[BEAM-XXX] Fixes bug in ApproximateQuantiles
, where you replaceBEAM-XXX
with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
Post-Commit Tests Status (on master branch)
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.