Sklearn Mnist example and IT test #21781

AnandInguva · 2022-06-09T22:11:37Z

Adds a scikit-learn example that runs on MNIST data, plus an integration test (IT) that runs the example and asserts its output on a subset of the MNIST data.
Also adds a pytest marker that collects all the Inference IT tests and runs them in the PostCommit suite on the Direct Runner.
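
To illustrate the kind of output assertion such an IT can make, here is a minimal sketch. It assumes the pipeline writes one prediction per line and that line order is not guaranteed, so both sides are normalized and sorted before comparison. The helper names and the `label,prediction` line format are hypothetical and are not taken from this PR.

```python
from typing import Iterable, List


def normalize(lines: Iterable[str]) -> List[str]:
  """Strip whitespace, drop blank lines, and sort for order-independent comparison."""
  return sorted(line.strip() for line in lines if line.strip())


def predictions_match(
    actual_lines: Iterable[str], expected_lines: Iterable[str]) -> bool:
  """Return True when both outputs contain the same prediction lines."""
  return normalize(actual_lines) == normalize(expected_lines)


# Hypothetical "label,prediction" lines; the real output format may differ.
actual = ['7,7\n', '1,1\n', '\n']
expected = ['1,1', '7,7']
print(predictions_match(actual, expected))
```

Sorting before comparing keeps the check stable even if the runner writes shards in a different order between runs.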

AnandInguva · 2022-06-09T22:12:26Z

Run Python 3.8 PostCommit

asf-ci · 2022-06-09T22:21:14Z

Can one of the admins verify this patch?

AnandInguva · 2022-06-09T22:27:33Z

Run Python 3.9 PostCommit

sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py

tvalentyn · 2022-06-15T12:44:06Z

sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py

+  @pytest.mark.it_postcommit
+  def test_predictions_output_file(self):
+    test_pipeline = TestPipeline(is_integration_test=True)
+    input_file = 'gs://apache-beam-ml/testing/inputs/it_mnist_data.csv'


If this file has to be downloaded separately, we should mention the necessary instructions. You probably want to add a section in https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/README.md.

Shouldn't this go under apache-beam-ml/datasets/?

This would be an internal test. I have also sickbayed it for now. There is a PR in progress, #21887, on how to run the sample.

sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py

yeandy

Do you want me to help with writing the README?

yeandy · 2022-06-15T13:17:53Z

sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py

+            Tuple[int, PredictionResult])
+        | "PostProcessor" >> beam.ParDo(PostProcessor()))
+
+    if known_args.output:


Make output required. Remove if statement.

yeandy · 2022-06-15T13:19:27Z

sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py

+  @pytest.mark.it_postcommit
+  def test_predictions_output_file(self):
+    test_pipeline = TestPipeline(is_integration_test=True)
+    input_file = 'gs://apache-beam-ml/testing/inputs/it_mnist_data.csv'


Shouldn't this go under apache-beam-ml/datasets/?

sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py

yeandy · 2022-06-15T13:20:00Z

sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py

+      dest='input',
+      help='CSV file with row containing label and pixel values.')
+  parser.add_argument(
+      '--output', dest='output', help='Path to save output predictions.')


Add required=True
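
A minimal sketch of what this suggestion amounts to, using only the standard-library `argparse` module. The `--output` flag and both help strings come from the diff above; marking `--input` as required and the function name `parse_known_args` are assumptions made for illustration.

```python
import argparse


def parse_known_args(argv):
  """Parse the example's own flags and return the rest as pipeline args."""
  parser = argparse.ArgumentParser()
  parser.add_argument(
      '--input',
      dest='input',
      required=True,  # Assumption: the input path is mandatory as well.
      help='CSV file with row containing label and pixel values.')
  parser.add_argument(
      '--output',
      dest='output',
      required=True,  # The change requested in this review comment.
      help='Path to save output predictions.')
  return parser.parse_known_args(argv)


known_args, pipeline_args = parse_known_args([
    '--input', 'data.csv', '--output', 'predictions.txt', '--runner',
    'DirectRunner'
])
print(known_args.output, pipeline_args)
```

With `required=True`, argparse exits with a usage error when `--output` is missing, so the pipeline code no longer needs an `if known_args.output:` guard before writing results.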

sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py

AnandInguva · 2022-06-15T15:13:01Z

Refactored the code to match the recent changes. Added the test but sickbayed it for now. Tracking issue to unskip the tests: #21859.

PTAL @ryanthompson591 @tvalentyn

codecov · 2022-06-15T15:53:53Z

Codecov Report

Merging #21781 (2c61681) into master (b1a313e) will decrease coverage by 0.00%.
The diff coverage is 47.50%.

❗ Current head 2c61681 differs from pull request most recent head fa42b67. Consider uploading reports for the commit fa42b67 to get more accurate results

@@            Coverage Diff             @@
##           master   #21781      +/-   ##
==========================================
- Coverage   74.01%   74.00%   -0.01%     
==========================================
  Files         699      700       +1     
  Lines       92675    92715      +40     
==========================================
+ Hits        68592    68614      +22     
- Misses      22828    22846      +18     
  Partials     1255     1255

| Flag | Coverage Δ | |
|---|---|---|
| python | `83.64% <47.50%> (-0.02%)` | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...examples/inference/sklearn_mnist_classification.py | `47.50% <47.50%> (ø)` | |
| sdks/python/apache_beam/io/localfilesystem.py | `90.97% <0.00%> (-0.76%)` | ⬇️ |
| ...eam/runners/portability/fn_api_runner/execution.py | `92.44% <0.00%> (-0.65%)` | ⬇️ |
| sdks/python/apache_beam/transforms/util.py | `96.06% <0.00%> (-0.16%)` | ⬇️ |
| ...examples/inference/pytorch_image_classification.py | `0.00% <0.00%> (ø)` | |
| sdks/python/apache_beam/runners/direct/executor.py | `97.01% <0.00%> (+0.54%)` | ⬆️ |
| ...hon/apache_beam/runners/direct/test_stream_impl.py | `94.02% <0.00%> (+0.74%)` | ⬆️ |
| ...che_beam/runners/interactive/interactive_runner.py | `91.39% <0.00%> (+1.32%)` | ⬆️ |
| ...python/apache_beam/runners/worker/worker_status.py | `79.71% <0.00%> (+1.44%)` | ⬆️ |
| .../python/apache_beam/testing/test_stream_service.py | `92.85% <0.00%> (+4.76%)` | ⬆️ |

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b1a313e...fa42b67. Read the comment docs.

AnandInguva · 2022-06-15T16:15:32Z

cc: @yeandy Added the gradle task

sdks/python/test-suites/direct/common.gradle

Co-authored-by: Andy Ye <andyye333@gmail.com>

AnandInguva · 2022-06-15T18:42:44Z

PTAL @tvalentyn
R @TheNeuralBit

AnandInguva · 2022-06-15T19:48:12Z

PTAL @yeandy @tvalentyn

sdks/python/pytest.ini

sdks/python/test-suites/direct/common.gradle

Co-authored-by: Andy Ye <andyye333@gmail.com>

yeandy · 2022-06-15T20:00:44Z

Can you also uncomment pytest.skip and confirm that gradlew :sdks:python:test-suites:direct:py37:inferencePostCommitIT runs successfully? @AnandInguva

sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py

AnandInguva · 2022-06-15T20:01:44Z

> Can you also uncomment pytest.skip and confirm that gradlew :sdks:python:test-suites:direct:py37:inferencePostCommitIT runs successfully? @AnandInguva

I checked and it runs. I was able to collect all the Inference IT tests
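
For readers unfamiliar with how a pytest marker lets a suite collect a subset of tests, here is a minimal sketch. The marker name `it_postcommit` is the one shown in the diff quoted earlier in this thread; a later commit changed the marker, and its final name is not shown here. The helper `run_inference_example` is a hypothetical stand-in for running the pipeline and reading its output.

```python
import pytest


def run_inference_example():
  """Hypothetical stand-in for running the pipeline and reading its output."""
  return ['1,1', '7,7']


@pytest.mark.it_postcommit
def test_predictions_output_file():
  # A real IT would run the example pipeline and compare its output file
  # against expected predictions.
  assert run_inference_example() == ['1,1', '7,7']
```

Running `pytest -m it_postcommit` selects only the tests carrying that marker. Custom markers are normally registered under `markers` in `pytest.ini` so that pytest does not warn about unknown marks.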

AnandInguva · 2022-06-15T21:35:18Z

@pabloem the test failure is unrelated to this change.

pabloem · 2022-06-15T21:35:21Z

lgtm thanks folks

* sklearn example and IT test
* Change the example name
* Refactor sklearn example
* Refactor and add assertions to the sklearn test
* Fixup import order
* fixup: help and name
* Add gradle task for sklearn IT tests
* fixup lint
* Update sdks/python/test-suites/direct/common.gradle Co-authored-by: Andy Ye <andyye333@gmail.com>
* Change sklearn IT test marker
* Uncomment
* Apply suggestions from code review Co-authored-by: Andy Ye <andyye333@gmail.com>

Co-authored-by: Andy Ye <andyye333@gmail.com>

github-actions bot added examples infra python labels Jun 9, 2022

tvalentyn reviewed Jun 15, 2022

yeandy reviewed Jun 15, 2022

AnandInguva added 2 commits June 15, 2022 09:29

sklearn example and IT test

29767cf

Change the example name

f7a36a6

AnandInguva force-pushed the sklearn-tests branch from 8dafc57 to fe5f829 Compare June 15, 2022 14:19

Refactor sklearn example

0be2a8b

AnandInguva force-pushed the sklearn-tests branch from fe5f829 to 0be2a8b Compare June 15, 2022 14:22

yeandy reviewed Jun 15, 2022

sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py

AnandInguva added 3 commits June 15, 2022 11:05

Refactor and add assertions to the sklearn test

e4cf3f3

Fixup import order

6b6d6b1

fixup: help and name

0375d74

Add gradle task for sklearn IT tests

1ccdbf3

github-actions bot removed the infra label Jun 15, 2022

fixup lint

c472974

yeandy reviewed Jun 15, 2022

sdks/python/test-suites/direct/common.gradle

sdks/python/test-suites/direct/common.gradle

Update sdks/python/test-suites/direct/common.gradle

2c61681

Co-authored-by: Andy Ye <andyye333@gmail.com>

AnandInguva mentioned this pull request Jun 15, 2022

Add README documentation for scikit-learn MNIST example #21887

Merged

4 tasks

AnandInguva added 2 commits June 15, 2022 15:45

Change sklearn IT test marker

3f527be

Uncomment

d70eee6

yeandy reviewed Jun 15, 2022

sdks/python/pytest.ini

sdks/python/test-suites/direct/common.gradle

yeandy approved these changes Jun 15, 2022

Apply suggestions from code review

fa42b67

Co-authored-by: Andy Ye <andyye333@gmail.com>

yeandy reviewed Jun 15, 2022

sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py

pabloem merged commit a06a13d into apache:master Jun 15, 2022
