Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sklearn Mnist example and IT test #21781

Merged
merged 12 commits into from
Jun 15, 2022
Merged

Conversation

AnandInguva
Copy link
Contributor

@AnandInguva AnandInguva commented Jun 9, 2022

Scikit learn example that runs on MNIST data. An IT test that runs and asserts the output on subset of MNIST data.
Also, a pytest marker that collects all the Inference IT tests and run them in them PostCommit suite on Direct Runner.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests

See CI.md for more information about GitHub Actions CI.

@AnandInguva
Copy link
Contributor Author

Run Python 3.8 PostCommit

@asf-ci
Copy link

asf-ci commented Jun 9, 2022

Can one of the admins verify this patch?

4 similar comments
@asf-ci
Copy link

asf-ci commented Jun 9, 2022

Can one of the admins verify this patch?

@asf-ci
Copy link

asf-ci commented Jun 9, 2022

Can one of the admins verify this patch?

@asf-ci
Copy link

asf-ci commented Jun 9, 2022

Can one of the admins verify this patch?

@asf-ci
Copy link

asf-ci commented Jun 9, 2022

Can one of the admins verify this patch?

@AnandInguva
Copy link
Contributor Author

Run Python 3.9 PostCommit

@pytest.mark.it_postcommit
def test_predictions_output_file(self):
test_pipeline = TestPipeline(is_integration_test=True)
input_file = 'gs://apache-beam-ml/testing/inputs/it_mnist_data.csv'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this will have to be downloaded separately, we should mention necessary instructions. You probably want to add a section in https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/README.md.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this go under apache-beam-ml/datasets/?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test would be internal test. Also sickbayed it for now. There are is PR in work #21887 on how to run the sample

Copy link
Contributor

@yeandy yeandy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want me to help with writing the README?

Tuple[int, PredictionResult])
| "PostProcessor" >> beam.ParDo(PostProcessor()))

if known_args.output:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make output required. Remove if statement.

@pytest.mark.it_postcommit
def test_predictions_output_file(self):
test_pipeline = TestPipeline(is_integration_test=True)
input_file = 'gs://apache-beam-ml/testing/inputs/it_mnist_data.csv'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this go under apache-beam-ml/datasets/?

dest='input',
help='CSV file with row containing label and pixel values.')
parser.add_argument(
'--output', dest='output', help='Path to save output predictions.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add required=True

@AnandInguva
Copy link
Contributor Author

refactored the code with recent changes. Added the test but sickbayed it for now. Adding the issue here: #21859 to unskip the tests.

PTAL @ryanthompson591 @tvalentyn

@codecov
Copy link

codecov bot commented Jun 15, 2022

Codecov Report

Merging #21781 (2c61681) into master (b1a313e) will decrease coverage by 0.00%.
The diff coverage is 47.50%.

❗ Current head 2c61681 differs from pull request most recent head fa42b67. Consider uploading reports for the commit fa42b67 to get more accurate results

@@            Coverage Diff             @@
##           master   #21781      +/-   ##
==========================================
- Coverage   74.01%   74.00%   -0.01%     
==========================================
  Files         699      700       +1     
  Lines       92675    92715      +40     
==========================================
+ Hits        68592    68614      +22     
- Misses      22828    22846      +18     
  Partials     1255     1255              
Flag Coverage Δ
python 83.64% <47.50%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...examples/inference/sklearn_mnist_classification.py 47.50% <47.50%> (ø)
sdks/python/apache_beam/io/localfilesystem.py 90.97% <0.00%> (-0.76%) ⬇️
...eam/runners/portability/fn_api_runner/execution.py 92.44% <0.00%> (-0.65%) ⬇️
sdks/python/apache_beam/transforms/util.py 96.06% <0.00%> (-0.16%) ⬇️
...examples/inference/pytorch_image_classification.py 0.00% <0.00%> (ø)
sdks/python/apache_beam/runners/direct/executor.py 97.01% <0.00%> (+0.54%) ⬆️
...hon/apache_beam/runners/direct/test_stream_impl.py 94.02% <0.00%> (+0.74%) ⬆️
...che_beam/runners/interactive/interactive_runner.py 91.39% <0.00%> (+1.32%) ⬆️
...python/apache_beam/runners/worker/worker_status.py 79.71% <0.00%> (+1.44%) ⬆️
.../python/apache_beam/testing/test_stream_service.py 92.85% <0.00%> (+4.76%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b1a313e...fa42b67. Read the comment docs.

@AnandInguva
Copy link
Contributor Author

cc: @yeandy Added the gradle task

@github-actions github-actions bot removed the infra label Jun 15, 2022
Co-authored-by: Andy Ye <andyye333@gmail.com>
@AnandInguva
Copy link
Contributor Author

AnandInguva commented Jun 15, 2022

PTAL @tvalentyn
R @TheNeuralBit

@AnandInguva
Copy link
Contributor Author

PTAL @yeandy @tvalentyn

Co-authored-by: Andy Ye <andyye333@gmail.com>
@yeandy
Copy link
Contributor

yeandy commented Jun 15, 2022

Can you also uncomment pytest.skip and confirm that gradlew :sdks:python:test-suites:direct:py37:inferencePostCommitIT runs successfully? @AnandInguva

@AnandInguva
Copy link
Contributor Author

Can you also uncomment pytest.skip and confirm that gradlew :sdks:python:test-suites:direct:py37:inferencePostCommitIT runs successfully? @AnandInguva

I checked and it runs. I was able to collect all the Inference IT tests

@AnandInguva
Copy link
Contributor Author

@pabloem test failure unrelated to the change

@pabloem
Copy link
Member

pabloem commented Jun 15, 2022

lgtm thanks folks

@pabloem pabloem merged commit a06a13d into apache:master Jun 15, 2022
bullet03 pushed a commit to akvelon/beam that referenced this pull request Jun 20, 2022
* sklearn example and IT test

* Change the example name

* Refactor sklearn example

* Refactor and add assertions to the sklearn test

* Fixup import order

* fixup: help and name

* Add gradle task for sklearn IT tests

* fixup lint

* Update sdks/python/test-suites/direct/common.gradle

Co-authored-by: Andy Ye <andyye333@gmail.com>

* Change sklearn IT test marker

* Uncomment

* Apply suggestions from code review

Co-authored-by: Andy Ye <andyye333@gmail.com>

Co-authored-by: Andy Ye <andyye333@gmail.com>
prodriguezdefino pushed a commit to prodriguezdefino/beam-pabs that referenced this pull request Jun 21, 2022
* sklearn example and IT test

* Change the example name

* Refactor sklearn example

* Refactor and add assertions to the sklearn test

* Fixup import order

* fixup: help and name

* Add gradle task for sklearn IT tests

* fixup lint

* Update sdks/python/test-suites/direct/common.gradle

Co-authored-by: Andy Ye <andyye333@gmail.com>

* Change sklearn IT test marker

* Uncomment

* Apply suggestions from code review

Co-authored-by: Andy Ye <andyye333@gmail.com>

Co-authored-by: Andy Ye <andyye333@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants