Add guideline for including a sample in sample test #2026

Merged
27 changes: 27 additions & 0 deletions samples/README.md
@@ -33,3 +33,30 @@ Useful parameter values:
## Components source

All samples use pre-built components. The command to run for each container is built into the pipeline file.

## Sample conventions
To improve readability and to work well with the sample test, samples are encouraged to adopt the following conventions.

* The sample should be either a `*.py` or a `*.ipynb` file, and its file name should match its directory name.
* For a `*.py` sample, it's recommended to have a main entry point that invokes `kfp.compiler.Compiler().compile()` to compile the
pipeline function into a pipeline YAML spec.
* For a `*.ipynb` sample, parameters (e.g., experiment name and project name) should be defined in a dedicated cell
tagged `parameters`. See the [papermill documentation](https://github.com/nteract/papermill) for details. Also, custom environment setup
should be done within the notebook itself via `!pip install`.
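
The naming and tagging conventions above can be checked mechanically. The following is a minimal sketch using only the Python standard library; the helper names `name_matches_dir` and `has_parameters_cell` and the in-memory `EXAMPLE_NOTEBOOK` are hypothetical and are not part of KFP or its sample test. It assumes the standard Jupyter notebook JSON layout, where cell tags live under `metadata.tags`, and the `parameters` tag that papermill looks for.

```python
import json
from pathlib import Path


def name_matches_dir(sample_path: str) -> bool:
    """Return True if the sample is *.py or *.ipynb and its file name
    (without extension) equals the name of its parent directory."""
    path = Path(sample_path)
    return path.suffix in {".py", ".ipynb"} and path.stem == path.parent.name


def has_parameters_cell(notebook_json: str) -> bool:
    """Return True if any cell in the notebook carries the `parameters` tag."""
    notebook = json.loads(notebook_json)
    return any(
        "parameters" in cell.get("metadata", {}).get("tags", [])
        for cell in notebook.get("cells", [])
    )


# A tiny in-memory notebook, used only for illustration (hypothetical content).
EXAMPLE_NOTEBOOK = json.dumps({
    "cells": [
        {"cell_type": "code",
         "metadata": {"tags": ["parameters"]},
         "source": ["experiment_name = 'demo'\n"]},
        {"cell_type": "code",
         "metadata": {},
         "source": ["!pip install kfp\n"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 2,
})
```

For example, `name_matches_dir("samples/core/my_sample/my_sample.py")` holds because the file stem and the directory name are both `my_sample`, while a file named `other.py` in the same directory would not pass.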


## (Optional) Add sample test
Samples that cover critical functions of KFP should preferably be picked up by KFP's sample test
so that they are not broken by an accidental PR merge. Contributors can do that by following these steps.

* Place the sample under the core sample directory `kubeflow/pipelines/samples/core`
* Make sure it follows the [sample conventions](#sample-conventions).
* If the test run requires specific values for pipeline parameters, they can be specified in a config YAML file
placed under `test/sample-test/configs`. See
[`tfx_cab_classification.config.yaml`](https://github.com/kubeflow/pipelines/blob/master/test/sample-test/configs/tfx_cab_classification.config.yaml) for an example. The config YAML file is validated
against `schema.config.yaml`. If no config YAML file is provided, pipeline parameters take their
default values.
* Finally, add your test name (matching the file name and directory name) to
[`test/sample_test.yaml`](https://github.com/kubeflow/pipelines/blob/ecd93a50564652553260f8008c9a2d75ab907971/test/sample_test.yaml#L69).
* (Optional) The current sample test infrastructure only checks that the run succeeds, without validating any results or outcomes.
If such validation is needed, runtime checks should be included in the sample itself.
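
Because the sample test only observes whether the run succeeds, one way to add outcome validation is to have the sample raise an error when a result is unacceptable, so the run fails. The following is a minimal sketch of that idea, not part of the KFP sample test; the function names, the `metrics` dictionary, and the thresholds are all hypothetical.

```python
def check_metric(name: str, value: float, minimum: float) -> None:
    """Raise ValueError if a metric falls below its required minimum.

    An uncaught exception makes the enclosing step (and so the run) fail,
    which is the only signal the sample test currently looks at.
    """
    if value < minimum:
        raise ValueError(f"{name}={value} is below the required minimum {minimum}")


def validate_outputs(metrics: dict, minimums: dict) -> None:
    """Check every expected metric; raise if one is missing or too low."""
    for name, minimum in minimums.items():
        if name not in metrics:
            raise KeyError(f"expected metric {name!r} was not produced")
        check_metric(name, metrics[name], minimum)
```

A sample could call `validate_outputs({"accuracy": 0.93}, {"accuracy": 0.9})` at the end of its final step; the call returns normally when every check passes and raises otherwise.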