Clean up 'out of date' banner for few more KFP pages #2174

Merged
19 commits merged on Sep 12, 2020
4 changes: 0 additions & 4 deletions content/en/docs/pipelines/sdk/gcp.md
@@ -4,10 +4,6 @@ description = "SDK features that are available on Google Cloud Platform (GCP) on
weight = 130

+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}

For pipeline features that are specific to GCP, including SDK features, see the
[GCP section of the docs](/docs/gke/pipelines/).
10 changes: 5 additions & 5 deletions content/en/docs/pipelines/sdk/install-sdk.md
@@ -4,10 +4,6 @@ description = "Setting up your Kubeflow Pipelines development environment"
weight = 20

+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}

This guide tells you how to install the
[Kubeflow Pipelines SDK](https://github.com/kubeflow/pipelines/tree/master/sdk)
@@ -80,16 +76,20 @@ Run the following command to install the Kubeflow Pipelines SDK:
```bash
pip3 install kfp --upgrade
```

**Note:** If you are not using a virtual environment, such as `conda`, when installing the Kubeflow Pipelines SDK, you may receive the following error:

```bash
ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python3.5/dist-packages/kfp-<version>.dist-info'
Consider using the `--user` option or check the permissions.
```

If you get this error, install `kfp` with the `--user` option:

```bash
pip3 install kfp --upgrade --user
```

This command installs the `dsl-compile` and `kfp` binaries under `~/.local/bin`, which is not part of the PATH in some Linux distributions, such as Ubuntu. You can add `~/.local/bin` to your PATH by appending the following line to the end of your `.bashrc` file:

```bash
export PATH=$PATH:~/.local/bin
```
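
After opening a new terminal (or re-sourcing `.bashrc`), a quick sanity check is to confirm that the SDK imports and prints its version. This is a minimal sketch, not part of the original guide:

```python
# Minimal sanity check (not from the guide): confirm the Kubeflow Pipelines
# SDK is importable and report its version.
import kfp

print(kfp.__version__)
```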
12 changes: 5 additions & 7 deletions content/en/docs/pipelines/sdk/parameters.md
@@ -4,10 +4,6 @@ description = "Passing data between pipeline components"
weight = 70

+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}

The [`kfp.dsl.PipelineParam`
class](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.PipelineParam)
@@ -23,7 +19,7 @@ The task output references can again be passed to other components as arguments.

In most cases you do not need to construct `PipelineParam` objects manually.

The following code sample shows how to define a pipeline with parameters:

```python
# The decorator arguments and the first two parameters are collapsed in the
# diff view; representative values are shown here.
@kfp.dsl.pipeline(
    name='My pipeline',
    description='My machine learning pipeline'
)
def my_pipeline(
    my_num: int = 1000,
    my_name: str = 'some text',
    my_url: str = 'http://example.com'
):
    # In the pipeline function body you can use the `my_num`, `my_name`,
    # `my_url` arguments as PipelineParam objects.
    ...
```
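
Task output references can likewise be passed to other components as arguments. As a rough sketch of that pattern (the `add` lightweight component below is a hypothetical illustration, not part of this page, and assumes the KFP v1 SDK):

```python
import kfp
from kfp.components import create_component_from_func


@create_component_from_func
def add(a: float, b: float) -> float:
    """Hypothetical lightweight component that returns the sum of two numbers."""
    return a + b


@kfp.dsl.pipeline(name='Chained pipeline')
def chained_pipeline(x: float = 1.0, y: float = 2.0):
    # Inside the pipeline function, `x` and `y` behave as PipelineParam objects.
    first_task = add(x, y)
    # `first_task.output` is a task output reference; passing it to another
    # component instance wires the two steps together.
    add(first_task.output, 10.0)
```

Here `first_task.output` resolves to a `PipelineParam` that the compiler wires into the second task's input.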

For more information, you can refer to the guide on
[building components and pipelines](/docs/pipelines/sdk/build-component/#create-a-python-class-for-your-component).
119 changes: 76 additions & 43 deletions content/en/docs/pipelines/sdk/static-type-checking.md
@@ -4,10 +4,6 @@ description = "Statically check the component I/O types"
weight = 100

+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}

This page describes how to integrate type information into your pipeline and use
static type checking for fast development iterations.
@@ -37,6 +33,7 @@ In the component YAML, types are specified as a string or a dictionary with the
"*component a*" expects an input with Integer type and emits three outputs with the type GCSPath, customized_type and GCRPath.
Among these types, Integer, GCSPath, and GCRPath are core types that are predefined in the SDK while customized_type is a user-defined
type.

```yaml
name: component a
description: component desc
# ... (lines collapsed in the diff view: @@ -58,55 +55,80 @@ implementation:) ...
field_n: /feature.txt
field_o: /output.txt
```

Similarly, when you write a component with the decorator, you can annotate I/O with types in the function signature, as shown below.

```python
from kfp.dsl import ContainerOp, component
from kfp.dsl.types import Integer, GCRPath


@component
def task_factory_a(field_l: Integer()) -> {
        'field_m': {
            'GCSPath': {
                'openapi_schema_validator':
                    '{"type": "string", "pattern": "^gs://.*$"}'
            }
        },
        'field_n': 'customized_type',
        'field_o': GCRPath()
}:
    return ContainerOp(
        name='operator a',
        image='gcr.io/ml-pipeline/component-a',
        command=['python3', '/pipelines/component/src/train.py'],
        arguments=[
            '--field-l',
            field_l,
        ],
        file_outputs={
            'field_m': '/schema.txt',
            'field_n': '/feature.txt',
            'field_o': '/output.txt'
        })
```

You can also annotate pipeline inputs with types, and the inputs are checked against the component I/O types as well. For example:

```python
# Imports assumed for this example (KFP v1 SDK).
import kfp.dsl as dsl
from kfp import compiler
from kfp.dsl import ContainerOp, component
from kfp.dsl.types import Integer, InconsistentTypeException


@component
def task_factory_a(
        field_m: {
            'GCSPath': {
                'openapi_schema_validator':
                    '{"type": "string", "pattern": "^gs://.*$"}'
            }
        }, field_o: 'Integer'):
    return ContainerOp(
        name='operator a',
        image='gcr.io/ml-pipeline/component-a',
        arguments=[
            '--field-l',
            field_m,
            '--field-o',
            field_o,
        ],
    )


# Pipeline input types are also checked against the component I/O types.
@dsl.pipeline(name='type_check', description='')
def pipeline(
        a: {
            'GCSPath': {
                'openapi_schema_validator':
                    '{"type": "string", "pattern": "^gs://.*$"}'
            }
        } = 'good',
        b: Integer() = 12):
    task_factory_a(field_m=a, field_o=b)


try:
    compiler.Compiler().compile(pipeline, 'pipeline.tar.gz', type_check=True)
except InconsistentTypeException as e:
    print(e)
```

## How does the type checking work?
Expand All @@ -127,26 +149,35 @@ If inconsistent types are detected, it throws an [InconsistentTypeException](htt
Type checking is enabled by default and it can be disabled in two ways:

If you compile the pipeline programmatically:

```python
compiler.Compiler().compile(pipeline_a, 'pipeline_a.tar.gz', type_check=False)
```

If you compile the pipeline using the `dsl-compile` tool:

```bash
dsl-compile --py pipeline.py --output pipeline.zip --disable-type-check
```

### Fine-grained configuration

Sometimes you might want to keep type checking enabled but disable it for specific arguments. For example,
when the upstream component generates an output with type "*Float*" and the downstream component can ingest either
"*Float*" or "*Integer*", type checking fails if you define the downstream type as "*Float_or_Integer*", because the type names do not match.
Disabling the type checking per argument is also supported, as shown below.

```python
@dsl.pipeline(name='type_check_a', description='')
def pipeline():
    a = task_factory_a(field_l=12)
    # For each of the arguments, you can also ignore the types by calling
    # the ignore_type function.
    b = task_factory_b(
        field_x=a.outputs['field_n'],
        field_y=a.outputs['field_o'],
        field_z=a.outputs['field_m'].ignore_type())

compiler.Compiler().compile(pipeline, 'pipeline.tar.gz', type_check=True)
```

@@ -158,4 +189,6 @@ type checking would still fail if some I/Os lack the type information and some I

## Next steps

Learn how to define a Kubeflow pipeline with the Python DSL and compile it
with type checking: see the
[Jupyter notebook demo](https://github.com/kubeflow/pipelines/blob/master/samples/core/dsl_static_type_checking/dsl_static_type_checking.ipynb).