
Cannot create artifact when using func_to_container_op #2395

Closed · Toeplitz opened this issue Oct 15, 2019 · 4 comments

Comments

@Toeplitz commented Oct 15, 2019
/kind bug

What steps did you take and what happened:
I am trying to create a simple artifact and make it display in "Run output" for a pipeline.

The following code is used to create the pipeline:

import kfp.dsl as dsl
import kfp.gcp as gcp
import kfp.components as comp

def test(foo):
    import json

    source_str = 'Test text: %s' % foo

    metadata = {
        'outputs': [
            {
                'storage': 'inline',
                'source': '# Inline Markdown\n[A link](https://www.kubeflow.org/)',
                'type': 'markdown',
            },
            {
                'source': source_str,
                'type': 'markdown',
            }]
    }
    print(json.dumps(metadata))
    print(metadata)
    with open('/mlpipeline-ui-metadata.json', 'w') as f:
        json.dump(metadata, f)

@dsl.pipeline(
    name='Pipeline name',
    description='Debug...'
)
def pipeline(foo: str = "default value"):
    test_op = comp.func_to_container_op(test)
    test_task = test_op(foo)

if __name__ == '__main__':
    import kfp.compiler as compiler
    compiler.Compiler().compile(pipeline, __file__ + '.tar.gz')

What did you expect to happen:
The pipeline job completes as expected, but there is nothing in "Run output".

Environment:

  • Kubeflow version: build commit 812ca7f
  • kfctl version: kfctl v0.6.2-0-g47a0e4c7
  • Kubernetes platform: gcp 1.12.10-gke.5
  • kubectl version: 1.6
  • OS: Ubuntu Bionic x64

Python module version:
kfp (0.1.31.2)
kfp-server-api (0.1.18.3)

@tanguycdls (Contributor) commented Oct 17, 2019

In the latest version you have to define the container op's output artifacts yourself:

test_task.output_artifact_paths = {
    'mlpipeline-ui-metadata': '/tmp/mlpipeline-ui-metadata.json',
    'mlpipeline-metrics': '/tmp/mlpipeline-metrics.json',
}

Add the code above in the pipeline and it should work, as long as the component writes to the matching path (/tmp/mlpipeline-ui-metadata.json rather than /mlpipeline-ui-metadata.json as in your snippet).
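For illustration, here is a minimal sketch of what the component body would write under this approach. The helper name write_ui_metadata is hypothetical; the load-bearing details are the JSON shape and the /tmp path matching the declared artifact path. The write itself is plain Python, so it can be checked outside KFP:

```python
import json

def write_ui_metadata(foo):
    # Inline markdown artifact, in the KFP output-viewer metadata format.
    metadata = {
        'outputs': [{
            'storage': 'inline',
            'source': 'Test text: %s' % foo,
            'type': 'markdown',
        }]
    }
    # The path must match the one declared in output_artifact_paths:
    # /tmp/mlpipeline-ui-metadata.json, not /mlpipeline-ui-metadata.json.
    path = '/tmp/mlpipeline-ui-metadata.json'
    with open(path, 'w') as f:
        json.dump(metadata, f)
    return path
```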

If you're using func_to_container_op you can also define a NamedTuple with the types UI_metadata and Metrics, as explained here:
https://github.com/kubeflow/pipelines/pull/2046/files#diff-0aec3bb4eee5b5b7a9b97fc30a516060

related to #2268 and the PR #2046

@Ark-kun (Contributor) commented Oct 18, 2019

@tanguycdls

test_task.output_artifact_paths = {

This might not be a good idea.

  1. ContainerOp().output_artifact_paths should not be used; it should have been made private a long time ago (see "fix(sdk/dsl): Made ContainerOp.output_artifact_paths private" #1832). Generally, setting any attribute of a ContainerOp instance (apart from .container.*) is not a good idea.

  2. ContainerOp(output_artifact_paths=...) will be deprecated soon since it's no longer needed: file_outputs now works just as well, and all outputs now produce artifacts (supporting big data).

@Ark-kun (Contributor) commented Oct 18, 2019

@Toeplitz @tanguycdls

If you're using func_to_container_op you can also define a NamedTuple with the types UI_metadata and Metrics, as explained here:
https://github.com/kubeflow/pipelines/pull/2046/files#diff-0aec3bb4eee5b5b7a9b97fc30a516060

Yes, that's the correct way.
The component author has to declare all outputs. Since #2046, mlpipeline_ui_metadata and mlpipeline_metrics also need to be declared just like every other output. See the sample: https://github.com/kubeflow/pipelines/blob/7b6957a/samples/core/lightweight_component/lightweight_component.ipynb

def test() -> NamedTuple('MyDivmodOutput', [('mlpipeline_ui_metadata', 'UI_metadata'), ('mlpipeline_metrics', 'Metrics')]):
    ...
    return (json.dumps(metadata), json.dumps(metrics))
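Filling in the elided body, a self-contained sketch of such a function (the metric name and values are illustrative; the function is plain Python, so it can be exercised locally before wrapping it with func_to_container_op):

```python
import json
from typing import NamedTuple

def test(foo: str = 'default value') -> NamedTuple(
        'TestOutput', [('mlpipeline_ui_metadata', 'UI_metadata'),
                       ('mlpipeline_metrics', 'Metrics')]):
    # Declaring outputs with the types 'UI_metadata' and 'Metrics' is what
    # lets func_to_container_op wire them up as the special artifacts.
    metadata = {
        'outputs': [{
            'storage': 'inline',
            'source': 'Test text: %s' % foo,
            'type': 'markdown',
        }]
    }
    metrics = {
        'metrics': [{'name': 'example-metric', 'numberValue': 0.9}]
    }
    return (json.dumps(metadata), json.dumps(metrics))
```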

@drobison-deloitte commented Feb 4, 2021


In addition to declaring the mlpipeline_metrics output, what other steps do I need to take in order to accomplish the original ask of this thread, which is having artifacts appear under "Run Outputs" when working with func_to_container_op? I see the artifact is saved as an Argo artifact with a .tgz extension in the MinIO storage server I have deployed, but KFP does not seem to display it from there.
