Kubeflow component integration with ML Metadata #783

richardsliu · 2022-04-11T22:45:31Z

/kind feature

Why you need this feature:
Kubeflow currently doesn't have a unified metadata/artifact management story beyond what's supported in KFP. For example, the concept of an "ML experiment" exists in both training and hyperparameter tuning, but there is no way to track it across separate Kubeflow components. Unified metadata tracking would let users aggregate things like:

Experiment runs
Datasets
Metrics
Trained artifacts
Hyperparameter configurations
etc

Originally Kubeflow covered this through the Metadata project but it has since been archived. There were some additional discussions around this, found in issue kubeflow/kubeflow#4955.

It would be great to revisit this problem and see if we can propose a unified interface for metadata and artifact storage, possibly by using ML metadata.

Describe the solution you'd like:

One problem with the original Kubeflow metadata project is that it comes with its own storage backend using MySQL, which makes it heavy-weight. We do not need to re-implement the storage backend since MLMD already solves that problem. Instead, we can make MLMD an optional installation, and write to it directly. This is what KFP is currently doing, see this link for the code.

If we can define a unified data model and interface, it should be possible to build a light-weight library on top of ML metadata. It can be an optional import for training jobs and hyperparameter tuning jobs.
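As a rough illustration of what such a library's surface could look like, here is a minimal sketch. Every name in it (`RunRecord`, `MetadataBackend`, `InMemoryBackend`) is hypothetical and not part of MLMD or any Kubeflow SDK, and the in-memory backend only stands in for an implementation that would write to MLMD.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class RunRecord:
    """One experiment run as reported by any component (hypothetical model)."""
    experiment: str
    component: str  # e.g. "training" or "hyperparameter-tuning"
    hyperparameters: Dict[str, float] = field(default_factory=dict)
    metrics: Dict[str, float] = field(default_factory=dict)
    dataset_uris: List[str] = field(default_factory=list)
    artifact_uris: List[str] = field(default_factory=list)


class MetadataBackend(Protocol):
    """Interface that components would code against (hypothetical)."""

    def log_run(self, run: RunRecord) -> None: ...

    def runs_for(self, experiment: str) -> List[RunRecord]: ...


class InMemoryBackend:
    """Stand-in for an MLMD-backed implementation; keeps runs in a dict."""

    def __init__(self) -> None:
        self._runs: Dict[str, List[RunRecord]] = {}

    def log_run(self, run: RunRecord) -> None:
        self._runs.setdefault(run.experiment, []).append(run)

    def runs_for(self, experiment: str) -> List[RunRecord]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self._runs.get(experiment, []))


# Two different components report runs under the same experiment name.
backend = InMemoryBackend()
backend.log_run(RunRecord("mnist", "training", {"lr": 0.01}, {"accuracy": 0.97}))
backend.log_run(
    RunRecord("mnist", "hyperparameter-tuning", {"lr": 0.1}, {"accuracy": 0.91})
)

# Aggregate across components: pick the run with the highest accuracy.
best = max(backend.runs_for("mnist"), key=lambda r: r.metrics["accuracy"])
```

Because components only depend on the small interface, the import can stay optional: a job that does not configure a backend simply skips logging.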

jbottum · 2022-04-11T23:32:53Z

/priority p1
/kind feature

zijianjoy · 2022-04-12T03:43:53Z

Thank you Richard for your proposal.

I think it will be beneficial if more Kubeflow components want to adopt MLMD. The questions I have are:

Are we looking for a mechanism to only group objects across different Kubeflow components? Do we provide a mechanism for Kubeflow components to consume MLMD?
How do we guarantee MLMD version consistency across Kubeflow components?

Note: MLMD has become a hard dependency of KFP as of KFP v2. We not only write to MLMD, we also read from it for status updates.

Note: Can external add-ons also use MLMD? For example, can KServe use it? If so, how do we design a client that can be adopted by both Kubeflow components and add-ons?

Note: Once we have a proper proposal, you can make use of https://github.com/kubeflow/community/tree/master/proposals by creating a PR to this folder.

johnugeorge · 2022-04-12T06:30:48Z

This will be a great value add.

juliusvonkohout · 2022-04-12T15:48:40Z

I think it is very dangerous because MLMD is not yet separated per namespace (see kubeflow/pipelines#4790). It will lower the security standards even further if more components break namespace isolation.

ca-scribner · 2022-04-12T16:10:21Z

In general I think this is a great proposal. To me this has been one of the bigger gaps in Kubeflow ever since the previous attempt was archived. There are details to be worked out, as @zijianjoy and @juliusvonkohout mention, but they're not insurmountable.

What other requirements do people envision for this? I agree with @juliusvonkohout that whatever we do should at least have an option for user isolation. Whether it is completely isolated or we maintain two stores (one shared and one namespaced) is debatable. I believe @zijianjoy had some good comments about that and about maintaining backward compatibility.
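To make the "one shared and one namespaced" option concrete, here is a toy sketch. It assumes a hypothetical `NamespacedStore` class and a made-up `SHARED` marker, neither of which exists in MLMD or Kubeflow. In a real deployment the isolation would have to be enforced server-side, not in a client library like this.

```python
from collections import defaultdict
from typing import Dict, List

SHARED = "__shared__"  # hypothetical key for the store visible to everyone


class NamespacedStore:
    """Toy model: records live in a per-namespace store or the shared one."""

    def __init__(self) -> None:
        self._records: Dict[str, List[dict]] = defaultdict(list)

    def put(self, namespace: str, record: dict, shared: bool = False) -> None:
        # Shared records go to the common store, everything else stays private.
        self._records[SHARED if shared else namespace].append(record)

    def visible_to(self, namespace: str) -> List[dict]:
        # A caller sees its own namespace plus the shared store, never others.
        own = self._records.get(namespace, [])
        return own + self._records.get(SHARED, [])


store = NamespacedStore()
store.put("team-a", {"run": 1})
store.put("team-b", {"run": 2})
store.put("team-a", {"dataset": "public"}, shared=True)
```

Here `store.visible_to("team-b")` contains team-b's run and the shared dataset, but not team-a's run.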

rustam-ashurov-mcx · 2022-05-27T14:44:08Z

The ability to track experiment metadata in a centralized place dedicated to that purpose would be great 👍 At the moment I'm not sure what to use for such audit/governance/tracking activities without the help of external tools. At the same time, I don't even want to try a mix of Kubeflow and MLflow, since that looks like over-engineering to me when there could be built-in functionality for it.

frittentheke · 2024-06-07T08:40:11Z

One reads about support for MLflow here and there across the Kubeflow components and SDKs.
MLflow is also mentioned in the GSOC 2024 list of ideas: https://www.kubeflow.org/events/gsoc-2024/#project-10-enhancing-kf-model-registry-python-client-for-seamless-ml-imports-from-alternative-registries

Is the current state of the integration of MLflow for metadata summarized somewhere?
In short I'd like to understand if and how Kubeflow can leverage the capabilities and data of an existing MLflow installation.

juliusvonkohout · 2024-06-10T08:07:31Z

@frittentheke so far most users just manage MLflow themselves next to Kubeflow. Integration is possible, but manual and you have to get the hard multi-tenancy right.

frittentheke · 2024-06-10T08:24:25Z

> @frittentheke so far most users just manage MLflow themselves next to Kubeflow. Integration is possible, but manual and you have to get the hard multi-tenancy right.

Thanks for your response @juliusvonkohout !

Full integration and automation is nice, but also makes things less lightweight or even clunky. I envision (read: have) an environment with an existing MLflow installation containing experiment tracking data already. So I am asking about the integrations and wondering if this can work "nicely" together with Kubeflow. Or will this just duplicate features Kubeflow also covers itself and then feel alien?

andreyvelich · 2024-10-17T16:02:31Z

Let's continue this discussion in community repo.
cc @kubeflow/wg-data-leads
/transfer community

tarilabs · 2024-10-17T16:58:31Z

Thanks @andreyvelich for bringing this back to attention.

> One problem with the original Kubeflow metadata project is that it comes with its own storage backend using MySQL, which makes it heavy-weight

I'm not sure I understood this part of the original post 🤔 Isn't MLMD backed by MySQL(/MariaDB/PostgreSQL) too?

google-oss-prow bot added the kind/feature label Apr 11, 2022

google-oss-prow bot added the priority/p1 label Apr 11, 2022

google-oss-prow bot transferred this issue from kubeflow/kubeflow Oct 17, 2024
