Kubeflow component integration with ML Metadata #783

Open
richardsliu opened this issue Apr 11, 2022 · 11 comments

@richardsliu

/kind feature

Why you need this feature:
Kubeflow currently doesn't have a unified metadata/artifact management story beyond what's supported in KFP. For example, the concept of an "ML experiment" exists in both training and hyperparameter tuning, but there is no way to track it across separate Kubeflow components. Unified metadata tracking would let users aggregate things like:

  • Experiment runs
  • Datasets
  • Metrics
  • Trained artifacts
  • Hyperparameter configurations
  • etc

Originally, Kubeflow covered this through the Metadata project, but it has since been archived. There was some additional discussion around this in issue kubeflow/kubeflow#4955.

It would be great to revisit this problem and see if we can propose a unified interface for metadata and artifact storage, possibly by using ML Metadata (MLMD).

Describe the solution you'd like:

One problem with the original Kubeflow metadata project is that it comes with its own storage backend using MySQL, which makes it heavyweight. We do not need to re-implement the storage backend, since MLMD already solves that problem. Instead, we can make MLMD an optional installation and write to it directly. This is what KFP currently does; see this link for the code.

If we can define a unified data model and interface, it should be possible to build a lightweight library on top of ML Metadata. It can be an optional import for training jobs and hyperparameter tuning jobs.
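
To make the idea concrete, here is a minimal sketch of what such a unified data model and interface might look like. Everything in it is hypothetical: the names `Record`, `MetadataBackend`, `InMemoryBackend`, and `ExperimentTracker` are illustrations and not part of MLMD or any Kubeflow SDK, and the in-memory backend only stands in for a store that an MLMD-backed class could replace.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class Record:
    """One tracked item: a run, dataset, metric, model, or HP config."""
    kind: str        # e.g. "run", "dataset", "model", "hp_config"
    name: str
    experiment: str  # the key that groups records across components
    properties: Dict[str, object] = field(default_factory=dict)


class MetadataBackend(Protocol):
    """Hypothetical storage interface; an MLMD-backed class could implement it."""

    def put(self, record: Record) -> int: ...

    def list_by_experiment(self, experiment: str) -> List[Record]: ...


class InMemoryBackend:
    """Stand-in backend for illustration only; not a real MLMD client."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def put(self, record: Record) -> int:
        self._records.append(record)
        return len(self._records)  # 1-based id of the stored record

    def list_by_experiment(self, experiment: str) -> List[Record]:
        return [r for r in self._records if r.experiment == experiment]


class ExperimentTracker:
    """Thin, optional-import style wrapper a component could use."""

    def __init__(self, backend: MetadataBackend, experiment: str) -> None:
        self._backend = backend
        self._experiment = experiment

    def log(self, kind: str, name: str, **properties: object) -> int:
        record = Record(kind, name, self._experiment, dict(properties))
        return self._backend.put(record)

    def records(self) -> List[Record]:
        return self._backend.list_by_experiment(self._experiment)


# Two components (training and tuning) share one backend and one experiment.
backend = InMemoryBackend()
training = ExperimentTracker(backend, "mnist-exp")
tuning = ExperimentTracker(backend, "mnist-exp")
training.log("model", "mnist-cnn", accuracy=0.98)
tuning.log("hp_config", "trial-1", learning_rate=0.01)
print([r.kind for r in training.records()])  # ['model', 'hp_config']
```

Keeping the backend behind a small interface is one possible way to keep the storage layer optional: components would depend only on the tracker, and the choice of store would be made at install time.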

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

@jbottum
Contributor

jbottum commented Apr 11, 2022

/priority p1
/kind feature

@zijianjoy
Contributor

Thank you Richard for your proposal.

I think it would be beneficial if more Kubeflow components adopted MLMD. The questions I have are:

  1. Are we looking for a mechanism that only groups objects across different Kubeflow components? Or do we also provide a mechanism for Kubeflow components to consume MLMD?
  2. How do we guarantee MLMD version consistency across Kubeflow components?

Note: MLMD has become a hard dependency of KFP in KFPv2. We are not only writing to MLMD; we are also reading from MLMD for status updates.

Note: Can external add-ons also use MLMD? For example, can KServe use MLMD? If so, how do we design a client that can be adopted by both Kubeflow components and add-ons?

Note: Once we have a proper proposal, you can make use of https://github.com/kubeflow/community/tree/master/proposals by creating a PR to this folder.

@johnugeorge
Member

This will be a great value add

@juliusvonkohout
Member

I think this is very dangerous because MLMD is not yet separated per namespace (see kubeflow/pipelines#4790). It will lower the security standards even further if more components break namespace isolation.

@ca-scribner

In general, I think this is a great proposal. To me, this has been one of the bigger gaps in Kubeflow ever since the previous attempt was archived. There are details to be worked out, as @zijianjoy and @juliusvonkohout mention, but they're not impossible.

What other requirements do people envision for this? I agree with @juliusvonkohout that whatever we do should at least have an option for user isolation. Whether it is completely isolated or we maintain two stores (one shared and one namespaced) is debatable. I believe @zijianjoy had some good comments about that and about maintaining backward compatibility.
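
As a rough illustration of the "two stores" option, the sketch below routes writes either to a per-namespace store or to a shared one, and lets reads see only the caller's namespace plus the shared records. The `IsolatedMetadata` class and the `SHARED` key are hypothetical names for this example; nothing here is MLMD or KFP API, and real isolation would also need authorization enforced at the API layer.

```python
from typing import Dict, List

SHARED = "__shared__"  # hypothetical key for the cross-namespace store


class IsolatedMetadata:
    """Toy model of one shared store plus one store per namespace."""

    def __init__(self) -> None:
        self._stores: Dict[str, List[dict]] = {}

    def write(self, namespace: str, record: dict, shared: bool = False) -> None:
        # Records go to the caller's namespace unless explicitly shared.
        target = SHARED if shared else namespace
        self._stores.setdefault(target, []).append(record)

    def read(self, namespace: str, include_shared: bool = True) -> List[dict]:
        # A caller never sees records from another team's namespace.
        visible = list(self._stores.get(namespace, []))
        if include_shared:
            visible += self._stores.get(SHARED, [])
        return visible


store = IsolatedMetadata()
store.write("team-a", {"name": "run-1"})
store.write("team-b", {"name": "run-2"})
store.write("team-a", {"name": "public-dataset"}, shared=True)
print([r["name"] for r in store.read("team-b")])  # ['run-2', 'public-dataset']
```

Passing `include_shared=False` gives the fully isolated view, so the same interface could cover both sides of the debate.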

@rustam-ashurov-mcx

The ability to track experiments' metadata in a centralized place dedicated to that purpose would be great 👍 Atm I'm not sure what to use for such audit/governance/tracking activities without the help of external tools. At the same time, I don't even want to try a mix of KFL and MLflow, since it looks like over-engineering to me when there could be built-in functionality for it.

@frittentheke

One reads about support for MLflow here and there across the Kubeflow components and SDKs.
MLflow is also mentioned in the GSoC 2024 list of ideas: https://www.kubeflow.org/events/gsoc-2024/#project-10-enhancing-kf-model-registry-python-client-for-seamless-ml-imports-from-alternative-registries

Is the current state of the integration of MLflow for metadata summarized somewhere?
In short I'd like to understand if and how Kubeflow can leverage the capabilities and data of an existing MLflow installation.

@juliusvonkohout
Member

juliusvonkohout commented Jun 10, 2024

@frittentheke so far most users just manage MLflow themselves next to Kubeflow. Integration is possible, but manual and you have to get the hard multi-tenancy right.

@frittentheke

> @frittentheke so far most users just manage MLflow themselves next to Kubeflow. Integration is possible, but manual and you have to get the hard multi-tenancy right.

Thanks for your response @juliusvonkohout !

Full integration and automation is nice, but also makes things less lightweight or even clunky. I envision (read: have) an environment with an existing MLflow installation containing experiment tracking data already. So I am asking about the integrations and wondering if this can work "nicely" together with Kubeflow. Or will this just duplicate features Kubeflow also covers itself and then feel alien?

@andreyvelich
Member

Let's continue this discussion in community repo.
cc @kubeflow/wg-data-leads
/transfer community

google-oss-prow bot transferred this issue from kubeflow/kubeflow on Oct 17, 2024

@tarilabs
Member

Thanks @andreyvelich for bringing this back to attention.

> One problem with the original Kubeflow metadata project is that it comes with its own storage backend using MySQL, which makes it heavy-weight

I'm not sure I understood this from the original posting 🤔 Isn't MLMD backed by MySQL (/MariaDB/PostgreSQL) too?

10 participants