[P0] [SPIKE] Remote model deployment #185
There are two proposals that have been written over time:
Both documents focus on ModelMesh, while ODH has since also adopted KServe. However, because ModelMesh and KServe share the same APIs, both proposals can be extended to KServe without changing their essence.

There is also related, ongoing work focused on deploying models "at the edge" in this repository: https://github.com/opendatahub-io/ai-edge/. Currently, the ai-edge work is closer to @anishasthana's "current state" ("right now" way), using ACM/OCM and its GitOps/ArgoCD integration as helpers for deploying models. The ai-edge work proposes a pattern built on existing tooling, meaning that most of the setup remains on the user's side.

In general, the "remote model deployment" requirement is too broad: it does not make clear how much 3rd-party tooling is acceptable to use, nor how much integration with ODH is needed to simplify or reduce manual intervention. This needs to be refined. That said, an OCM/ACM <-> ODH integration may still be the solution that makes the most sense for providing remote model deployment capabilities in ODH; the breakdown below elaborates on this.
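As a concrete illustration of the shared-API point above (not taken from either proposal), the sketch below shows a minimal KServe `InferenceService`; the name, namespace, and storage URI are hypothetical. With the `serving.kserve.io/deploymentMode: ModelMesh` annotation the same manifest is handled by ModelMesh, which is why the proposals carry over to KServe.

```yaml
# Minimal InferenceService sketch (name, namespace, and storageUri are hypothetical).
# With the deploymentMode annotation set to ModelMesh, this v1beta1 resource is
# reconciled by ModelMesh; removing the annotation targets "regular" KServe instead.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-sklearn-model      # hypothetical name
  namespace: model-serving-demo    # hypothetical namespace
  annotations:
    serving.kserve.io/deploymentMode: ModelMesh
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: s3://example-bucket/sklearn/model.joblib  # hypothetical location
```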
Assuming that ACM/OCM setup and cluster registration would be out of scope for model serving (i.e. the user would do it through ACM/OCM), and that setup and deployment of the multicluster-observability-operator would also be out of scope, the work to implement remote deployment would be limited to creating an ACM/OCM add-on that provides the missing capabilities.

#### Breakdown of tasks

The breakdown of tasks assumes that the other clusters would be managed from a hub cluster and that support for direct deployment on the other clusters is out of scope (i.e. creating an …).

#### Back-end support

The previous list enumerates the tasks needed to expose the most important blocks of functionality for model serving. Additional work/tasks would be needed to expose other functionality (like …).

Note: unless already possible with existing APIs, the "implement" tasks generally imply creating new cluster APIs (CRDs) to manage the resources in the other clusters. For example, to deploy the model serving stack to the other clusters, it would be needed to create a … (see the sketch below).

#### Front-end support

Basically, expose functionality from the back-end:
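As referenced in the back-end note above, one way a hub cluster could push model serving resources to another (managed) cluster is via an OCM `ManifestWork`; a new add-on CRD would presumably generate objects like this rather than require users to author them by hand. The sketch below is hypothetical: the cluster name, namespaces, model format, and storage URI are placeholders, not part of the proposals.

```yaml
# Hypothetical sketch: a ManifestWork created on the hub that delivers an
# InferenceService to the managed cluster "edge-cluster-1". An ODH/ACM add-on
# could generate such objects from a higher-level CRD.
apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  name: deploy-example-model     # hypothetical name
  namespace: edge-cluster-1      # on the hub, a ManifestWork lives in the managed cluster's namespace
spec:
  workload:
    manifests:
      - apiVersion: serving.kserve.io/v1beta1
        kind: InferenceService
        metadata:
          name: example-xgboost-model   # hypothetical name
          namespace: model-serving      # hypothetical namespace on the managed cluster
        spec:
          predictor:
            model:
              modelFormat:
                name: xgboost
              storageUri: s3://example-bucket/xgboost/model.json  # hypothetical location
```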
Closing, as there seems to be no more actionable work.
#### Motivation
Triton introduced [support for more model frameworks last year](https://developer.nvidia.com/blog/real-time-serving-for-xgboost-scikit-learn-randomforest-lightgbm-and-more/) and can support xgboost, lightgbm, and more. This PR adds examples and docs to advertise this.

#### Modifications
- Add newly supported models to the Triton runtime config, setting `autoSelect: false`.
- Add an example ISVC config for a Triton-served XGBoost model.
- Update the example-models doc to reflect the example models added in kserve/modelmesh-minio-examples#7.
- Update the model-formats README to reflect framework support, and framework-specific docs to show an example ISVC using Triton.
- Add FVTs for lightgbm and xgboost deployment on the Triton runtime.

#### Result
Closes opendatahub-io#185

---------

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
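For context on the `autoSelect: false` setting mentioned in the PR above, the sketch below shows roughly what such a `ServingRuntime` entry and a matching `InferenceService` look like. It is illustrative only (not copied from the PR); the resource names, versions, and storage URI are hypothetical.

```yaml
# Illustrative excerpt of a Triton ServingRuntime's supportedModelFormats.
# With autoSelect: false the runtime is not chosen automatically for these
# formats, so an InferenceService must reference it explicitly.
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: triton-example          # hypothetical name
spec:
  supportedModelFormats:
    - name: xgboost
      version: "1"              # hypothetical version
      autoSelect: false
    - name: lightgbm
      version: "3"              # hypothetical version
      autoSelect: false
---
# Example InferenceService that explicitly selects the Triton runtime above.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-xgboost-isvc    # hypothetical name
spec:
  predictor:
    model:
      modelFormat:
        name: xgboost
      runtime: triton-example   # explicit because autoSelect is false
      storageUri: s3://example-bucket/xgboost  # hypothetical location
```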
From the requirements doc:
Remote deployment (e.g., locations other than the cluster where model deployment is initiated). Note: this does not include edge locations.
Acceptance criteria: