
[P0] [SPIKE] Remote model deployment #185

Closed · 2 tasks

heyselbi opened this issue Sep 14, 2023 · 3 comments
Labels
req From the Requirements document


heyselbi commented Sep 14, 2023

From req doc:
Remote deployment (e.g., locations other than the cluster where model deployment is initiated). Note: this does not include edge locations.

  • Support deploying models to remote locations (i.e. locations other than where model deployment is initiated):
    • P0: OCP including single node OCP
      • This is the standard OCP product. Other variants such as MicroShift will be covered in a separate edge epic

Acceptance criteria:

  • Architecture or detailed explanation of what remote deployment entails
  • Scope of the work that would be needed (perhaps in L0, L1 etc format)
@heyselbi heyselbi converted this from a draft issue Sep 14, 2023
@heyselbi heyselbi added the req From the Requirements document label Sep 14, 2023
@heyselbi heyselbi moved this from New/Backlog to To-do/Groomed in ODH Model Serving Planning Sep 14, 2023
@israel-hdez israel-hdez moved this from To-do/Groomed to In Progress in ODH Model Serving Planning Sep 25, 2023
israel-hdez commented Sep 28, 2023

There are two proposals that have been written over time:

  1. @anishasthana proposal: GDoc link
    • It is the older one, yet still quite valid.
    • It mentions a way to achieve remote deployment right now by using ACM/OCM and its GitOps/ArgoCD integration. This approach is somewhat manual, although it requires no changes to ODH.
    • It proposes a new architecture to make it easier for the user to deploy models remotely from a hub cluster. This would, again, rely on ACM/OCM but management would be done through ODH using 3 new CRDs rather than via a GitOps repository.
  2. @israel-hdez proposal: GDoc link
    • It is the newer one; despite this, the proposed architecture is almost identical to the one from @anishasthana.
    • It goes into somewhat more depth on the technical details that would need to be worked out to implement remote model deployment.
    • In general, it is complementary to the proposal from @anishasthana.

Both documents focus on ModelMesh, while ODH has since also adopted KServe. However, since ModelMesh and KServe share the same APIs, both proposals can be extended to KServe without changing their essence.

There is also related, ongoing work focused on deploying models "at the edge" in this repository: https://github.com/opendatahub-io/ai-edge/. The ai-edge work currently resembles the "current state" ("right now") approach from @anishasthana's proposal: using ACM/OCM and its GitOps/ArgoCD integration as helpers for deploying models. The ai-edge effort proposes a pattern built on existing tooling, meaning that most of the setup remains on the user's side.
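As a rough illustration of that "right now" GitOps path, the sketch below builds an Argo CD Application that syncs model-serving manifests (e.g. InferenceServices) from a Git repository into a target namespace. The repository URL, path, and namespace are placeholders invented for this sketch, not values from either proposal.

```python
# Illustrative only: an Argo CD Application that syncs model-serving
# manifests from Git to a cluster, roughly the manual GitOps approach
# described above. Repo URL, path, and namespaces are placeholders.

def gitops_model_app(repo_url: str, path: str, dest_namespace: str) -> dict:
    """Build an Argo CD Application manifest for syncing model manifests."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": "model-serving-sync", "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "path": path,
                "targetRevision": "main",
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": dest_namespace,
            },
            # Keep the cluster in sync with the repo automatically.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }

app = gitops_model_app(
    "https://example.com/org/model-manifests.git", "models/prod", "model-serving"
)
print(app["spec"]["source"]["path"])  # models/prod
```

The point of the sketch is that everything here (repo layout, Application wiring, credentials) is user-managed, which is exactly the manual burden an ODH integration would aim to remove.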

In general, the "remote model deployment" requirement is too broad: it does not make clear how much third-party tooling is acceptable to use, nor how much ODH integration is needed to simplify or reduce manual intervention. This needs to be refined.

That said, an OCM/ACM <-> ODH integration may still be the solution that makes the most sense for providing remote model deployment capabilities in ODH:

  • At the very least, ACM/OCM should be used because of the cluster registration capabilities.
  • Secondarily, it makes sense to use ACM/OCM's multicluster-observability-operator to collect metrics into a central location.
  • Any other technical gaps would be solved through an ODH-OCM/ACM add-on.
    • Additional APIs/CRDs would need to be created to cover any technical gaps.
    • The complexity of the add-on would depend on how many capabilities we want to build into ODH rather than relying on existing tooling.
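To make the add-on idea more concrete, here is a minimal sketch of how a hub-side component could use OCM's ManifestWork API to push an InferenceService to a registered managed cluster. The cluster name, model name, and storage URI are illustrative assumptions, not part of the proposals.

```python
# Illustrative only: a hub-side add-on wrapping an InferenceService in an
# OCM ManifestWork so the work agent applies it on a managed cluster.
# Cluster name, model name, and the S3 URI below are placeholders.

def wrap_in_manifestwork(cluster: str, inference_service: dict) -> dict:
    """Wrap a resource in a ManifestWork targeted at one managed cluster.

    ManifestWorks are created on the hub in the namespace named after the
    managed cluster; OCM's work agent applies the embedded manifests there.
    """
    return {
        "apiVersion": "work.open-cluster-management.io/v1",
        "kind": "ManifestWork",
        "metadata": {
            "name": f"deploy-{inference_service['metadata']['name']}",
            "namespace": cluster,  # hub namespace == managed cluster name
        },
        "spec": {"workload": {"manifests": [inference_service]}},
    }

isvc = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "sklearn-example", "namespace": "model-serving"},
    "spec": {
        "predictor": {
            "model": {
                "modelFormat": {"name": "sklearn"},
                "storageUri": "s3://example-bucket/models/sklearn-example",
            }
        }
    },
}
work = wrap_in_manifestwork("edge-cluster-1", isvc)
```

An actual add-on would also need to report status back to the hub (e.g. via the ManifestWork status feedback) rather than just applying resources one way.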

israel-hdez commented:
Assuming that ACM/OCM setup and cluster registration are out of scope for model serving (i.e. the user would do them through ACM/OCM), and that setup and deployment of the multicluster-observability-operator is also out of scope, the work to implement remote deployment would be limited to creating an ACM/OCM add-on that provides the missing capabilities.

Breakdown of tasks

The breakdown of tasks assumes that the other clusters would be managed from a hub cluster and that support for direct deployment on the other clusters is out of scope (i.e. creating an InferenceService directly in the other clusters may still be possible, but managing such an ISVC from the hub cluster would not be).

Back-end support

  • Scaffolding/creation of a new ACM/OCM add-on for model serving (agent and manager) - it could be part of an existing project (e.g. odh-model-controller).
    • Implement deployment, configuration and undeployment of the ModelMesh stack in the other clusters (headless).
      • Implement (or validate) configuration of metrics collection to the "local" OpenShift, assuming the multicluster-observability-operator is going to be already configured to federate metrics to the hub cluster.
    • Implement deployment, configuration and undeployment of the KServe stack in the other clusters (headless).
      • Implement (or validate) configuration of metrics collection (similar to ModelMesh case).
    • Implement enabling/disabling ModelMesh on the other clusters (KServe shouldn't need to be enabled).
    • Implement management of storage credentials in the other clusters.
    • Implement management of ServingRuntimes in the other clusters.
    • Implement management of InferenceServices in the other clusters.
  • ODH operator support for deploying the serving ACM/OCM add-on (in case a new project is created).
  • Documentation about how to setup/prepare a cluster to work as a hub for model serving (i.e. examples to setup OCM/ACM and register a cluster).
  • Documentation about how to setup/prepare the clusters to federate metrics to the hub cluster (i.e. examples of the multicluster-observability-operator).

The previous list enumerates the tasks needed to expose the most important blocks of model serving functionality. Additional work/tasks would be needed to expose other functionality (such as InferenceGraph).

Note: unless possible with existing APIs, the "implement" tasks generally imply creating new cluster APIs (CRDs) to manage the resources in the other clusters. For example, to deploy the model serving stack to the other clusters, a ModelServing CRD would need to be created so that the stack can be managed from the hub cluster.
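As a purely hypothetical illustration of such a ModelServing CRD (the API group, field names, and values below are invented for this sketch and are not a committed design):

```python
# Hypothetical sketch of the ModelServing CRD mentioned above: a hub-side
# resource declaring which serving stack to install on which managed
# clusters. Group, fields, and values are all illustrative assumptions.

def model_serving_cr(name: str, clusters: list, stack: str) -> dict:
    """Build a hypothetical ModelServing custom resource."""
    assert stack in ("modelmesh", "kserve")
    return {
        "apiVersion": "serving.opendatahub.io/v1alpha1",  # invented group
        "kind": "ModelServing",
        "metadata": {"name": name, "namespace": "opendatahub"},
        "spec": {
            "stack": stack,                 # which serving stack to deploy
            "targetClusters": clusters,     # registered managed clusters
            "metrics": {"federate": True},  # rely on multicluster observability
        },
    }

cr = model_serving_cr(
    "remote-kserve", ["edge-cluster-1", "edge-cluster-2"], "kserve"
)
```

The add-on manager would watch resources like this on the hub and translate them into per-cluster work (e.g. ManifestWorks) for the agents to apply.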

Front-end support

Basically, expose functionality from the back-end:

  • Ability to deploy or remove the model serving stack on other (already registered) clusters.
  • Extend the existing feature for enabling/disabling model serving on namespaces/projects to cover the other clusters.
  • Ability to create and remove model servers (ServingRuntimes) on other clusters.
  • Ability to configure and remove storage credentials on other clusters.
  • Ability to select which cluster (or clusters) to deploy a model to, or remove it from.
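For that cluster-selection capability, one plausible hub-side building block is an OCM Placement that picks registered clusters by label. The cluster set name and labels below are placeholder assumptions, not values from the proposals.

```python
# Illustrative only: an OCM Placement selecting managed clusters by label,
# which a front-end "choose target clusters" control could be backed by.
# The ManagedClusterSet name and label key/value are placeholders.

def model_placement(name: str, match_labels: dict) -> dict:
    """Build a Placement that selects clusters matching the given labels."""
    return {
        "apiVersion": "cluster.open-cluster-management.io/v1beta1",
        "kind": "Placement",
        "metadata": {"name": name, "namespace": "opendatahub"},
        "spec": {
            "clusterSets": ["model-serving"],  # illustrative ManagedClusterSet
            "predicates": [
                {
                    "requiredClusterSelector": {
                        "labelSelector": {"matchLabels": match_labels}
                    }
                }
            ],
        },
    }

placement = model_placement("gpu-clusters", {"accelerator": "gpu"})
```

OCM resolves a Placement into a PlacementDecision listing the selected clusters, which the add-on could then use as deployment targets.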

israel-hdez commented:
Closing, as there seems to be no more actionable work.

@github-project-automation github-project-automation bot moved this from In Progress to Done in ODH Model Serving Planning Oct 3, 2023
israel-hdez pushed a commit to israel-hdez/modelmesh-serving that referenced this issue Mar 20, 2024
#### Motivation
Triton introduced [support for more model frameworks last
year](https://developer.nvidia.com/blog/real-time-serving-for-xgboost-scikit-learn-randomforest-lightgbm-and-more/)
and can support xgboost, lightgbm, and more. This PR adds examples and
docs to advertise this.

#### Modifications
- Add newly supported models to Triton runtime config, setting
`autoSelect: false`.
- Add an example ISVC config for Triton-served XGBoost model.
- Update example-models doc to reflect example models added in
kserve/modelmesh-minio-examples#7
- Update model-formats README to reflect framework support and
framework-specific docs to show example ISVC using Triton.
- Add FVTs for lightgbm and xgboost deployment on Triton runtime

#### Result
Closes opendatahub-io#185

---------

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>