forked from kserve/kserve
[DEV TRACKER] Model Serving Requirements for Q4 #92
Labels
tracker
Non-completable ticket; used for tracking work at a high level
Comments
heyselbi added the tracker label Sep 27, 2023
heyselbi changed the title from "[TRACKER] Model Serving Requirements for Q4" to "[DEV TRACKER] Model Serving Requirements for Q4" Sep 27, 2023
This was referenced Sep 28, 2023
heyselbi changed the title from "[DEV TRACKER] Model Serving Requirements for Q4" to "[DEV TRACKER] Model Serving Deliverables for Q4" Oct 16, 2023
heyselbi changed the title from "[DEV TRACKER] Model Serving Deliverables for Q4" to "[DEV TRACKER] Model Serving Requirements for Q4" Oct 16, 2023
Closing, as we are now tracking work on Jira.
/close
@israel-hdez: Closing this issue. In response to this:
github-project-automation bot moved this from In Progress to Done in ODH Model Serving Planning Feb 13, 2024
From Req Document
Req 1: Model Storage
Users must be able to deploy a model stored in (d.) AWS
Req 2: Model Formats - Estimate: RHODS 1.36
Users must be able to serve models based on a variety of frameworks: (a.) OOTB support for TensorFlow, PyTorch, and scikit-learn models, and (d.) users must be able to serve models from Hugging Face without any additional conversions or configurations
Req 7: Deployment Rollouts
a. Ability to deploy new model versions and route a percentage of traffic to the new version (canary rollout)
b. Ability to do A/B testing on different model versions
c. Ability to test deployed endpoint directly in the product UI
Req 10: OOTB Deployed model performance metrics
Users must be able to access performance metrics for all deployed models (e.) CPU/GPU/memory utilization
Req 14: Model Serving Runtimes
b. OOTB support for Caikit/TGIS
c. OOTB support for NVIDIA Triton Inference Server
Req 15: Remote Deployment
E.g., locations other than the cluster where model deployment is initiated: (a.) Support models being deployed to remote locations (locations other than where model deployment is initiated)
Req 17: Support options for KServe and/or ModelMesh - Estimate: RHODS 1.36
Support KServe (1 model per pod) or ModelMesh (multiple models per pod): (a.) RHODS admins should be able to configure whether they want to use KServe (single model serving + additional functionality), ModelMesh, or both
Other planned features
Other planned enhancements
Other planned bug fixes
Resources
Model Serving Phase 2 Requirements doc
Model Serving Phase 2 Requirement Mapping spreadsheet