Goal 1: Integration of Watsonx.ai components into ODH
Req. 1: [P0] The following components must be included out-of-the-box in RHODS at a GA support level
--> Caikit (Compositional AI Kit) serving runtime
--> TGIS (Text Generation Inference Service)
--> KServe, Service Mesh, Serverless
Goal 2: Seamless updates to models and Caikit
Req. 2: [P0] Users must be able to deploy an updated version of a foundation model
For example, if using a curated IBM model, when IBM releases a new model version, users must be able to deploy the updated model version. The same scenario applies to foundation models from other sources such as Hugging Face.
Req. 4: [P0] The system must support the ability to update the Caikit serving runtime version without impacting actively served models
For example, when the upstream Caikit version is updated, the new version must be incorporated into RHODS as appropriate without impacting deployed models. A new RHODS release must not break model serving functionality.
Goal 3: Monitoring & Metrics
Req. 5: [P0] Users must be able to access all applicable metrics for any deployed model
Metrics required:
--> Number of inference requests over a defined time period
--> Average response time over a defined time period
--> Number of successful / failed inference requests over a defined time period
--> Compute utilization (CPU, GPU, memory)
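To make the request-level metrics above concrete, here is a minimal Python sketch of how they could be computed over a time window. The `InferenceRecord` shape and the `summarize` helper are hypothetical names introduced for illustration only. They are not part of Caikit, TGIS, KServe, or RHODS, and compute utilization is out of scope for this sketch because it would come from the platform's own monitoring.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class InferenceRecord:
    """One inference request (hypothetical record shape, for illustration)."""
    timestamp: datetime
    latency_ms: float
    success: bool


def summarize(records, window_start, window_end):
    """Aggregate request metrics for records in [window_start, window_end)."""
    in_window = [r for r in records if window_start <= r.timestamp < window_end]
    total = len(in_window)
    succeeded = sum(1 for r in in_window if r.success)
    # Average is undefined when no requests fall inside the window.
    avg_latency = sum(r.latency_ms for r in in_window) / total if total else None
    return {
        "requests": total,
        "successful": succeeded,
        "failed": total - succeeded,
        "avg_response_ms": avg_latency,
    }


# Usage: two requests fall inside a one-hour window, one falls outside it.
start = datetime(2023, 1, 1, 12, 0)
records = [
    InferenceRecord(start + timedelta(minutes=1), 120.0, True),
    InferenceRecord(start + timedelta(minutes=2), 80.0, False),
    InferenceRecord(start + timedelta(hours=2), 500.0, True),  # outside window
]
summary = summarize(records, start, start + timedelta(hours=1))
```

The window is half-open so that adjacent windows never count the same request twice. A real deployment would more likely query these values from a metrics backend than compute them from raw records.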
Goal 4: Support for HF models
Req. 3: [P0] The product must support the ability to deploy foundation models from Hugging Face using OOTB capabilities.
To be clear, we’re not looking to support the actual models themselves, but rather the ability to deploy/serve the models. If a customer has an issue with the actual model, that is outside the support scope.
Goal 5: HTTP and gRPC endpoints
Req. 6: [P0] Users must be able to create and access HTTP and gRPC endpoints for model serving routes