RHOAIENG-2828: Adding OVMS runtime to single model serving docs #194
Conversation
{productname-short} includes the following pre-installed runtimes for KServe:

* Standalone TGIS
* Composite Caikit-TGIS
* OpenVINO model server
I like how we're presenting these in a bullet list now - it's nice and clear and scannable.
I suggest we write the first two with a more complete phrase, so that users don't think those are the official names. And OpenVINO Model Server is a proper name, so it should be init-capped.
So, something like the following:
- A standalone TGIS runtime
- A composite Caikit-TGIS runtime
- OpenVINO Model Server
@syaseen-rh I think 'OpenVINO Model Server' still requires proper capitalization here.
modules/accessing-api-endpoints-for-models-deployed-on-single-model-server.adoc
@@ -37,6 +35,10 @@ endif::[]
+
NOTE: To query the endpoints for the TGIS standalone runtime, you must also download the files in the `proto` directory of the IBM link:https://github.com/IBM/text-generation-inference[text-generation-inference^] repository.

*OpenVINO Model Server*

* `/v2/models/<model-name>/infer`
I think we're correct about this. Per https://issues.redhat.com/browse/RHOAIENG-3018, Luca considers it a bug (and a regression from the behavior on ModelMesh) that the entire endpoint for OVMS is not exposed in the Inference endpoint field. But at this stage, I don't think that behavior will change for 2.7. For now, I believe that this is the correct additional string that users must add to the URL.
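For context, appending that string produces a KServe v2 REST inference call like the following. This is a hedged sketch, not part of the PR: the base URL, model name, and input tensor name/shape/datatype are placeholder assumptions and depend on the deployed model.

```shell
# Hedged sketch: base URL and model name are placeholder assumptions.
INFERENCE_URL="https://my-model.example.com"   # value shown in the dashboard's Inference endpoint field (assumption)
MODEL_NAME="my-model"                          # your deployed model name (assumption)

# As the comment describes, users currently append the v2 path themselves.
FULL_URL="${INFERENCE_URL}/v2/models/${MODEL_NAME}/infer"

# KServe v2 REST inference request; tensor name, shape, and datatype are
# model-specific placeholders. The guard lets the sketch run outside a cluster.
curl -s -X POST "$FULL_URL" \
  -H 'Content-Type: application/json' \
  -d '{"inputs": [{"name": "input", "shape": [1, 4], "datatype": "FP32", "data": [0.1, 0.2, 0.3, 0.4]}]}' || true
```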
modules/accessing-api-endpoints-for-models-deployed-on-single-model-server.adoc
----

*TGIS Standalone ServingRuntime for KServe*
[source]
----
grpcurl -proto text-generation-inference/proto/generation.proto -d '{"requests": [{"text":"At what temperature does water boil?"}]}' -H 'mm-model-id: <model_name>' -insecure <inference_url>:443 fmaas.GenerationService/Generate
grpcurl -proto text-generation-inference/proto/generation.proto -d '{"requests": [{"text":"<text>"}]}' -H 'mm-model-id: <model_name>' -insecure <inference_url>:443 fmaas.GenerationService/Generate
Same as earlier comment, let's update the replaceable value here to <inference_endpoint_url>.
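Put together with the docs NOTE about the proto files, the full workflow with that placeholder could look like the following sketch. The repository URL comes from the NOTE in the diff; the model name and endpoint remain placeholders, and the guards let the sketch run outside a cluster.

```shell
# Request payload from the docs example; "<text>" stays a placeholder.
PAYLOAD='{"requests": [{"text": "<text>"}]}'

# Download the proto files once, per the NOTE (skipped silently if offline).
git clone --depth 1 https://github.com/IBM/text-generation-inference 2>/dev/null || true

# Placeholder call with the reviewer's suggested <inference_endpoint_url>;
# it fails outside a cluster, hence the trailing guard.
grpcurl -proto text-generation-inference/proto/generation.proto \
  -d "$PAYLOAD" \
  -H 'mm-model-id: <model_name>' \
  -insecure '<inference_endpoint_url>:443' \
  fmaas.GenerationService/Generate 2>/dev/null || true
```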
--
endif::[]
ifdef::self-managed,cloud-service[]
--
*Caikit TGIS ServingRuntime for KServe*
[source]
----
curl --json '{"model_id": "<model_name>", "inputs": "At what temperature does water boil?"}' https://<inference_endpoint_url>:443/api/v1/task/server-streaming-text-generation
curl --json '{"model_id": "<model_name>", "inputs": "<text>"}' https://<inference_endpoint_url>:443/api/v1/task/server-streaming-text-generation
----

*TGIS Standalone ServingRuntime for KServe*
Same as earlier comment, add the (gRPC) part to the name.
@syaseen-rh I don't see this comment addressed.
modules/accessing-api-endpoints-for-models-deployed-on-single-model-server.adoc
modules/adding-a-custom-model-serving-runtime-for-the-single-model-serving-platform.adoc
a32b632 to b03e69d
I see just one unaddressed comment, @syaseen-rh.
https://issues.redhat.com/browse/RHOAIENG-2828
Description
Adding OVMS runtime to single model serving docs
How Has This Been Tested?
Local and downstream builds.