[P0] SPIKE: making sure the long running connection works (http/grpc streaming) #33

Closed
heyselbi opened this issue Jul 19, 2023 · 1 comment
Labels: odh-release/1.8 (Need to do for ODH Release v1.8.0), rhods-1.32

Comments

@heyselbi

No description provided.

@heyselbi heyselbi transferred this issue from opendatahub-io/caikit-tgis-serving Jul 19, 2023
@heyselbi heyselbi moved this from Backlog to Groomed Issues in ODH Model Serving Planning Jul 19, 2023
@heyselbi heyselbi moved this from Groomed Issues to To-do This Sprint in ODH Model Serving Planning Jul 19, 2023
@heyselbi heyselbi added the odh-release/1.8 Need to do for ODH Release v1.8.0 label Aug 3, 2023
@heyselbi heyselbi moved this from To-do This Sprint to In Progress in ODH Model Serving Planning Aug 8, 2023
@danielezonca

OK, I have verified that it works.

More details:

  • I have followed this guide to provision the cluster (using the new Operator)
  • Deployed the model following the same guide
  • Tested the normal "all tokens" call using the command below (note: replace caikit-example-isvc-predictor-default with caikit-example-isvc-predictor if using KServe 0.11)
export TEST_NS=kserve-demo
export KSVC_HOSTNAME=$(oc get ksvc caikit-example-isvc-predictor-default -n ${TEST_NS} -o jsonpath='{.status.url}' | cut -d'/' -f3)
grpcurl -insecure -d '{"text": "At what temperature does liquid Nitrogen boil?"}' -H "mm-model-id: flan-t5-small-caikit" ${KSVC_HOSTNAME}:443 caikit.runtime.Nlp.NlpService/TextGenerationTaskPredict
{
  "generated_text": "74 degrees F",
  "generated_tokens": "5",
  "finish_reason": "EOS_TOKEN",
  "producer_id": {
    "name": "Text Generation",
    "version": "0.1.0"
  }
}
  • Tested "stream token" call using
export TEST_NS=kserve-demo
export KSVC_HOSTNAME=$(oc get ksvc caikit-example-isvc-predictor-default -n ${TEST_NS} -o jsonpath='{.status.url}' | cut -d'/' -f3)
grpcurl -insecure -d '{"text": "At what temperature does liquid Nitrogen boil?"}' -H "mm-model-id: flan-t5-small-caikit" ${KSVC_HOSTNAME}:443 caikit.runtime.Nlp.NlpService/ServerStreamingTextGenerationTaskPredict                                                                  
{
 "details": {
   
 }
}
{
 "tokens": [
   {
     "text": "",
     "logprob": -1.5990846157073975
   }
 ],
 "details": {
   "generated_tokens": 1
 }
}
{
 "generated_text": "74",
 "tokens": [
   {
     "text": "74",
     "logprob": -3.3622496128082275
   }
 ],
 "details": {
   "generated_tokens": 2
 }
}
{
 "generated_text": " degrees",
 "tokens": [
   {
     "text": "▁degrees",
     "logprob": -0.516351580619812
   }
 ],
 "details": {
   "generated_tokens": 3
 }
}
{
 "generated_text": " F",
 "tokens": [
   {
     "text": "▁F",
     "logprob": -1.1749719381332397
   }
 ],
 "details": {
   "generated_tokens": 4
 }
}
{
 "tokens": [
   {
     "text": "\u003c/s\u003e",
     "logprob": -0.009402398951351643
   }
 ],
 "details": {
   "finish_reason": "EOS_TOKEN",
   "generated_tokens": 5
 }
}

How caikit gRPC services work

Caikit automatically/dynamically creates a gRPC service to expose a Caikit task of a module.
In the case of caikit-nlp we are using TextGenerationTask. When the task is declared, it is annotated with its parameters (in this case text) and its return types for both the unary and the streaming output.
This implies that the task supports both the all-tokens invocation and the streaming use case.
Via a naming convention, Caikit generates different service methods (a sketch of the task declaration follows the list below):

  • %TaskName%Predict: for all-tokens invocation
  • ServerStreaming%TaskName%Predict: for streaming invocation
  • BidiStreaming%TaskName%Predict: for a bidirectional stream (a stream of inputs and a stream of outputs). This is not supported by TextGenerationTask.
    See this test for a full example
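
For illustration, here is a minimal sketch of what such a task declaration could look like in Python. This is an assumption-based sketch derived from the description above, not the exact upstream code: the import paths and decorator keyword names (unary_parameters, unary_output_type, streaming_output_type) may differ between caikit versions.

# Illustrative sketch only; import paths and decorator keywords are assumptions
# and may vary across caikit versions.
from typing import Iterable

from caikit.core import TaskBase, task
from caikit.interfaces.nlp.data_model import (
    GeneratedTextResult,
    GeneratedTextStreamResult,
)


@task(
    unary_parameters={"text": str},
    unary_output_type=GeneratedTextResult,
    streaming_output_type=Iterable[GeneratedTextStreamResult],
)
class TextGenerationTask(TaskBase):
    """A single declaration from which the caikit runtime derives both RPCs:
    NlpService/TextGenerationTaskPredict (unary, all tokens) and
    NlpService/ServerStreamingTextGenerationTaskPredict (token stream).
    """

The grpcurl calls shown earlier invoke exactly these two generated methods on caikit.runtime.Nlp.NlpService.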

@github-project-automation github-project-automation bot moved this from In Progress to Done in ODH Model Serving Planning Aug 11, 2023
Jooho pushed a commit to Jooho/kserve that referenced this issue Nov 29, 2023
[Cherry-pick] Support verify variable with storage-config json style (fix-3263)