Add TGIS Standalone SR and ISVC #132

Merged
merged 1 commit into from
Nov 8, 2023

Contributor

@heyselbi heyselbi commented Nov 6, 2023

Description

Adding standalone TGIS ServingRuntime and InferenceService.

Testing instructions

  1. Install KServe on a cluster by following the instructions here
  2. Deploy MinIO (or other storage) with a model (example) and create a new namespace (Steps 1, 2a and 2c)
  3. Deploy the ServingRuntime (SR) from this PR, adding the namespace field to its metadata:
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: tgis-runtime
spec:
  multiModel: false
  supportedModelFormats:
    - autoSelect: true
      name: pytorch
  containers:
    - name: kserve-container
      image: quay.io/opendatahub/text-generation-inference:stable
      command: ["text-generation-launcher"]
      args:
        - "--model-name=/mnt/models/"
        - "--port=3000"
        - "--grpc-port=8033"
      env:
        - name: TRANSFORMERS_CACHE
          value: /tmp/transformers_cache
      # resources: # configure as required
      #   requests:
      #     cpu: 8
      #     memory: 16Gi
      ports:
        - containerPort: 8033
          name: h2c
          protocol: TCP
  4. Deploy the InferenceService (ISVC) from this PR with the storageUri field edited to point to your model, and add the namespace field to its metadata as well:
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  annotations:
    serving.knative.openshift.io/enablePassthrough: "true"
    sidecar.istio.io/inject: "true"
    sidecar.istio.io/rewriteAppHTTPProbers: "true"
  name: tgis-example-isvc
spec:
  predictor:
    serviceAccountName: sa
    model:
      modelFormat:
        name: pytorch
      runtime: tgis-runtime
      storageUri: s3://modelmesh-example-models/llm/models/flan-t5-small-caikit/artifacts
  5. Verify that the ISVC is ready (TEST_NS is assumed to hold the namespace created in step 2):
    oc get isvc/tgis-example-isvc -n ${TEST_NS}
  6. Download both of the proto files here
  7. Perform a sample inference call:
export KSVC_HOSTNAME=$(oc get ksvc tgis-example-isvc-predictor -n ${TEST_NS} -o jsonpath='{.status.url}' | cut -d'/' -f3)
grpcurl -insecure -proto generation.proto \
    -d '{"requests": [{"text":"At what temperature does Nitrogen boil?"}]}' \
    ${KSVC_HOSTNAME}:443 fmaas.GenerationService/Generate
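
The inference step above can also be wrapped in a few small shell helpers, which makes it easier to retry with different prompts. This is only a sketch: the function names host_from_url, generate_payload and tgis_generate are hypothetical and not part of this PR, and it assumes that oc and grpcurl are on the PATH, that generation.proto is in the current directory, and that TEST_NS is set to the test namespace. The hostname extraction uses the same cut approach as the command above.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are hypothetical, not part of this PR.

# Extract the host part from a URL such as https://host.example.com/path.
# Uses the same `cut -d'/' -f3` approach as the testing steps.
host_from_url() {
  printf '%s\n' "$1" | cut -d'/' -f3
}

# Build the JSON request body for fmaas.GenerationService/Generate.
# Assumption: the prompt contains no double quotes or backslashes.
generate_payload() {
  printf '{"requests": [{"text":"%s"}]}' "$1"
}

# Look up the predictor URL and send a single Generate request.
# Assumes TEST_NS is set and generation.proto is in the current directory.
tgis_generate() {
  local url host
  url=$(oc get ksvc tgis-example-isvc-predictor -n "${TEST_NS}" \
    -o jsonpath='{.status.url}') || return 1
  host=$(host_from_url "$url")
  grpcurl -insecure -proto generation.proto \
    -d "$(generate_payload "$1")" \
    "${host}:443" fmaas.GenerationService/Generate
}
```

After sourcing the file, a call such as `tgis_generate "At what temperature does Nitrogen boil?"` should send the same request as the manual command in the last step.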

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that they work.

@openshift-ci openshift-ci bot requested review from Jooho and Xaenalt November 6, 2023 19:37
@openshift-ci openshift-ci bot added the approved label Nov 6, 2023
Signed-off-by: heyselbi <selbi@redhat.com>
@davidesalerno

/lgtm

Contributor

openshift-ci bot commented Nov 8, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: davidesalerno, heyselbi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit ef6cc36 into opendatahub-io:main Nov 8, 2023
1 check passed
@heyselbi heyselbi deleted the tgis branch January 15, 2025 16:25
Development

Successfully merging this pull request may close these issues.

TGIS standalone SR and ISVC