Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split caikit-tgis into separate containers #107

Merged
merged 5 commits into from
Oct 20, 2023
Merged

Split caikit-tgis into separate containers #107

merged 5 commits into from
Oct 20, 2023

Conversation

dtrifiro
Copy link
Contributor

Description

  • add Dockerfile for caikit-nlp with tgis backend
  • update docs example (InferenceService+ServingRuntime)
  • remove smoketest (this breaks with the new caikit-only image

How Has This Been Tested?

This can be tested by running the provided example in either a local kserve deployment or openshift:

This example uses a pvc/pvc, but the model can be also stored in s3 or other object storage

  1. Create a PV/PVC
apiVersion: v1
kind: PersistentVolume
metadata:
  name: caikit-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/mnt/models/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: caikit-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi

Download a model and convert it to caikit format (note: this requires git-lfs to be installed and configured

git clone https://huggingface.co/google/flan-t5-small
python -c 'import caikit_nlp; \
model = caikit_nlp.text_generation.TextGeneration.bootstrap("flan-t5-small"); \
model.save("flan-t5-small-caikit")'
  1. Copy the model to the pv

  2. Edit docs/caikit-isvc.yaml and make storageUri point to the model:
    storageUri: pvc://caikit-claim/flan-t5-small-caikit/

3 Apply the manifests,

oc apply -f docs/caikit-isvc.yaml -f docs/caikit-servingruntime.yaml
  1. After port forwarding to the inference service, it should be able to query the grpc endpoint
oc port-forward deployments/caikit-example-isvc-predictor-00001-deployment 8085 &>/dev/null &
grpcurl -plaintext -d '{"text": "At what temperature does Nitrogen boil?"}' -H "mm-model-id: flan-t5-small-caikit" 127.0.0.1:8085 caikit.runtime.Nlp.NlpService/TextGenerationTaskPredict

Verify that the inference is running on the tgis instance by looking at the logs of the transformer container:

oc logs deployments/caikit-example-isvc-predictor-00001-deployment -c transformer-container -f     

In a subsequent PR, I'll provide an automatic method of testing this setup, either via docker-compose and/or by deploying these manifests.

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Makefile Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
@dtrifiro dtrifiro force-pushed the split-dockerfile branch 2 times, most recently from 31b65e4 to 5b53ed0 Compare October 18, 2023 11:09
Dockerfile Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Copy link
Member

@Xaenalt Xaenalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few small ones on the Dockerfile and the ServingRuntime, but overall looking great!

@Xaenalt
Copy link
Member

Xaenalt commented Oct 18, 2023

Overall, PR is looking great, great job!

@Xaenalt
Copy link
Member

Xaenalt commented Oct 19, 2023

Added a few more notes to my comments (message here for timestamp tracking)

- create caikit image using UBI as base
- use poetry to lock deps instead of pipenv
Copy link
Member

@Xaenalt Xaenalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

@Xaenalt
Copy link
Member

Xaenalt commented Oct 20, 2023

/hold

@openshift-ci openshift-ci bot removed the lgtm label Oct 20, 2023
@dtrifiro
Copy link
Contributor Author

Updated the images/tags names as discussed with @Xaenalt

@Xaenalt
Copy link
Member

Xaenalt commented Oct 20, 2023

/unhold

@openshift-ci openshift-ci bot added the lgtm label Oct 20, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 20, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dtrifiro, Xaenalt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kpouget
Copy link

kpouget commented Oct 20, 2023

do you guys know in which version of RHOAI this will land?

@Xaenalt
Copy link
Member

Xaenalt commented Oct 20, 2023

1.36 is the official release, you can use it ahead of time from the ODH repo

@Xaenalt Xaenalt merged commit 9bdebad into main Oct 20, 2023
@kpouget
Copy link

kpouget commented Oct 20, 2023

@Xaenalt , actually, it's this image that I'm using at the moment: quay.io/opendatahub/caikit-tgis-serving:stable
will it be updated automatically following this merge? 🤔

@dtrifiro dtrifiro deleted the split-dockerfile branch October 20, 2023 16:03
@dtrifiro
Copy link
Contributor Author

@Xaenalt , actually, it's this image that I'm using at the moment: quay.io/opendatahub/caikit-tgis-serving:stable will it be updated automatically following this merge? 🤔

yep

@@ -9,7 +9,10 @@ metadata:
spec:
predictor:
model:
apiVersion: serving.kserve.io/v1alpha2
Copy link
Contributor

@bdattoma bdattoma Nov 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to update the KServe CRD as well? currently we ship v1beta1 @dtrifiro @Xaenalt

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to align this with the current KServe version, so I think we'd want this to be v1beta1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cakit+TGIS combined image/SR
6 participants