Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v0.11.1] Fix controller setup - add schemas based on CRDs available #265

Conversation

israel-hdez
Copy link

@israel-hdez israel-hdez commented Mar 7, 2024

What this PR does / why we need it:

In #233 an enhancement was done to KServe controllers to watch resources based on available CRDs.

A similar change in the setup of the manager was overlooked: it is also needed to add schemas to the manager based on available CRDs rather than only the values in the inferenceservice-config ConfigMap. This would keep both manager setup and controller setup in sync with regards schemas and watches around the CRDs.

This is, basically, a cherry-pick of commti 5180ec5 from upstream PR (not merged yet).

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes https://issues.redhat.com/browse/RHOAIENG-4196

Type of changes
Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

Feature/Issue validation/testing:

Recommended testing is by using OpenShift.

  • Checkout the code from this pull request (use the method you like)
  • Deploy OSSM to your cluster by running ./test/scripts/openshift-ci/deploy.ossm.sh (available in this repo / the checked out code).
  • Deploy Serverless to your cluster by running ./test/scripts/openshift-ci/deploy.serverless.sh
  • Create the odh namespace: oc new-project opendatahub
  • Deploy the buggy KServe image: oc apply -k 'github.com/red-hat-data-services/kserve/config/overlays/odh?ref=rhoai-2.8'

Once KServe is deployed, you should be able to look at the logs and it is going to be running fine. Let's do the problematic setup:

  • Undeploy Serverless: oc delete knativeserving -n knative-serving knative-serving
  • Undeploy OSSM: oc delete smcp -n istio-system basic
  • Patch the KServe ConfigMap: oc patch cm -n opendatahub inferenceservice-config --type merge -p '{"data": {"deploy": "{\"defaultDeploymentMode\": \"RawDeployment\"}"}}'
  • Restart the kserve controller pod: oc delete pod -n opendatahub -l control-plane=kserve-controller-manager

Once it restarts, you should notice kind must be registered to the Scheme errors in the logs of the pod and it will enter in a CrashLoop state. Now, deploy the fixed image:

  • Build the image by yourself:
    • Declare the IMG envvar to your own container registry; e.g. export IMG=quay.io/edgarhz/kserve-controller:pr-265
      • This tag actually exists, in case you prefer to use this pre-built image.
    • Build the image and push it to your registry: make docker-build docker-push
  • Deploy the fixed image: oc set image -n opendatahub deploy/kserve-controller-manager manager=$IMG

Once the KServe pod restarts with the fixed image, you should observe it is stable and there are no errors in the logs.

Copy link

openshift-ci bot commented Mar 7, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@israel-hdez
Copy link
Author

@zdtsw Tagging you just FYI :-)


ksvcFound, ksvcCheckErr := utils.IsCrdAvailable(cfg, knservingv1.SchemeGroupVersion.String(), constants.KnativeServiceKind)
if ksvcCheckErr != nil {
log.Error(ksvcCheckErr, "error when checking if Knative Services are available")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Knative Services -> Knative Service kind

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -146,13 +147,27 @@ func main() {
log.Error(err, "unable to get ingress config.")
os.Exit(1)
}
if deployConfig.DefaultDeploymentMode == string(constants.Serverless) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this removed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The upstream PR where all related changes are present should make it clear. I'll refer to it here to give clarity.

I'll take as an example the changes to the inference controller here: https://github.com/kserve/kserve/pull/3472/files#diff-f14605793ed521dabc5c4794584c2961365d2d8324ad41f009aa44b1513d50feR321.

Notice that an equivalent condition around deployConfig.DefaultDeploymentMode is also being removed in the controller.

So, this removal aligns the conditionals that were modified in the controllers (that were merged in #233).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

if ingressConfig.DisableIstioVirtualHost == false {
}

if ingressConfig.DisableIstioVirtualHost == false {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be simplified !ingressConfig.DisableIstioVirtualHost

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Member

@spolti spolti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

besides the changes required by Yuan, the changes looks good to me.

@openshift-ci openshift-ci bot added the lgtm label Mar 7, 2024
@spolti spolti removed the lgtm label Mar 7, 2024
In opendatahub-io#233 an enhancement was done to KServe controllers to watch resources based on available CRDs.

A similar change in the setup of the manager was overlooked: it is also needed to add schemas to the manager based on available CRDs rather than only the values in the inferenceservice-config ConfigMap. This would keep both manager setup and controller setup in sync with regards schemas and watches around the CRDs.

Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
Copy link
Member

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Copy link

openshift-ci bot commented Mar 7, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: israel-hdez, spolti, terrytangyuan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [israel-hdez,spolti,terrytangyuan]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit b1dd255 into opendatahub-io:release-v0.11.1 Mar 7, 2024
20 checks passed
@heyselbi
Copy link

heyselbi commented Mar 7, 2024

Tested it in an OpenShift 4.14.13 cluster. Passed the test per instructions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants