Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report reconciliation errors as part of the component' status (pt1) #1684

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

lburgazzoli
Copy link
Contributor

@lburgazzoli lburgazzoli commented Feb 20, 2025

Description

This commit introduces some enhancement to what and how components are
reporting theirs status by:

  • Includes a Ready top-level condition which summarizes more detailed
    conditions
  • Includes a ProvisioningSucceeded condition that reports any
    provisioning error, i.e. a deployment fails to be provisioned because
    of any invalid fields, some pre-requisites are not met
apiVersion: components.platform.opendatahub.io/v1alpha1
kind: ModelRegistry
metadata:
  name: default-modelregistry
spec:
  registriesNamespace: odh-model-registries
status:
  conditions:
  - lastTransitionTime: "2025-02-03T13:10:32Z"
    message: 0/1 deployments ready
    reason: DeploymentsNotReady
    status: "False"
    type: Ready
  - lastTransitionTime: "2025-02-03T12:55:45Z"
    status: "True"
    type: ProvisioningSucceeded
  - lastTransitionTime: "2025-02-03T13:10:32Z"
    message: 0/1 deployments ready
    reason: DeploymentsNotReady
    status: "False"
    type: DeploymentsAvailable
  - lastTransitionTime: "2025-02-03T12:55:32Z"
    status: "True"
    type: ServerlessAvailable

Note

in the case we want a condition to be set to False, but that should not affect the top level condition, it is possible to set the severity field as Info (the default is Error and it is being represented by an empty value):

  - lastTransitionTime: "2025-02-03T13:10:42Z"
    reason: FooReasin
    severity: Info
    status: "True"
    type: Foo

If needed, additional conditions can be added by individual component or
services:

_, err := reconciler.ReconcilerFor(mgr, &componentApi.ModelRegistry{}).
    // ...
    WithConditions(
	status.ConditionDeploymentsAvailable,
	status.ConditionServerlessAvailable).
    Build(ctx)

To help managing conditions, a conditions.Manager type has been
inrtoduced, largely inspired by Knative's conditions set code [1], but
improved to handle our specific use cases.

[1] https://github.com/knative/pkg/blob/main/apis/condition_set.go

https://issues.redhat.com/browse/RHOAIENG-18216

How Has This Been Tested?

Screenshot or short clip

Merge criteria

  • You have read the contributors guide.
  • Commit messages are meaningful - have a clear and concise summary and detailed explanation of what was changed and why.
  • Pull Request contains a description of the solution, a link to the JIRA issue, and to any dependent or related Pull Request.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Copy link

openshift-ci bot commented Feb 20, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from lburgazzoli. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@lburgazzoli lburgazzoli requested review from grdryn and zdtsw February 20, 2025 16:40
Copy link

codecov bot commented Feb 20, 2025

Codecov Report

Attention: Patch coverage is 45.62118% with 267 lines in your changes missing coverage. Please review.

Project coverage is 23.39%. Comparing base (2fa3afc) to head (25374d6).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
pkg/controller/conditions/conditions.go 67.59% 50 Missing and 8 partials ⚠️
...ers/components/kserve/kserve_controller_actions.go 0.00% 35 Missing ⚠️
.../modelregistry/modelregistry_controller_actions.go 0.00% 26 Missing ⚠️
...pelines/datasciencepipelines_controller_actions.go 0.00% 24 Missing ⚠️
pkg/controller/reconciler/reconciler.go 79.56% 16 Missing and 3 partials ⚠️
controllers/components/kueue/kueue_controller.go 0.00% 15 Missing ⚠️
...status/deployments/action_deployments_available.go 50.00% 13 Missing and 1 partial ⚠️
...llers/components/kueue/kueue_controller_actions.go 0.00% 7 Missing ⚠️
...llers/services/monitoring/monitoring_controller.go 0.00% 7 Missing ⚠️
pkg/controller/predicates/dependent/dependent.go 0.00% 7 Missing ⚠️
... and 16 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1684      +/-   ##
==========================================
+ Coverage   22.01%   23.39%   +1.37%     
==========================================
  Files         168      169       +1     
  Lines       11359    11597     +238     
==========================================
+ Hits         2501     2713     +212     
- Misses       8601     8617      +16     
- Partials      257      267      +10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@lburgazzoli
Copy link
Contributor Author

/test opendatahub-operator-e2e

2 similar comments
@lburgazzoli
Copy link
Contributor Author

/test opendatahub-operator-e2e

@lburgazzoli
Copy link
Contributor Author

/test opendatahub-operator-e2e

@lburgazzoli
Copy link
Contributor Author

/test all

@lburgazzoli
Copy link
Contributor Author

/test opendatahub-operator-e2e

if err := cli.Get(ctx, client.ObjectKey{Name: kserveConfigMapName, Namespace: dscispec.ApplicationsNamespace}, &kserveConfigMap); err != nil {
err := cli.Get(ctx, client.ObjectKey{Name: kserveConfigMapName, Namespace: dscispec.ApplicationsNamespace}, &kserveConfigMap)
if errors.IsNotFound(err) {
return "", nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will set k.Statu.sDefaultDeploymentMode to "" ?
or should it return string(Serverless), nil

Copy link
Contributor Author

@lburgazzoli lburgazzoli Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something not introduced by this PR, I've only added the IsNotFound check to avoid swallowing an error

@grdryn should know more

@lburgazzoli
Copy link
Contributor Author

/test opendatahub-operator-e2e

@lburgazzoli
Copy link
Contributor Author

/retest

@lburgazzoli lburgazzoli requested a review from zdtsw February 27, 2025 11:51
@lburgazzoli lburgazzoli marked this pull request as draft February 27, 2025 16:12
@lburgazzoli lburgazzoli marked this pull request as ready for review February 27, 2025 18:11
lburgazzoli and others added 8 commits February 27, 2025 19:13
This commit introduces some enhancement to what and how components are
reporting theirs status by:

- Includes a `Ready` top-level condition which summarizes more detailed
  conditions
- Includes a `ProvisioningSucceeded` condition that reports any
  provisioning error, i.e. a deployment fails to be provisioned because
  of any invalid fields, some pre-requisites are not met

```yaml
apiVersion: components.platform.opendatahub.io/v1alpha1
kind: ModelRegistry
metadata:
  name: default-modelregistry
spec:
  registriesNamespace: odh-model-registries
status:
  conditions:
  - lastTransitionTime: "2025-02-03T13:10:32Z"
    message: 0/1 deployments ready
    reason: DeploymentsNotReady
    status: "False"
    type: Ready
  - lastTransitionTime: "2025-02-03T12:55:45Z"
    status: "True"
    type: ProvisioningSucceeded
  - lastTransitionTime: "2025-02-03T13:10:32Z"
    message: 0/1 deployments ready
    reason: DeploymentsNotReady
    status: "False"
    type: DeploymentsAvailable
  - lastTransitionTime: "2025-02-03T12:55:32Z"
    status: "True"
    type: ServerlessAvailable
```
> [!NOTE]
> in the case we want a condition to be set to `False`, but that should not affect
> the top level condition, it is possible to set the `severity` field as `Info`.
> The default is `Error` and it is being represented by an empty value.
>
>```yaml
>   - lastTransitionTime: "2025-02-03T13:10:42Z"
>     reason: FooReasin
>     severity: Info
>     status: "True"
>     type: Foo
>```

If needed, additional conditions can be added by individual component or
services:

```go
_, err := reconciler.ReconcilerFor(mgr, &componentApi.ModelRegistry{}).
    // ...
    WithConditions(
	status.ConditionDeploymentsAvailable,
	status.ConditionServerlessAvailable).
    Build(ctx)
```

To help managing conditions, a `conditions.Manager` type has been
inrtoduced, largely inspired by Knative's conditions set code [1], but
improved to handle our specific use cases.

[1] https://github.com/knative/pkg/blob/main/apis/condition_set.go
Co-authored-by: Wen Zhou <wenzhou@redhat.com>
…actions.go

Co-authored-by: Wen Zhou <wenzhou@redhat.com>
…actions.go

Co-authored-by: Wen Zhou <wenzhou@redhat.com>
Copy link
Member

@zdtsw zdtsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

good to have another eyes on this @grdryn maybe ?

@openshift-merge-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants