VPA: Create event for VPA object when Pod is evicted #7413

Merged
5 commits merged on Nov 6, 2024

Conversation

adrianmoisey
Member

What type of PR is this?

/kind feature

What this PR does / why we need it:

A user asked for the VPA to emit an event related to the VPA object, in addition to the event related to the Pod object.
This makes sense; I see it as a "log" of sorts of what a particular VPA is up to.
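
A minimal sketch of the idea, not the code from this PR: on eviction, one event is recorded against the Pod and a second against the VPA. The `Recorder` and `Event` types are simplified in-memory stand-ins invented for illustration (the real updater uses a client-go event recorder), and the object names, reason strings, and messages are assumptions.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for a Kubernetes Event.
type Event struct {
	Object  string // "Kind/namespace/name" of the object the event is attached to
	Type    string
	Reason  string
	Message string
}

// Recorder is a hypothetical in-memory event recorder used only in this sketch.
type Recorder struct {
	Events []Event
}

// Event records one event against the given object reference.
func (r *Recorder) Event(object, eventType, reason, message string) {
	r.Events = append(r.Events, Event{
		Object:  object,
		Type:    eventType,
		Reason:  reason,
		Message: message,
	})
}

// recordEviction emits one event on the Pod and one on the VPA, so the
// eviction shows up in the history of both objects.
// The reason strings and messages are illustrative assumptions.
func recordEviction(r *Recorder, podRef, vpaRef string) {
	r.Event(podRef, "Normal", "EvictedByVPA",
		"Pod was evicted by VPA Updater to apply resource recommendation.")
	r.Event(vpaRef, "Normal", "EvictedPod",
		"VPA Updater evicted "+podRef+" to apply resource recommendation.")
}

func main() {
	rec := &Recorder{}
	// "hamster-vpa" is a made-up VPA name.
	recordEviction(rec,
		"Pod/default/hamster-65cd4dd797-rdplp",
		"VerticalPodAutoscaler/default/hamster-vpa")
	for _, e := range rec.Events {
		fmt.Printf("%s %s %s: %s\n", e.Object, e.Type, e.Reason, e.Message)
	}
}
```

With this shape, describing the VPA would list its evictions directly, rather than requiring a search through the events of Pods that may already be gone.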

Which issue(s) this PR fixes:

Fixes #7149

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Create VPA Event (in addition to a Pod event) when a Pod is evicted

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 20, 2024
@k8s-ci-robot k8s-ci-robot requested a review from voelzmo October 20, 2024 19:15
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 20, 2024
Member

@omerap12 omerap12 left a comment

Just my humble opinion.

@@ -322,11 +323,11 @@ func TestEvictReplicatedByReplicaSet(t *testing.T) {
}

for _, pod := range pods[:2] {
- err := eviction.Evict(pod, test.FakeEventRecorder())
+ err := eviction.Evict(pod, getBasicVpa(), test.FakeEventRecorder())
Member

Is there any reason not to do something like:

basicVpa := getBasicVpa()
for _, pod := range pods[:2] {
    err := eviction.Evict(pod, basicVpa, test.FakeEventRecorder())
    assert.Nil(t, err, "Should evict with no error")
}
for _, pod := range pods[2:] {
    err := eviction.Evict(pod, basicVpa, test.FakeEventRecorder())
    assert.Error(t, err, "Error expected")
}

Member Author

Fixed in 5cbf777

var err error
// Add a delay so the fake client can catch up, because it is asynchronous
for i := 0; i < 5; i++ {
events, err = fakeClient.CoreV1().Events("default").List(context.TODO(), metav1.ListOptions{})
Member

Can we do something like this?

const (
    maxRetries     = 5
    retryDelay     = 100 * time.Millisecond
    contextTimeout = 5 * time.Second
)

Member Author

Good idea. I've made them variables instead, I hope that's fine: d16c5e7

var err error
// Add a delay so the fake client can catch up, because it is asynchronous
for i := 0; i < 5; i++ {
events, err = fakeClient.CoreV1().Events("default").List(context.TODO(), metav1.ListOptions{})
Member

Instead of context.TODO() I suggest something like:

ctx, cancel := context.WithTimeout(context.Background(), contextTimeout)
defer cancel()

Member Author

Great, thanks. Done in d16c5e7

return eventBroadcaster.NewRecorder(scheme.Scheme, apiv1.EventSource{Component: "vpa-updater"})

vpascheme := scheme.Scheme
corescheme.AddToScheme(vpascheme)
Member

Check for error?

if err := corescheme.AddToScheme(vpascheme); err != nil {
    klog.Fatalf("Error adding core scheme: %v", err)
}

Member Author

Whoops! Thanks for catching that. Fixed in f6ada00

Member

@omerap12 omerap12 left a comment

Just a couple more comments.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 24, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 27, 2024
@omerap12
Member

omerap12 commented Nov 5, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 5, 2024
Member

@kwiesmueller kwiesmueller left a comment

/lgtm
/approve

/hold
Feel free to unhold, just want to unblock.
WDYT about putting the extra event behind a flag? People may be surprised by an extra event (load-wise) for each eviction. On the other hand, the load from those shouldn't be that high.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 6, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adrianmoisey, kwiesmueller

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 6, 2024
@adrianmoisey
Member Author

WDYT about putting the extra event behind a flag? People may be surprised by an extra event (load-wise) for each eviction. On the other hand, the load from those shouldn't be that high.

Hmmm, I'm not too sure.

I like it "on by default", and I think the load won't be too high. Kubernetes seems to handle events pretty well.
But I'm unsure at what scale people will be running the VPA.

At a guess this may happen roughly as often as the HPA scales Deployments, so my gut feel is that it doesn't need to be behind a flag.

@adrianmoisey
Member Author

What about putting it behind a flag that defaults to "on"?
That way users can opt out if it's too noisy.
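
For illustration only, an opt-out flag that defaults to "on" might be sketched like this with Go's standard `flag` package. The flag name `emit-vpa-eviction-events` and the helper `parseEmitVpaEvents` are invented for this example; nothing in this thread shows such a flag being added.

```go
package main

import (
	"flag"
	"fmt"
)

// parseEmitVpaEvents parses args and reports whether VPA eviction events
// should be emitted. The flag name is hypothetical.
func parseEmitVpaEvents(args []string) (bool, error) {
	fs := flag.NewFlagSet("vpa-updater", flag.ContinueOnError)
	// Default is true, so the extra event is emitted unless users opt out.
	emit := fs.Bool("emit-vpa-eviction-events", true,
		"Emit an event on the VPA object when a Pod is evicted.")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *emit, nil
}

func main() {
	on, _ := parseEmitVpaEvents(nil)
	off, _ := parseEmitVpaEvents([]string{"--emit-vpa-eviction-events=false"})
	fmt.Println(on, off) // prints "true false"
}
```

Defaulting to true keeps the new behaviour visible to everyone while leaving an escape hatch for clusters where the extra events turn out to be too noisy.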

@adrianmoisey
Member Author

I thought about it a bit more. The act of scheduling a Pod often creates a few Events, e.g.:

Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  2m6s  default-scheduler  Successfully assigned default/hamster-65cd4dd797-rdplp to kind-control-plane
  Normal  Pulled     2m6s  kubelet            Container image "registry.k8s.io/ubuntu-slim:0.1" already present on machine
  Normal  Created    2m6s  kubelet            Created container hamster
  Normal  Started    2m6s  kubelet            Started container hamster

I think there may also be an event on the ReplicaSet and, if this were backed by an HPA, one there too.

I'm happy to merge this as-is if you're fine with it.

@kwiesmueller
Member

sgtm
/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 6, 2024
@k8s-ci-robot k8s-ci-robot merged commit 9c91496 into kubernetes:master Nov 6, 2024
7 checks passed
@adrianmoisey adrianmoisey deleted the emit_event_on_vpa branch November 7, 2024 06:17
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/vertical-pod-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Add the event of the pod to the VPA object
4 participants