Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ScaleUp for check-capacity ProvisioningRequestClass #6451

Merged
merged 5 commits into from
Jan 30, 2024

Conversation

yaroslava-serdiuk
Copy link
Contributor

@yaroslava-serdiuk yaroslava-serdiuk commented Jan 16, 2024

What type of PR is this?

/kind feature

Implement ScaleUp for check-capacity ProvisioningRequestClass. It will try to schedule pods on existed nodes and if it's succesful it update ProvReq status and book capacity for those pods.

What this PR does / why we need it:

part of implementation https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/provisioning-request.md

Special notes for your reviewer:

The full implementation of check-capacity is not completed yet, pods from ProvisioningRequest will be injected and processed in the following PRs. When the implementation is completed I'll add release-notes.

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 16, 2024
@Shubham82
Copy link
Contributor

IMO this looks like a work in progress
/retitle [WIP]ScaleUp for check-capacity ProvisioningRequestClass

@k8s-ci-robot k8s-ci-robot changed the title ScaleUp for check-capacity ProvisioningRequestClass [WIP]ScaleUp for check-capacity ProvisioningRequestClass Jan 16, 2024
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 16, 2024
@yaroslava-serdiuk yaroslava-serdiuk changed the title [WIP]ScaleUp for check-capacity ProvisioningRequestClass ScaleUp for check-capacity ProvisioningRequestClass Jan 16, 2024
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 16, 2024
@yaroslava-serdiuk
Copy link
Contributor Author

@Shubham82 probably my initial comment is misleading. The ScaleUp part of ProvisioningRequestClass is done, so this PR is ready for review.

@yaroslava-serdiuk yaroslava-serdiuk force-pushed the provreq-2 branch 2 times, most recently from 245e4cc to 4bce196 Compare January 16, 2024 17:40
@Shubham82
Copy link
Contributor

@Shubham82 probably my initial comment is misleading. The ScaleUp part of ProvisioningRequestClass is done, so this PR is ready for review.

Ok @yaroslava-serdiuk
Maybe I misunderstood it, Thanks for the clarification.

Copy link
Contributor

@kisieland kisieland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial pass done

return nil
}
// scheduling the pods to reserve capacity for provisioning request with BookCapacity condition
o.injector.TrySchedulePods(o.context.ClusterSnapshot, podsToCreate, scheduling.ScheduleAnywhere, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the last parameter of the function? Can we comment what is happening here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I see that the Function can return error, we should at least log it. And if should happen, maybe even return it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last bool value is breakOnFailure. Here we continue to try schedule pods even if some of them weren't scheduled.

Copy link
Contributor

@kisieland kisieland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left couple of comments on how to handle the conditions

return false
}
condition := pr.Conditions()[len(pr.Conditions())-1]
if condition.Type == string(v1beta1.CapacityAvailable) && condition.Status == v1.ConditionTrue {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should do a proper check here, see the comment below.

return false
}

func setCondition(pr *provreqwrapper.ProvisioningRequest, conditionType string, reason, message string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have the following cases in the conditions:
BookedCapacity=False -> we did not provision capacity but we tried
BookedCapacity=True -> we got capacity and it is actively booked
BookedCapacity=True && Expired=True -> The capacity expired
Failed=True -> We failed
BookedCapacity=False Failed=True -> We failed but made a try before

I would also vote for removal of the Pending=True condition as we can communicate this information by BookedCapacity=False

@yaroslava-serdiuk yaroslava-serdiuk force-pushed the provreq-2 branch 2 times, most recently from 29f6dc2 to b493ccc Compare January 18, 2024 15:33
Copy link
Contributor

@kisieland kisieland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Left couple comments for the test cases

Copy link
Contributor

@kisieland kisieland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments

Copy link
Contributor

@kisieland kisieland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot
Copy link
Contributor

@kisieland: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Shubham82
Copy link
Contributor

Apply lgtm as the above one has not been added.
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 29, 2024
Copy link
Contributor

@BigDarkClown BigDarkClown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a single nit + a question about potential starvation.

…tor_test.go

Co-authored-by: Bartek Wróblewski <bwroblewski@google.com>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 29, 2024
@yaroslava-serdiuk
Copy link
Contributor Author

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jan 29, 2024
@BigDarkClown
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 30, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BigDarkClown, kisieland, yaroslava-serdiuk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 30, 2024
@k8s-ci-robot k8s-ci-robot merged commit ed6ebbe into kubernetes:master Jan 30, 2024
6 checks passed
@yaroslava-serdiuk
Copy link
Contributor Author

#6814

@yaroslava-serdiuk yaroslava-serdiuk deleted the provreq-2 branch May 28, 2024 23:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cluster-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants