Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add request value verification for hugepage #98515

Merged
merged 1 commit into from
Mar 6, 2021

Conversation

lala123912
Copy link
Contributor

@lala123912 lala123912 commented Jan 28, 2021

What type of PR is this?
/kind bug

What this PR does / why we need it:
Limit the request value of hugepage to integer multiple of page size to ensure the memory allocated is equal to the request.

Which issue(s) this PR fixes:

Fixes #98513

Special notes for your reviewer:
the limitation can refer to https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1539-hugepages

The quantity of huge pages must be a positive amount of memory in bytes. The specified amount must align with the ; otherwise, the pod will fail validation. For example, it would be valid to request hugepages-2Mi: 4Mi, but invalid to request hugepages-2Mi: 3Mi.

Does this PR introduce a user-facing change?:

hugepages request values are limited to integer multiples of the page size.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 28, 2021
@k8s-ci-robot k8s-ci-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 28, 2021
@lala123912
Copy link
Contributor Author

/cc @bart0sh

@k8s-ci-robot k8s-ci-robot requested a review from bart0sh January 28, 2021 07:43
@lala123912
Copy link
Contributor Author

/test pull-kubernetes-node-e2e

@lala123912
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-ubuntu-containerd

@bart0sh
Copy link
Contributor

bart0sh commented Jan 28, 2021

@lala123912 Thank you for your contribution. As far as I remember Linux kernel supports huge pages of many different sizes depending on the hardware platform, e.g. 4K, 1M etc. What's the reason for limiting size to only two values?

@fejta-bot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@lala123912
Copy link
Contributor Author

lala123912 commented Jan 28, 2021

@lala123912 Thank you for your contribution. As far as I remember Linux kernel supports huge pages of many different sizes depending on the hardware platform, e.g. 4K, 1M etc. What's the reason for limiting size to only two values?

I have modified the code for different page sizes .

the limitation can refer to https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1539-hugepages

The quantity of huge pages must be a positive amount of memory in bytes. The specified amount must align with the <hugepagesize>; otherwise, the pod will fail validation. For example, it would be valid to request hugepages-2Mi: 4Mi, but invalid to request hugepages-2Mi: 3Mi.

@bart0sh

@lala123912 lala123912 force-pushed the huge_page branch 3 times, most recently from 90e17b8 to 2e483b6 Compare January 29, 2021 08:05
}
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have ValidateResourceRequirements API tested with hugepage requests and limits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added two cases to test ValidateResourceRequirements in pkg/apis/core/validation/validation_test.go.

if !helper.IsHugePageResourceValueValid(resourceName, quantity) {
limContainsHugePages = false
allErrs = append(allErrs, field.Invalid(fldPath, quantity.String(),
fmt.Sprintf("value must be integer multiple of %s limit", resourceName)))
Copy link
Contributor

@bart0sh bart0sh Jan 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if resourceName is "hugepages" the error would be "value must be integer multiple of hugepages limit". This is hard to understand from my point of view. Would it be better to somehow use quantity in the error message?

Copy link
Contributor Author

@lala123912 lala123912 Jan 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the error msg below.

fmt.Errorf("%s is not integer multiple of %s", quantity.String(), name)

test logs:

* spec.containers[0].resources.limits[hugepages-2Mi]: Invalid value: "5347737600m": 5347737600m is not integer multiple of hugepages-2Mi
* spec.containers[0].resources.requests[hugepages-2Mi]: Invalid value: "5347737600m": 5347737600m is not integer multiple of hugepages-2Mi

if !helper.IsHugePageResourceValueValid(resourceName, quantity) {
reqContainsHugePages = false
allErrs = append(allErrs, field.Invalid(fldPath, quantity.String(),
fmt.Sprintf("value must be integer multiple of %s limit", resourceName)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd suggest to reorganize this code a little for better readability, e.g.:

if helper.IsHugePageResourceValueValid(resourceName, quantity) {
    reqContainsHugePages = True
} else {
    allErrs = append(allErrs, ...
}

Copy link
Contributor

@bart0sh bart0sh Jan 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea is to move this check into ValidateResourceQuantityValue as essentially it is a resource quantity value check. This would eliminate a need to do it twice in this place as ValidateResourceQuantityValue is already called for resources and limits here. wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ValidateResourceQuantityValue is used for check various kinds of resource quantity value, such as node.Status.Capacity. Instead, I add a new function validateResourceQuantityHugePageValue so that the impact of the verification is smaller and IsHugePageResourceValueValid() is only called once here.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 30, 2021
@lala123912
Copy link
Contributor Author

/test pull-kubernetes-integration

@bart0sh
Copy link
Contributor

bart0sh commented Feb 1, 2021

/lgtm

@lala123912 lala123912 force-pushed the huge_page branch 2 times, most recently from 8ca78e8 to 6bd1535 Compare March 5, 2021 03:24
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 5, 2021
@lala123912
Copy link
Contributor Author

/retest

@lala123912
Copy link
Contributor Author

/retest

@lala123912 lala123912 force-pushed the huge_page branch 2 times, most recently from 82ebb94 to 4d8ebba Compare March 5, 2021 08:52
@lala123912
Copy link
Contributor Author

/test pull-kubernetes-unit

@lala123912
Copy link
Contributor Author

/retest

@liggitt
Copy link
Member

liggitt commented Mar 6, 2021

/triage accepted
/priority important-soon
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 6, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lala123912, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 6, 2021
@k8s-ci-robot k8s-ci-robot merged commit 4e95e1d into kubernetes:master Mar 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-review Categorizes an issue or PR as actively needing an API review. approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/apps Categorizes an issue or PR as relevant to SIG Apps. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: API review completed, 1.21
8 participants