Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix read "max" value for cgroup v2 #2837

Merged
merged 2 commits into from
Mar 26, 2021
Merged

Conversation

giuseppe
Copy link
Contributor

and also simplify some other code to behave like cgroup v1

Signed-off-by: Giuseppe Scrivano gscrivan@redhat.com

read the memory stats from the specified cgroup instead of looking the
values up in the ancestor cgroups.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
when the "max" value is found handle it as math.MaxUint64.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@google-cla google-cla bot added the cla: yes label Mar 24, 2021
@k8s-ci-robot
Copy link
Collaborator

Hi @giuseppe. Thanks for your PR.

I'm waiting for a google member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@giuseppe
Copy link
Contributor Author

@odinuge PTAL

return spec, err
}
if memoryRoot != "" {
if utils.FileExists(path.Join(memoryRoot, "memory.max")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@odinuge odinuge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small nit, otherwise lgtm

@@ -226,7 +203,10 @@ func readString(dirpath string, file string) string {

func readUInt64(dirpath string, file string) uint64 {
out := readString(dirpath, file)
if out == "" || out == "max" {
if out == "max" {
return math.MaxUint64
Copy link
Contributor

@odinuge odinuge Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The max value for cgroup v1 is 9223372036854771712 (max signed integer rounded to nearest page, at least on x86). So to be correct it should be something like this I guess:

Suggested change
return math.MaxUint64
return math.MaxInt64^0xFFF

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with changing it, but would it be a problem with cgroup v1? "max" is used only on cgroup v2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember where this value is used, but I just thought it was nice with the same value for both cgroup v2. If you try to write MaxInt64^0xFFF + 0x1 you get max on cgroup v2, and 9223372036854771712 on v1. 9223372036854771712 is also default on v1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: This is used for more than the memory limits, and my above comments only apply to those. Hmm

@giuseppe giuseppe marked this pull request as ready for review March 25, 2021 15:34
@giuseppe
Copy link
Contributor Author

a similar fix for cpu specs: #2839

@@ -226,7 +203,10 @@ func readString(dirpath string, file string) string {

func readUInt64(dirpath string, file string) uint64 {
out := readString(dirpath, file)
if out == "" || out == "max" {
if out == "max" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have you decided to change behaviour for out == ""?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for out == "" I've left the same behavior there was before 2ccad4b

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, my bad!

@iwankgb
Copy link
Collaborator

iwankgb commented Mar 25, 2021

/ok-to-test

@odinuge
Copy link
Contributor

odinuge commented Mar 26, 2021

This change is probably related to:

cadvisor/manager/manager.go

Lines 473 to 479 in fa07332

if spec.HasMemory {
// Memory.Limit is 0 means there's no limit
if spec.Memory.Limit == 0 {
m.machineMu.RLock()
spec.Memory.Limit = uint64(m.machineInfo.MemoryCapacity)
m.machineMu.RUnlock()
}

Does that code do anything at all if this change merged?

Also, the description of this value is probably wrong, since it says -1 is default, but it is an unsigned value...

// The amount of memory requested. Default is unlimited (-1).
// Units: bytes.
Limit uint64 `json:"limit,omitempty"`

@giuseppe
Copy link
Contributor Author

yes it will be ignored.

And I think this is the expectation, at least from Kubernetes, since there is a e2e tests failing right now as the memory limit is being set while it is expected to be unlimited (isMemoryUnlimited in pkg/kubelet/stats/helper.go)

@iwankgb iwankgb merged commit 4765480 into google:master Mar 26, 2021
chenchun pushed a commit to chenchun/kubernetes that referenced this pull request Mar 20, 2024
chenchun pushed a commit to chenchun/kubernetes that referenced this pull request Mar 20, 2024
移植upstream对kubelet及cadvisor的修改,修复使用cgroupv2时指标收集统计的问题
1. port cadvisor pr google/cadvisor#2839 reading cpu stats on cgroup v2
2. port cadvisor pr google/cadvisor#2837 read "max" value for cgroup v2
3. port cadvisor pr google/cadvisor#2801 gathering of stats for root cgroup on v2
4. port cadvisor pr google/cadvisor#2800: Update heuristic for container creation time
5. Fix cgroup handling for systemd with cgroup v2
6. test: adjust summary test for cgroup v2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants