Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

userns: Skip tests if the host doesn't support idmap mounts #1489

Merged
merged 3 commits into from
Jul 10, 2024

Conversation

rata
Copy link
Contributor

@rata rata commented Jul 5, 2024

I'm testing containerd with runc 1.2.0-rc.2 and I found this issue. This (and other fixes that belong to the containerd repo) make the CI happy again.

It will be great if we can include this in a 1.30.1 crit-tools release, so we can update it in the containerd CI. We need to prepare for the runc 1.2.0 final release, so we can just update whenever it is released :).

The containerd PR is this: containerd/containerd#9313. I did a hack (use my fork) to test the fixes indeed fix everything on containerd CI. It does, see here: https://github.com/containerd/containerd/actions/runs/9808474525/job/27084476978?pr=9313.

Without this commit, it fails on AlmaLinux 8 saying the syscall is not implemented: https://github.com/containerd/containerd/actions/runs/9797348691/job/27053863516?pr=9313#step:9:203

Is it possible to release cri-tools 1.30.1 soon with this change?

cc @saschagrunert


critest is used in projects like containerd, that test against older distros (like AlmaLinux 8). In those distros, CI will fail when we upgrade to runc 1.2.0.

With runc 1.1 those test don't fail because runc doesn't support idmap mounts and the tests are skipped in that case. But with runc 1.2.0-rc.2, that supports idmap mounts, the tests are not skipped but fail on distros with older kernels that don't support idmap mounts.

This commit just tries to detect if the path used for the container rootfs supports idmap mounts. To do that it uses the Status() message from CRI with verbose param set to true. It parses the output that containerd sets (it's quite unspecified that field), and otherwise fallbacks to "/var/lib" as the path to test idmap mounts support.

What type of PR is this?

/kind bug
/kind failing-test

What this PR does / why we need it:

We need it to fix the CI in containerd, once we update runc to 1.2.
It might be needed here too, once runc 1.2 is used here

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Skip tests with idmap mounts on host that don't support it

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 5, 2024
@k8s-ci-robot
Copy link
Contributor

Welcome @rata!

It looks like this is your first PR to kubernetes-sigs/cri-tools 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cri-tools has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 5, 2024
@@ -876,6 +879,11 @@ var _ = framework.KubeDescribe("Security Context", func() {
if !supportsUserNamespaces {
Skip("no runtime handler found which supports user namespaces")
}

pathIDMap := rootfsPath(resp.GetInfo())
if err := supportsIDMap(pathIDMap); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we use the runtime handler features feature? shouldn't containerd not report the feature is available if it's not?

Copy link
Contributor Author

@rata rata Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runtime handler features reports if the OCI runtime knows about idmap mounts and how to handle them, this is done at compile time. The thing is, OCI spec mandates that unrecognized lines are ignored. So, an old OCI runtime that doesn't know about idmap mounts, will just ignore the lines "uidMapping" and so on the mounts.

The end result if that happens is horrible, because there is no mapping applied and then you use the volume with the hostUID (the ID on the host the container is mapped to) and then create files. The hostUID changes every time you create the pod, so the volume will be with garbage in the UID/GID of files.

This is why we added to the runtime-spec the way to report in features if idmap mounts are supported: opencontainers/runtime-spec#1219

That is to avoid that situation: you just don't give idmap mounts to runtime that don't know about it (that report no support for idmap mounts). Because they will ignore it and create the aforementioned mess.

So once you know the runtime will handle it properly (using the features you mentioned), you can pass it to the runtime. But idmap mount support is filesystem-dependant. That means, each filesystem used needs to support it. And more filesystems are adding support every kernel release.

So you send the request to the OCI runtime to use idmap mounts on some volumes, but the runtime can return an error if the given filesystem doesn't support idmap mounts on the kernel it is running.

In particular in AlmaLinux 8 what is happening is: runc is new enough to handle them correctly, so we can pass a config.json with idmap mounts to that runc version. So we do. But the kernel is old and doesn't support the syscalls needed (so it doesn't support idmap mounts), so runc returns an error when trying to set it up.

What this PR does is just testing for idmap mounts supports and, if it is not supported, it skips the test.

tl; dr: the features command is useful to know if the OCI runtime knows about idmap mounts at compile time (see more info on the problem it solves here: opencontainers/runtime-spec#1219). However, we can run runc in filesystems that don't support idmap mounts (one fs can support it but another don't), and runc will return an error in those cases. We just skip the tests instead, to avoid the expected error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the shorter answer is: containerd can only report it will be used or fail, but not ignored (as it was before if containerd/runc didn't know about idmap mounts and userns).

But each file-system used by mounts on the container needs to support idmap mounts. So it can't report yes/no without knowing which pod (e.g. if the pod users NFS, then it will fail, but if only uses ext4 and tmpfs, it will work on lot of kernels).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what file system are we trying to create a mount here? if runc thinks alma support idmapped mounts I'd expect all the mounts we create to work. like, we control what environments we run these tests in. We could make sure they're not nfs but instead ext4 or tmpfs (whatever the alma kernel supports at that time). What file systems are these tests running on that are failing, but runc is saying it's supporteD?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haircommander I'm not sure what you are thinking. I'm definitely missing something, but I'm not sure what.

runc can't do that, that is really not possible. For several reasons, one is that the spec mandates to not do that:

The Features structure is irrelevant to the actual availability of the features in the host operating system. Hence, the content of the Features structure SHOULD be determined on the compilation time of the runtime, not on the execution time.

(from https://github.com/opencontainers/runtime-spec/blob/main/features.md)

And this to be at compilation time is really reasonable, for several reasons.

Another reason is that the syscalls are not supported in AlmaLinux 8 (opentree, mount_setattr and move_mount).

As I said in the first message: the features thing is only useful to see if the runtime will not ignore the idmap setting and create a mess. It's okay to return an error if some fs is not supported, the kernel is too old or something. That is what we want. We just need to skip tests on setups that will fail.

Am I missing something?

pkg/validate/security_context_linux.go Outdated Show resolved Hide resolved
pkg/validate/security_context_linux.go Outdated Show resolved Hide resolved
critest is used in projects like containerd, that test against older
distros (like AlmaLinux 8). In those distros, CI will fail when we
upgrade to runc 1.2.0.

With runc 1.1 those test don't fail because runc doesn't support idmap
mounts and the tests are skipped in that case. But with runc 1.2.0-rc.2,
that supports idmap mounts, the tests are not skipped but fail on
distros with older kernels that don't support idmap mounts.

This commit just tries to detect if the path used for the container
rootfs supports idmap mounts. To do that it uses the Status() message
from CRI with verbose param set to true. It parses the output that
containerd sets (it's quite unspecified that field), and otherwise
fallbacks to "/var/lib" as the path to test idmap mounts support.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 8, 2024
@rata rata force-pushed the rata/userns-fixes branch 3 times, most recently from 160798c to d929d7a Compare July 8, 2024 15:44
Sascha suggested to run this only once. Let's cache the answer from the
runtime and move the tests that need idmap mounts on the host to
`When("Host idmap mount support is needed"`.

While we split the tests in that way, let's just query idmap mount
support for the tests that need it, using the cache.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 9, 2024
@saschagrunert
Copy link
Member

saschagrunert commented Jul 9, 2024

/hold

Feel free to lift the hold once we got more review or you think this is good to be merged.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 9, 2024
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 9, 2024
@rata rata force-pushed the rata/userns-fixes branch 4 times, most recently from 3b082e9 to 2db7481 Compare July 9, 2024 15:33
@rata
Copy link
Contributor Author

rata commented Jul 9, 2024

The test fail was unrelated to this, some github issue:

fatal: unable to access 'https://github.com/containernetworking/plugins.git/': Failed to connect to github.com port 443 after 21 ms: Connection refused

I pushed again due to that.

@saschagrunert thanks a lot! I polished the code and fixed some edge cases now. I've also verified it passes all the tests in the containerd CI too(https://github.com/containerd/containerd/actions/runs/9859447900/job/27223414193?pr=9313) :)

containerd creates a userns and inside there, it runs the critest tool.
However, in that setup, the length of containerd's userns is not the
whole UID space.

Let's verify that the length of the userns inside the pod, when we
created it with NamespaceMode_NODE (IOW, when not using a new userns for
the pod) is the same as outside the pod.

This works fine when contained itself runs inside a userns.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 10, 2024
@saschagrunert
Copy link
Member

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 10, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rata, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 767879c into kubernetes-sigs:master Jul 10, 2024
23 checks passed
@saschagrunert
Copy link
Member

I created release-1.30 and a corresponding cherry-pick in #1492

@rata rata deleted the rata/userns-fixes branch July 10, 2024 08:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants