-
Notifications
You must be signed in to change notification settings - Fork 553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IDMapping field for mount point #1143
Conversation
specs-go/config.go
Outdated
// UID/GID mappings used for changing file owners w/o calling chown, fs should support it | ||
// every mount point could have its own mapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the comment is missing some punctuation.
specs-go/config.go
Outdated
|
||
// UID/GID mappings used for changing file owners w/o calling chown, fs should support it | ||
// every mount point could have its own mapping | ||
IDMappings []LinuxIDMapping `json:"id_mappings,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I don't understand is why we do not have separate UID and GID mappings, can you please explain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I kept in mind one limited case, I agree in for general purpose, better to have both uid and gid.
as @kolyshkin pointed out, I think we need separate mappings for UIDs and GIDs. |
131dca8
to
125f429
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should you describe the new fields config.md
, too?
This is Linux-specific addition to the Mount
struct. Should we introduce a platform specific sub-struct LinuxMountOpts
for clearly separating platform-specific stuff, WDYT?
could we just use something like |
as well as the json schema under |
Mm, I didn't realize that 🙄 Sounds better |
Initially I though to just keep uid/gid as strings into Options. |
specs-go/config.go
Outdated
@@ -117,6 +117,11 @@ type Mount struct { | |||
Source string `json:"source,omitempty"` | |||
// Options are fstab style mount options. | |||
Options []string `json:"options,omitempty"` | |||
|
|||
// UID/GID mappings used for changing file owners w/o calling chown, fs should support it. | |||
// Every mount point could have its own mapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: missing .
at EOL.
specs-go/config.go
Outdated
UIDMappings []LinuxIDMapping `json:"uid_mappings,omitempty"` | ||
GIDMappings []LinuxIDMapping `json:"gid_mappings,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add platform:"linux"
as suggested by @giuseppe (similar to how it is done for Type
above).
This also needs an addition to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach with temporary user namespace allows to create ID mapping per mount point, which is more flexible.
Can I ask what is the use case for that?
I am not sure we want to do that. If we add one mapping per mount to the spec, then we either:
- Runtimes will implement the logic to use a tmp userns for all mounts, which is more complex. Or maybe something more complex might be needed (I honestly don't know if having one userns per mount on each container can hit some limits), like see if we can reuse the userns the OCI container might create, if not create a new one. If other existing mounts need the same mapping, share the userns used...
- Or have runtimes not implement it as it is more complex and no use case really needed it so far, in which case it is just silly to have it on the spec as some runtime might implement it and some others don't, which defeats the purpose of the spec in a way.
I honestly can't see any use case where using different mappings per mount is useful for OCI. I think in most cases we will want to use the same userns as the container. That even supports to migrate to userns without chowning the volumes, I guess almost everyone using userns will want that.
Also, the upside of not doing it now, is that we can in the future add mappings per mount if needed. But we can't remove it later if we add it now.
What is the use case for having different mappings per mount? Also, aren't there limits to create potentially so many userns? I guess not, but unsure.
Maybe I'm looking at this from a very kubernetes centric POV and missing other important use cases, sorry if that is the case, just let me know :)
I'm all in for adding idmap to the runtime-spec, though :)
@@ -117,6 +117,11 @@ type Mount struct { | |||
Source string `json:"source,omitempty"` | |||
// Options are fstab style mount options. | |||
Options []string `json:"options,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idmap option is one of the fstab options and that is why we don't need to mention it in any other part of the spec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it is explicit when a mapping is specified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, so we use idmap if a mapping is specified? Makes sense. Thanks!
if it can help, there was an RFE for crun to allow per-mount customization of the mappings: |
Thanks! That helps, but it doesn't say really the use case, it just says they want to not map root for the volume, they say what they want, not why they want that. I'm unsure, for example, if they wouldn't benefit of not having root at all mapped (nor for the container nor for the volume), because it doesn't really say why they want that. But well, if crun is doing it maybe it is worth supporting it in OCI. What do others think? |
We might have e.g. log grabber which in container1 under user1 and it works with log producers in container2 and container3 working under user2 and user3 appropriately.
Usage of the same userns for idmap as in container requires to have|request that user ns by OCI, but idmap is attribute of mount, but not user namespace, yes, notwithstanding in linux kernel it was implemented by [gid|uid]_map of specific user namespace. But the initial purpose of [gid|uid]_map in /proc/$pid/[gid|uid]_map was to map uid of the processes.
As a summary I chose per mount option because of:
|
Ok, I guess you are talking about a k8s pod, right? That use case wouldn't fit more naturally into using securityContext.supplementalGroups (it takes an array), to add the GIDs that are needed to the log grabber? Or just using fsGroup? Or just having logs o+r to containers in the pod, that is another option too. I find it kind of forced to use idmap for that. Maybe it is just me?
Right. But what is the use case for this in OCI? I'm not saying there is not one, I'm just saying I don't see it.
Right, if in the future we want this, we add the mappings per mount. I'm not sure where you are going with this. What am I missing? As I said, I will not oppose to adding idmap to OCI, I do want to move idmap to OCI. What I'm not sure I see is the use case for this extra complexity. But I'm fine with this if other's think this is needed, of course :) |
IMO the OCI PoV should be as generic as possible to expose the kernel features, even if a use case is not clear yet. Upper layers will then decide if/how to use these knobs. |
@giuseppe @AlexeyPerevalov Makes sense, I buy this now. Sorry for the noise :) |
I'm not sure the entire range of problems with DAC could be solved by groups. The initial intent of id mapping feature was to avoid any preparation steps with image, and take image as is, use it with any user with minimum operation to start container. |
@tianon PTAL 🙏🏻 |
344fb18
to
38c0390
Compare
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com> Co-authored-by: Giuseppe Scrivano <giuseppe@scrivano.org>
38c0390
to
9d1130d
Compare
This has 2 LGTM now, is there anything else we want before merging? :) |
bah, why is pullaprove still here. I thought we removed it. |
That was only ever removed in the runc repo, as part of @kolyshkin's infra clean-up there after he joined as a maintainer. The runtime spec has been languishing in this regard, see also #1101... |
Pullapprove seems to be gone now (replaced by build-pr/run maybe?), but all checks are green. Anything else missing to merge? :) |
Last I checked pullapprove was still blocking the merge, but can confirm it's indeed now gone! 👍 |
Add IDMapping for mount points see opencontainers/runtime-spec#1143 Signed-off-by: Aditya R <arajan@redhat.com>
Add IDMapping for mount points see opencontainers/runtime-spec#1143 Signed-off-by: Aditya R <arajan@redhat.com>
The format is the same as [user namespace mappings](config-linux.md#user-namespace-mappings). | ||
* **`gidMappings`** (array of type LinuxIDMapping, OPTIONAL) The mapping to convert GIDs from the source file system to the destination mount point. | ||
For more details see `uidMappings`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any specific reason that we don't have this for rootfs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't have other mount options for rootfs.
And practically, it won't be very helpful since overlay doesn't support idmapped mounts on the overlay mount itself (only on the lower layers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the spec currently says the runtime should not modify the rootfs permissions: https://github.com/opencontainers/runtime-spec/blob/main/config-linux.md#user-namespace-mappings
The runtime SHOULD NOT modify the ownership of referenced filesystems to realize the mapping.
IDMap for mount point allows to apply file system feature for changing FS entity owner, once and with minimal performance impact.
In runc it could be used for shared volumes (but kernel still doesn't support overlayfs, ext[2,3,4] fs is supported in kernel > 5.17)
This interface implies the ID mapping per mount point.
Technically mount_setattr syscall requires fd of the user namespace to write mount ID map into uid_map in the procfs. Practically it could be arbitrary user ns, e.g. user namespace which was create by runc for container's process, or temporary.
Approach with temporary user namespace allows to create ID mapping per mount point, which is more flexible.