cgroup tracking #3170
Conversation
Force-pushed from 2194168 to 7c5baa3.
Checkpatch failure cannot be fixed.
Overall it looks good, though it's a large PR! Cgroup tracking looks cool as well. Thanks, I have a few comments.
not a big fan of the name, could you use something like tester-shell or something a bit more explicit? Also I'm not sure I understand why the code is in pkg/testutils/ and not here, to run the tests without changing everything else?
OK, will rename it to test-helper. I was also not very happy with "tester".
Also I'm not sure I understand why the code is in pkg/testutils/ and not here, to run the tests without changing everything else?
It's easier to develop both the test and the code that uses it in the same file. For example, you can add both the commands and the helper for the tests that use it.
func OpenMap(fname string) (Map, error) {
	m, err := ebpf.LoadPinnedMap(fname, &ebpf.LoadPinOptions{})
do we want to close this map at some point or should it just die with tetragon?
Well, it's on the caller to close the map. I don't see how we could close it from OpenMap.
We should be closing it for the non-global uses, but for global we keep it open until tetragon dies.
yep yep agree, just pointed this line to talk about it
@@ -103,6 +103,7 @@ type config struct {
	EnableCgIDmap      bool
	EnableCgIDmapDebug bool
	EnableCgTrackerID  bool
nit: naming is a bit off because you have ID but Cg, but you have other names with this also.
Not sure I understand 🤔
@@ -78,7 +78,7 @@ func cgfsMkdirTemp(t *testing.T, cgfsPath string) string {
	// create tempdir
	dir, err := os.MkdirTemp(cgfsPath, "cgtracker-test-*")
	if err != nil {
-		t.Fatalf("failed to create test dir: %s", err)
+		t.Skipf("skipping test, failed to create test dir: %s", err)
I understand the decision to skip; maybe what you would need for this case is to create a dir in a subsystem directory, like memory for v1, to make it work, no?
maybe, would need to investigate more.
Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
GetTestSensorManager requires the observer, which means that if we try to use pkg/testutils/sensors from something that the observer imports, we will get an import cycle. Move GetTestSensorManager to a new package (pkg/testutils/observer) to avoid this. This is needed for a subsequent commit. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
A future change will introduce an import cycle: package github.com/cilium/tetragon/pkg/cgtracker imports github.com/cilium/tetragon/pkg/sensors imports github.com/cilium/tetragon/pkg/policyfilter imports github.com/cilium/tetragon/pkg/process imports github.com/cilium/tetragon/pkg/cgidmap imports github.com/cilium/tetragon/pkg/cgtracker: import cycle not allowed in test. Fix it by moving GetWorkloadMetaFromPod from process to the podhelpers package. After this change, policyfilter no longer needs to import process, which breaks the cycle. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
The method is to be used in a subsequent patch. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Add a simple tester program. The idea is that tester acts as a simple shell that we can use to execute commands without having to exec again. This allows us, for example, to add tester to a cgroup of our choosing and have everything it executes run in the same cgroup. Tests in subsequent commits will use it. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Add a utility function to copy an exec file to temp. This allows us to do reliable binary filtering. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Force-pushed from 7c5baa3 to b3070ec.
Thanks! Pushed a new version.
Pod association with cgidmap is based on the cgroup id. One limitation of the current implementation is that if a cgroup is created inside the container, processes running in this nested cgroup will not be properly associated with the container. To address this, this commit introduces the cgtracker map that maps these nested cgroup ids (tracked ids) to the container cgroup id (tracker id). There is already some older BPF code (see bpf/cgroup and bpf/lib/bpf_cgroup.h) aimed at a similar idea, but it is only used in tests. The idea behind the old code was to use the cgroup level as a way to identify which cgroups to track. The current code follows a simpler approach and relies on user-space to set up the initial values of the map (e.g., from rthooks). This will be added in subsequent commits. The plan is to migrate the tests to the new approach and remove this old code, but this is left as a followup. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
This commit adds userspace code for cgtracker and a test. The test loads the cgtracker program and maps and checks that the programs operate as expected. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
This commit adds a cgtracker command to the CLI aimed for troubleshooting. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Update the cgtracker map from userspace in two paths in the cgidmap code: (i) in a runtime hook, when we learn of a new container, and (ii) in the cri-resolver, as a result of an unmapped id from a pod event. For case (i), cgtracker is updated before the container starts running, so whatever cgroups are created once the container is created should be tracked by the BPF hooks. Case (ii) is somewhat trickier, because a cgroup creation from the container may race AddCgroupTrackerPath() and we might miss cgroups. We leave making this more robust as a followup. One potential solution is to traverse more than one level towards the root in the BPF hook. Another is to periodically query the CRI and scan the cgroup paths for IDs we are missing. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Add a configuration option to enable the cgroup tracker id. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
This commit modifies the exec event to include a new field called cgroup tracker id (cgrp_tracker_id in bpf and CgrpTrackerID in go). The intention of this field is to track children cgroups. Specifically, in a container environment the cgroup of the container will be used to track all children cgroups within this container. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
This commit adds a test for cgtracker and the execve events, which checks that if we create a child cgroup, the tracker id is what we expect. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
When running with cri enabled, the criResolver goroutine needs to call cgtracker. The problem is that the cgtracker map is created when the base sensor is loaded, so it is not available when the criResolver first starts. This commit adds a retry loop to allow the base sensor to be loaded, so that the map becomes available to the criResolver goroutine. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
This is to avoid import cycles, because the cgtracker test imports from the exec sensor. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Commit 0227f5d introduced pod association using cgidmap. This commit extends that functionality to also support nested cgroups via cgtracker. Cgtracker allows pod association to work even if containers create their own cgroup hierarchy. Commit 0227f5d changed the GetPodInfo function to take the cgroup id and map it to a container id inside the function. This, however, will not work for the cgroup tracker. In the existing code, GetPodInfo() is only executed if the cgroup name we get from the bpf side matches the container name. This is fine for non-nested cgroups, but it will not work for nested cgroups because their name might be anything. We could remove the check, but this would mean that we would need to retry resolving pod information for every non-container process in the system. Instead, this commit moves the association earlier, into msgToExecveUnix. If we find a container id, then the process code will attempt pod association via GetPodInfo. Using runtime-hooks should ensure that we have the necessary (cgroup id -> container id) mapping before the container starts. For other cases, one possible solution would be to implement a retry, but we would need a way to distinguish which processes need retrying (e.g., by marking processes in bpf with cgroups under a certain hierarchy). Furthermore, when using cgtracker, there is no need to use GetCgroupIDFromSubCgroup introduced in b1e8c44. In fact, using it would produce an error, because we need the actual container cgroup id, since that is what is being tracked. Hence, this commit also changes criResolve accordingly. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Remove stale comments, since previous commits implemented cgroup tracking. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
In GH runners, we get: --- FAIL: TestCgTrackerMap (0.12s) cgtracker_test.go:73: '/sys/fs/cgroup' is cgroup v1 cgtracker_test.go:81: failed to create test dir: mkdir /sys/fs/cgroup/cgtracker-test-1715271346: read-only file system So it seems that we cannot execute the test there. Skip it for now. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Force-pushed from b3070ec to 5e924fb.
For now just a comment; I'll give it a review next week. It seems it will break in cgroupv1.
	cgid = get_cgroup_id(cgrp);
	if (cgid == 0)
		return 0;
	cgid_parent = cgroup_get_parent_id(cgrp);
Hmm, didn't have time to review it all, but want to point out that this is not cgroupv1 compatible: the mkdir and release delete will work on any cgroupv1 hierarchies, and if kernfs ids clash it could lead to corruption?
The old branch has this check: https://github.com/cilium/tetragon/blob/pr/tixxdz/cgroup-bpf-full/bpf/cgroup/bpf_cgroup_mkdir.c#L33
So this could break in cgroupv1. Will check closely later.
I think it's fine to have this only for cgroup v2 at the moment. That being said, it's unclear what the problem is for v1. What are the kernfs IDs that could clash?
you are following all hierarchies in cgroupv1! including the ones that we don't track, why?
The kernfs IDs are cgroup IDs; if they are re-used, you remove them from the tracking and it won't work anymore, see your cgrp release bpf program.
you are following all hierarchies in cgroupv1! including the ones that we don't track, why?
We can add a filter to track only the configured hierarchy for v1. My first goal was to support cgroup v2, so that's what the first implementation does. Any cgroup v1 support might or might not work, since we do not have tests for it.
The kernfs IDs are cgroup IDs; if they are re-used, you remove them from the tracking and it won't work anymore, see your cgrp release bpf program.
Not sure I understand. Are you saying that cgroup ids are unique only within a specific hierarchy?
you are following all hierarchies in cgroupv1! including the ones that we don't track, why?
We can add a filter to track only the configured hierarchy for v1. My first goal was to support cgroup v2, so that's what the first implementation does. Any cgroup v1 support might or might not work, since we do not have tests for it.
Exactly my point, and with that linked bpf cgroup helper, it transparently handles it.
The kernfs IDs are cgroup IDs; if they are re-used, you remove them from the tracking and it won't work anymore, see your cgrp release bpf program.
Not sure I understand. Are you saying that cgroup ids are unique only within a specific hierarchy?
Yes, cgroupv1 hierarchies are separate cgroup mounts backed by kernfs; each kernfs node that is part of a mount has its own unique ID, which is the inode number. The allocation is predictable using IDR, last time I checked.
To make sure I understand correctly, are you saying that it's possible for two cgroup nodes of different hierarchies to have the same kernfs id?
Thanks, merging this. I'm happy to address additional feedback as a followup.
I would at least specifically state this is only cgroupv2 compatible, since cgroupv1 was not tested.
See commits