Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dynamic extraction of a parameter attribute #3143

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

ScriptSathi
Copy link
Contributor

@ScriptSathi ScriptSathi commented Nov 19, 2024

The discussion for this PR can be found here #3142

This is currently a draft.
I wanted to start the discussion before submitting the final code, as I think, it is a big enough PR.

Take this PR in the today state as a proof of concept for dynamic parameter extraction. I will continue to work on this PR until the below checks are done.

At the current state, the PR is able to

  • Extract parameters from LSM programs
  • Extract parameters from kprobes programs
  • Handle exception for uprobes as it is not part of BTF
  • Have unit tests for all cases
  • Have written documentation

Description

This PR introduce the dynamic extraction of a custom attribute

Comments

  • I do not like the use of OverwriteType parameter. But since the function argSelectorType does not receive EventConfig, it is not possible to search for the type using directly BTF types. So I suggest doing another PR before this one is merged to do so if possible. Then I'll remove the parameter. It is also not possible to add an if condition for uprobes in this function. So if the user defines the parameter, it will overwrite the type at this moment.
  • I choose to use u8 to store the offset, as the verifier does not allow me to use u16. At this moment, I don't know if it is possible to found offset > 255 in BTF structures. For my uses cases, using u8 was enough.
  • I would appreciate seeing a closer relationship between BTF and the existing types. This mean that we could directly call it from YAML instead of hard-coding it. At least doing so for primitives types, and hard-code only the complex structures

Test the PR

You can use the following config

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "lsm"
spec:
  lsmhooks:
  - hook: "bprm_check_security"
    args:
      - index: 0
        type: "linux_binprm"
        extractParam: "file.f_path.dentry.d_name.name"
        overwriteType: "string"
    selectors:
      - matchArgs:
        - index: 0
          operator: "Postfix"
          values:
            - "ls"
            - "sh"
            - "bash"

If you want to test it with more arguments, you can use bprm_creds_from_file hook. It has struct linux_binprm and struct file which are supported.

Release-note

Add dynamic extraction of a parameter attribute

@ScriptSathi ScriptSathi requested a review from a team as a code owner November 19, 2024 16:27
@ScriptSathi ScriptSathi requested a review from jrfastab November 19, 2024 16:27
@ScriptSathi ScriptSathi marked this pull request as draft November 19, 2024 16:28
@ScriptSathi ScriptSathi force-pushed the pr/ScriptSathi/add-dynamic-paramter-extraction branch from 9a9d21f to ce33ec7 Compare November 20, 2024 10:48
Copy link

netlify bot commented Nov 20, 2024

Deploy Preview for tetragon ready!

Name Link
🔨 Latest commit a04e634
🔍 Latest deploy log https://app.netlify.com/sites/tetragon/deploys/67814c806294f10008ea68ac
😎 Deploy Preview https://deploy-preview-3143--tetragon.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@ScriptSathi ScriptSathi force-pushed the pr/ScriptSathi/add-dynamic-paramter-extraction branch from ce33ec7 to 5ebb3e0 Compare November 20, 2024 16:00
@mtardy mtardy linked an issue Nov 21, 2024 that may be closed by this pull request
2 tasks
@mtardy mtardy added the release-note/major This PR introduces major new functionality label Nov 21, 2024
@ScriptSathi ScriptSathi force-pushed the pr/ScriptSathi/add-dynamic-paramter-extraction branch 2 times, most recently from e3117d7 to ba048b3 Compare November 21, 2024 10:59
Copy link
Contributor

@olsajiri olsajiri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, left some comments

thanks for splitting the code in multiple logical changes, it's rare ;-)

we were originally thinking of using C expression parsing (there's some go lib for that) but I think that can be done later if it's ever needed, the basic parsing you did should be fine

please add tests, would be great to have some framework where we could easily add various 'ExtractParam' expressions for testing

pkg/sensors/tracing/genericuprobe.go Show resolved Hide resolved
bpf/process/types/basic.h Outdated Show resolved Hide resolved
pkg/btf/btf.go Show resolved Hide resolved
pkg/btf/btf.go Outdated Show resolved Hide resolved
bpf/process/generic_calls.h Outdated Show resolved Hide resolved
bpf/process/types/basic.h Outdated Show resolved Hide resolved
bpf/process/generic_calls.h Outdated Show resolved Hide resolved
bpf/process/generic_calls.h Outdated Show resolved Hide resolved
pkg/sensors/tracing/generickprobe.go Show resolved Hide resolved
@ScriptSathi ScriptSathi force-pushed the pr/ScriptSathi/add-dynamic-paramter-extraction branch from ba048b3 to 1fb7dbc Compare November 23, 2024 17:54
@pchaigno pchaigno removed the request for review from jrfastab December 9, 2024 18:05
@ScriptSathi ScriptSathi force-pushed the pr/ScriptSathi/add-dynamic-paramter-extraction branch 2 times, most recently from 753fd10 to 2b1153f Compare January 10, 2025 14:12
@ScriptSathi ScriptSathi changed the title Add dynamic parameter extraction Add dynamic extraction of a parameter attribute Jan 10, 2025
This commit introduces the `struct config_btf_arg_depth`. It appends
`btf_argN` to `struct event_config` as an array. This array stores the
path to the searched data.

Any `btf_argN` can have a list of elements as follow :
file->f_path is 152 bytes, so the array will look like
[{ offset: 152, is_pointer: 0, is_initialized: 1 }, ...].

The max value `MAX_BTF_ARG_DEPTH` as been set arbitrary as the verifier
need a fixed size.

In config_btf_arg, is_pointer and is_initialized are u16 because it must
match padding of 64 bits long structure

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
@ScriptSathi ScriptSathi force-pushed the pr/ScriptSathi/add-dynamic-paramter-extraction branch from 2b1153f to a04e634 Compare January 10, 2025 16:36
@ScriptSathi ScriptSathi marked this pull request as ready for review January 10, 2025 16:38
@ScriptSathi ScriptSathi requested a review from mtardy as a code owner January 10, 2025 16:38
Tristan d'Audibert added 2 commits January 10, 2025 17:54
…_argN

This commit introduces the “extract_arg_depth” function and loops over it
to move into the “arg” buffer of config->btf_argN[i]->offset by iteration.
The goal is to reach the requiered data by overwriting over arg with
the new value.

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
This commit checks if btf.Type exists in Tetragon existing types.

For instance:

`struct file` with btf is called `file` and also exists in
`GenericStringToType` with the same alias.
However, the attribute `name` in `struct qstr` has a returned type
`unsigned char` wich does not exist yet in `GenericStringToType`.

The same thing happened with `linux_binprm->filename` as the type is
`char`

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
@ScriptSathi ScriptSathi force-pushed the pr/ScriptSathi/add-dynamic-paramter-extraction branch from a04e634 to 058c828 Compare January 10, 2025 16:55
Tristan d'Audibert added 7 commits January 10, 2025 19:11
FindNextBtfType function recursively searches in a btf structure in order
to find a specific path until it reaches the target or fails.

The function also searches in embedded anonymous structures or unions to
cover as much use cases as possible. For instance, mm_struct has 2
fields; anonymous struct and another type. But you are still able to look
into the anonymous struct by specifying a path like "mm.pgd.pgd".

For instance, if the search is in the linux_binprm structure and the
path is `file.f_path.dentry.d_name.name`, the following
actions will be done.

- Look for the variable name `file` inside `linux_binprm`.

- If it matches, it stores the offset from linux_binprm where the `file`
  variable could be found.

- Then it takes the btf type `file` and searches for a parameter named
  `f_path`.

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
In order to read the data properly on BPF side, integer/long values must
use `bpf_probe_read`. So now, every time the latest type retrieved is
defined as an integer, it will be safely read before accessing the data.

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
This commit adds 2 parameters to give the ability to the user to search
for a specific variable following a "path" as follow:

```yml
...
    args:
      - index: 0
        type: "linux_binprm"
        extractParam: "file.f_path.dentry.d_name.name"
        overwriteType: "string"
...
```

The above config can be used to extract a specific parameter from the
structure at index 0.

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
The OverwriteType parameter should be deleted if `argSelectorType` can use
the `EventConfig` structure to search for the correct Type.

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
Searches if every user defined type with ExtractParam exists as a BTF
type and stores its corresponding offset in ConfigBtfArg.

This function does a basic split on `ExtractParam` to obtain the "path"
to the required data. Then, the array is given to `btf.FindNextBTFType` to
find the offset of each element until we reach the required data.
The output is stored in EventConfig to keep the normal behaviour.

For example, if the arg 0 is `struct linux_binprm` and ExtractParam is
set to `file.f_path.dentry.d_name.name`, the output will give an array
of all the offsets from their parents as follows :
[{ offset: 96, is_pointer: 0 }, { offset: 152, is_pointer: 1 }, ...]

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
This commit updates `addLsm` function to use the `ExtractParam`
and `OverwriteType` in order to look for the attributes in BTF
structure.

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
…probes

As BTF types are not defined for Uprobes, their offsets can't be found in
BTF file. Thus, with this commit, if the user defines ExtractParam /
OverwriteType, their are ignored and a warning is displayed

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
Tristan d'Audibert added 5 commits January 10, 2025 19:11
Add very similar code as in `genericlsm.go` file to handle ExtractParam
feature.

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
This commit adds 3 tests for FindNextBtfType algorithm.
The first is `testAssertEqualBtfPath` to assert that a specific
path has the exact same btfConfig as expected.
The second, "testAssertPathIsAccessible" tries to reach the path and
asserts that no errors are raised.
The third test "testAssertErrorOnInvalidPath" asserts that the error
messages raised if the path is incorrect.

The chosen test cases have embed union/anonymous structs.

The important thing to notice in case of adding new tests is to be
aware of btf changes on different architectures. Especially for
`testAssertEqualBtfPath` where Offset could be different.

To Test locally :
```
go test -exec "sudo" ./pkg/btf/
```

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
This test aims at testing the ExtractParam feature.

Tetragon use hard-coded types instead of BTF. So currently, the
ExtractParam feature does not allow to extract attributes from other
types than the few hard-coded in `generictypes.go`.

For instance, if you try to use the feature with task_struct structure,
it will fail because it is not yet hard-coded.

Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
Signed-off-by: Tristan d'Audibert <tristan.daudibert@orange.com>
@ScriptSathi ScriptSathi force-pushed the pr/ScriptSathi/add-dynamic-paramter-extraction branch from 058c828 to 391c5a8 Compare January 10, 2025 18:32
@kkourt kkourt self-requested a review January 14, 2025 14:55
Copy link
Contributor

@olsajiri olsajiri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great, I left some comments, thanks

@@ -172,6 +185,11 @@ struct event_config {
*/
__u32 policy_id;
__u32 flags;
struct config_btf_arg btf_arg0[MAX_BTF_ARG_DEPTH];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not follow the user space and have just btf_arg[5][10]
I don't see btf_arg[1-4] being used anywhere

"fmt"
"strings"

_btf "github.com/cilium/ebpf/btf"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit _ looks strange to me, would preffer ebtf which we use on some other place

return processMembers(btfArgs, currentType, t.Members, pathToFound, i)
case *btf.Pointer:
(*btfArgs)[i-1].IsPointer = uint16(1)
return FindNextBtfType(btfArgs, t.Target, pathToFound, i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is called (at beggining) with i == 0, but you are setting [i -1] field, could this crash?


Return: The last type found matching the path, or error.
*/
func FindNextBtfType(
Copy link
Contributor

@olsajiri olsajiri Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is mostly called like:
lastBtfType, err := btf.FindNextBtfType(btfArg, rootType, partialPath, 0)
which seems confusing because it's called next but return last type,
also always called with zero index

how about we add another top level function like ResolveParam that would
prepare the path array and called local function that would recurse?

@@ -341,6 +341,10 @@ func addUprobe(spec *v1alpha1.UProbeSpec, ids []idtable.EntryID, in *addUprobeIn
return nil, fmt.Errorf("Error add arg: ArgType %s Index %d out of bounds",
a.Type, int(a.Index))
}

if a.ExtractParam != "" || a.OverwriteType != "" {
logger.GetLogger().Warnf("Extracting parameters from Uprobes is not supported, ignoring")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's fail the policy, it's wrong IMO

assert.ErrorContains(t, err, "Attribute 'fail' not found in structure ''")
}

func TestFindNextBtf(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test crashes in CI, please check

{Offset: 64, IsPointer: 1, IsInitialized: 1},
{Offset: 168, IsPointer: 1, IsInitialized: 1},
{Offset: 0, IsPointer: 1, IsInitialized: 1},
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this will fail the same way like TestBuildBtfArgFromLSMPolicy on offset values?
could you just read expected offsets from BTF?

- index: 0
type: "linux_binprm"
extractParam: "mm.owner.real_parent.real_parent.comm"
overwriteType: "string"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need overwriteType? could we maybe just do something like:

 - index: 0
   type: "string"
   extractParam: "linux_binprm.file.f_path.dentry.d_name.name"

the overwrite stuff might cause headaches later

also maybe rename extractParam to resolve? not sure ;-)

cc @kkourt

struct config_btf_arg btf_arg1[MAX_BTF_ARG_DEPTH];
struct config_btf_arg btf_arg2[MAX_BTF_ARG_DEPTH];
struct config_btf_arg btf_arg3[MAX_BTF_ARG_DEPTH];
struct config_btf_arg btf_arg4[MAX_BTF_ARG_DEPTH];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not follow the user space part and have just btf_arg[5][MAX_BTF_ARG_DEPTH]
I did not see btf_arg[1-4] being touched elsewhere..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/major This PR introduces major new functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add dynamic parameter extration for specific use cases
3 participants