Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Populate local IP address from BPF space #1829

Open
oazizi000 opened this issue Jan 24, 2024 · 1 comment
Open

Populate local IP address from BPF space #1829

oazizi000 opened this issue Jan 24, 2024 · 1 comment

Comments

@oazizi000
Copy link
Contributor

Is your feature request related to a problem? Please describe.
With #1808, we collect local IP addresses from eBPF space when doing server-side tracing. For consistency, we should investigate adding support for collecting the local IP address with client-side tracing. This will provide more consistency in understanding of our tables for all users.

Describe the solution you'd like
Collect and populate local IP address in any tables that contain the local IP address.

Describe alternatives you've considered
Alternative solutions could try to populate it from user-space, but BPF-based collection is preferred for performance and reliability reasons.

Additional context
For background see #1807 #1808 and #1809

@benkilimnik
Copy link
Member

benkilimnik commented Jan 25, 2024

I believe we can merge http_events with tcp_stats_events using the tcp stats connector.

Ran a quick test in PxL and this seems to do the trick.

import px

# Load http events data
df = px.DataFrame(table='http_events', start_time='-300s')
df = df['time_', 'local_port', 'local_addr', 'remote_port', 'remote_addr', 'trace_role']
df = df[df.trace_role == 1]
df = df[df.local_port == -1]

# Load tcp stats data
tcp_stats_df = px.DataFrame(table="tcp_stats_events", start_time='-300s', select=["time_","local_addr", "local_port", "remote_addr", "remote_port"])

# Group by remote_addr and remote_port and aggregate to get one local_addr per remote_addr:port tuple (drop local port column, which presumably contains several local ports).
tcp_stats_df = tcp_stats_df.groupby(['local_addr', 'remote_addr', 'remote_port']).agg()

# merge on connection tuple
# note that there may be duplicates because multiple Pods configured with hostNetwork: true may be using their respective host node's IP addresses, leading them to share the same local IP addresses as those of their nodes.
merged_df = df.merge(tcp_stats_df, how='inner', left_on=["remote_addr", "remote_port"], right_on=["remote_addr", "remote_port"], suffixes=['_http', '_tcp'])

px.display(df, "http events")
px.display(tcp_stats_df, "tcp")
px.display(merged_df, "http merged with tcp")

For this to work, the TCP stats source connector needs to enabled in stirling. Perhaps it would be worth adding it to kProd?

ddelnano added a commit that referenced this issue Sep 5, 2024
Summary: Populate client side trace's local address via tcp kprobes

This change populates client side trace's `local_addr` and `local_port`
columns for the following use cases:
1. To provide more consistency for the protocol data tables. Having
columns that are empty make it difficult for end users to understand
what is being traced and make them less useful
2. To facilitate addressing a portion of the short lived process
problems (#1638)

For 2, the root of the issue is that `df.ctx["pod"]` syntax relies on
the
[px.upid_to_pod_name](https://docs.px.dev/reference/pxl/udf/upid_to_pod_name/)
function. If a PEM misses the short lived process during its metadata
update, this function fails to resolve the pod name. For client side
traces where the pod is making an outbound connection (non localhost),
the `local_addr` column provides an alternative pod name lookup for
short lived processes when the pod is long lived. This means the
following would be equivalent to the `df.ctx["pod"]` lookup:
`px.pod_id_to_pod_name(px.ip_to_pod_id(df.local_addr))`.

I intend to follow this PR with a compiler change that will make
`df.ctx["pod"]` try both methods should `px.upid_to_pod_name` fail to
resolve. This will allow the existing pxl scripts to display the
previously missed short lived processes.

**Alternatives**

Another approach I considered was expanding our use of the `sock_alloc`
kprobe. I used ftrace on a simple curl command to see what other options
could be used (`sudo trace-cmd record -F -p function_graph
http://google.com`). The `socket` syscall calls `sock_alloc`, which
would be another mechanism for accessing the `struct sock`. I decided
against this approach because I don't think its viable to assume that
the same thread/process that calls `socket` will be the one that does
the later syscalls (how our BPF maps are set up). It's common to have a
forking web server model, which means a different process/thread can
call `socket` than the ones that later read/write to it.

**Probe stability**

These probes appear to be stable from our oldest and newest supported
kernel. These functions exist in the
[tcp_prot](https://elixir.bootlin.com/linux/v4.14.336/source/net/ipv4/tcp_ipv4.c#L2422),
[tcpv6_prot](https://elixir.bootlin.com/linux/v4.14.336/source/net/ipv6/tcp_ipv6.c#L1941)
structs and I've seen that other projects and bcc tools use these
probes. This makes me believe that these functions have a pretty well
defined interface.

Relevant Issues: #1829, #1638

Type of change: /kind feature

Test Plan: New tests verify that ipv4 and ipv6 cases work
- [x] Ran `for i in $(seq 0 1000); do curl http://google.com/$i; sleep
2; done` within a pod and verified that `local_addr` is populated with
this change and `px.pod_id_to_pod_name(px.ip_to_pod_id(df.local_addr))`
works for pod name resolution.

- [x] Verified the above curl test results in traces without
`local_addr` without this change

![local-addr-testing](https://github.com/user-attachments/assets/344be022-97a0-4096-8af7-8de20d741e40)
- Tested on the following k8s offerings and machine images
- [x] GKE COS and Ubuntu
- [x] EKS Amazon Linux 2

Changelog Message: Populate socket tracer data table `local_addr` and
`local_port` column for client side traces.

---------

Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
ddelnano added a commit to ddelnano/pixie that referenced this issue Sep 23, 2024
…1989)

Summary: Populate client side trace's local address via tcp kprobes

This change populates client side trace's `local_addr` and `local_port`
columns for the following use cases:
1. To provide more consistency for the protocol data tables. Having
columns that are empty make it difficult for end users to understand
what is being traced and make them less useful
2. To facilitate addressing a portion of the short lived process
problems (pixie-io#1638)

For 2, the root of the issue is that `df.ctx["pod"]` syntax relies on
the
[px.upid_to_pod_name](https://docs.px.dev/reference/pxl/udf/upid_to_pod_name/)
function. If a PEM misses the short lived process during its metadata
update, this function fails to resolve the pod name. For client side
traces where the pod is making an outbound connection (non localhost),
the `local_addr` column provides an alternative pod name lookup for
short lived processes when the pod is long lived. This means the
following would be equivalent to the `df.ctx["pod"]` lookup:
`px.pod_id_to_pod_name(px.ip_to_pod_id(df.local_addr))`.

I intend to follow this PR with a compiler change that will make
`df.ctx["pod"]` try both methods should `px.upid_to_pod_name` fail to
resolve. This will allow the existing pxl scripts to display the
previously missed short lived processes.

**Alternatives**

Another approach I considered was expanding our use of the `sock_alloc`
kprobe. I used ftrace on a simple curl command to see what other options
could be used (`sudo trace-cmd record -F -p function_graph
http://google.com`). The `socket` syscall calls `sock_alloc`, which
would be another mechanism for accessing the `struct sock`. I decided
against this approach because I don't think its viable to assume that
the same thread/process that calls `socket` will be the one that does
the later syscalls (how our BPF maps are set up). It's common to have a
forking web server model, which means a different process/thread can
call `socket` than the ones that later read/write to it.

**Probe stability**

These probes appear to be stable from our oldest and newest supported
kernel. These functions exist in the
[tcp_prot](https://elixir.bootlin.com/linux/v4.14.336/source/net/ipv4/tcp_ipv4.c#L2422),
[tcpv6_prot](https://elixir.bootlin.com/linux/v4.14.336/source/net/ipv6/tcp_ipv6.c#L1941)
structs and I've seen that other projects and bcc tools use these
probes. This makes me believe that these functions have a pretty well
defined interface.

Relevant Issues: pixie-io#1829, pixie-io#1638

Type of change: /kind feature

Test Plan: New tests verify that ipv4 and ipv6 cases work
- [x] Ran `for i in $(seq 0 1000); do curl http://google.com/$i; sleep
2; done` within a pod and verified that `local_addr` is populated with
this change and `px.pod_id_to_pod_name(px.ip_to_pod_id(df.local_addr))`
works for pod name resolution.

- [x] Verified the above curl test results in traces without
`local_addr` without this change

![local-addr-testing](https://github.com/user-attachments/assets/344be022-97a0-4096-8af7-8de20d741e40)
- Tested on the following k8s offerings and machine images
- [x] GKE COS and Ubuntu
- [x] EKS Amazon Linux 2

Changelog Message: Populate socket tracer data table `local_addr` and
`local_port` column for client side traces.

---------

Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
GitOrigin-RevId: 0c1fdd2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants