Populate local IP address from BPF space #1829

oazizi000 · 2024-01-24T17:49:58Z

Is your feature request related to a problem? Please describe.
With #1808, we collect local IP addresses from eBPF space when doing server-side tracing. For consistency, we should investigate adding support for collecting the local IP address with client-side tracing. This will provide more consistency in understanding of our tables for all users.

Describe the solution you'd like
Collect and populate local IP address in any tables that contain the local IP address.

Describe alternatives you've considered
Alternative solutions could try to populate it from user-space, but BPF-based collection is preferred for performance and reliability reasons.

Additional context
For background see #1807 #1808 and #1809

benkilimnik · 2024-01-25T23:46:22Z

I believe we can merge http_events with tcp_stats_events using the tcp stats connector.

Ran a quick test in PxL and this seems to do the trick.

import px

# Load http events data
df = px.DataFrame(table='http_events', start_time='-300s')
df = df['time_', 'local_port', 'local_addr', 'remote_port', 'remote_addr', 'trace_role']
df = df[df.trace_role == 1]
df = df[df.local_port == -1]

# Load tcp stats data
tcp_stats_df = px.DataFrame(table="tcp_stats_events", start_time='-300s', select=["time_","local_addr", "local_port", "remote_addr", "remote_port"])

# Group by remote_addr and remote_port and aggregate to get one local_addr per remote_addr:port tuple (drop local port column, which presumably contains several local ports).
tcp_stats_df = tcp_stats_df.groupby(['local_addr', 'remote_addr', 'remote_port']).agg()

# merge on connection tuple
# note that there may be duplicates because multiple Pods configured with hostNetwork: true may be using their respective host node's IP addresses, leading them to share the same local IP addresses as those of their nodes.
merged_df = df.merge(tcp_stats_df, how='inner', left_on=["remote_addr", "remote_port"], right_on=["remote_addr", "remote_port"], suffixes=['_http', '_tcp'])

px.display(df, "http events")
px.display(tcp_stats_df, "tcp")
px.display(merged_df, "http merged with tcp")

For this to work, the TCP stats source connector needs to enabled in stirling. Perhaps it would be worth adding it to kProd?

Summary: Populate client side trace's local address via tcp kprobes This change populates client side trace's `local_addr` and `local_port` columns for the following use cases: 1. To provide more consistency for the protocol data tables. Having columns that are empty make it difficult for end users to understand what is being traced and make them less useful 2. To facilitate addressing a portion of the short lived process problems (#1638) For 2, the root of the issue is that `df.ctx["pod"]` syntax relies on the [px.upid_to_pod_name](https://docs.px.dev/reference/pxl/udf/upid_to_pod_name/) function. If a PEM misses the short lived process during its metadata update, this function fails to resolve the pod name. For client side traces where the pod is making an outbound connection (non localhost), the `local_addr` column provides an alternative pod name lookup for short lived processes when the pod is long lived. This means the following would be equivalent to the `df.ctx["pod"]` lookup: `px.pod_id_to_pod_name(px.ip_to_pod_id(df.local_addr))`. I intend to follow this PR with a compiler change that will make `df.ctx["pod"]` try both methods should `px.upid_to_pod_name` fail to resolve. This will allow the existing pxl scripts to display the previously missed short lived processes. **Alternatives** Another approach I considered was expanding our use of the `sock_alloc` kprobe. I used ftrace on a simple curl command to see what other options could be used (`sudo trace-cmd record -F -p function_graph http://google.com`). The `socket` syscall calls `sock_alloc`, which would be another mechanism for accessing the `struct sock`. I decided against this approach because I don't think its viable to assume that the same thread/process that calls `socket` will be the one that does the later syscalls (how our BPF maps are set up). It's common to have a forking web server model, which means a different process/thread can call `socket` than the ones that later read/write to it. **Probe stability** These probes appear to be stable from our oldest and newest supported kernel. These functions exist in the [tcp_prot](https://elixir.bootlin.com/linux/v4.14.336/source/net/ipv4/tcp_ipv4.c#L2422), [tcpv6_prot](https://elixir.bootlin.com/linux/v4.14.336/source/net/ipv6/tcp_ipv6.c#L1941) structs and I've seen that other projects and bcc tools use these probes. This makes me believe that these functions have a pretty well defined interface. Relevant Issues: #1829, #1638 Type of change: /kind feature Test Plan: New tests verify that ipv4 and ipv6 cases work - [x] Ran `for i in $(seq 0 1000); do curl http://google.com/$i; sleep 2; done` within a pod and verified that `local_addr` is populated with this change and `px.pod_id_to_pod_name(px.ip_to_pod_id(df.local_addr))` works for pod name resolution. - [x] Verified the above curl test results in traces without `local_addr` without this change ![local-addr-testing](https://github.com/user-attachments/assets/344be022-97a0-4096-8af7-8de20d741e40) - Tested on the following k8s offerings and machine images - [x] GKE COS and Ubuntu - [x] EKS Amazon Linux 2 Changelog Message: Populate socket tracer data table `local_addr` and `local_port` column for client side traces. --------- Signed-off-by: Dom Del Nano <ddelnano@gmail.com>

…1989) Summary: Populate client side trace's local address via tcp kprobes This change populates client side trace's `local_addr` and `local_port` columns for the following use cases: 1. To provide more consistency for the protocol data tables. Having columns that are empty make it difficult for end users to understand what is being traced and make them less useful 2. To facilitate addressing a portion of the short lived process problems (pixie-io#1638) For 2, the root of the issue is that `df.ctx["pod"]` syntax relies on the [px.upid_to_pod_name](https://docs.px.dev/reference/pxl/udf/upid_to_pod_name/) function. If a PEM misses the short lived process during its metadata update, this function fails to resolve the pod name. For client side traces where the pod is making an outbound connection (non localhost), the `local_addr` column provides an alternative pod name lookup for short lived processes when the pod is long lived. This means the following would be equivalent to the `df.ctx["pod"]` lookup: `px.pod_id_to_pod_name(px.ip_to_pod_id(df.local_addr))`. I intend to follow this PR with a compiler change that will make `df.ctx["pod"]` try both methods should `px.upid_to_pod_name` fail to resolve. This will allow the existing pxl scripts to display the previously missed short lived processes. **Alternatives** Another approach I considered was expanding our use of the `sock_alloc` kprobe. I used ftrace on a simple curl command to see what other options could be used (`sudo trace-cmd record -F -p function_graph http://google.com`). The `socket` syscall calls `sock_alloc`, which would be another mechanism for accessing the `struct sock`. I decided against this approach because I don't think its viable to assume that the same thread/process that calls `socket` will be the one that does the later syscalls (how our BPF maps are set up). It's common to have a forking web server model, which means a different process/thread can call `socket` than the ones that later read/write to it. **Probe stability** These probes appear to be stable from our oldest and newest supported kernel. These functions exist in the [tcp_prot](https://elixir.bootlin.com/linux/v4.14.336/source/net/ipv4/tcp_ipv4.c#L2422), [tcpv6_prot](https://elixir.bootlin.com/linux/v4.14.336/source/net/ipv6/tcp_ipv6.c#L1941) structs and I've seen that other projects and bcc tools use these probes. This makes me believe that these functions have a pretty well defined interface. Relevant Issues: pixie-io#1829, pixie-io#1638 Type of change: /kind feature Test Plan: New tests verify that ipv4 and ipv6 cases work - [x] Ran `for i in $(seq 0 1000); do curl http://google.com/$i; sleep 2; done` within a pod and verified that `local_addr` is populated with this change and `px.pod_id_to_pod_name(px.ip_to_pod_id(df.local_addr))` works for pod name resolution. - [x] Verified the above curl test results in traces without `local_addr` without this change ![local-addr-testing](https://github.com/user-attachments/assets/344be022-97a0-4096-8af7-8de20d741e40) - Tested on the following k8s offerings and machine images - [x] GKE COS and Ubuntu - [x] EKS Amazon Linux 2 Changelog Message: Populate socket tracer data table `local_addr` and `local_port` column for client side traces. --------- Signed-off-by: Dom Del Nano <ddelnano@gmail.com> GitOrigin-RevId: 0c1fdd2

oazizi000 mentioned this issue Jan 24, 2024

[Add Local Addr & Port 1/3] Capture local addr from socket for accept syscalls #1808

Merged

ddelnano mentioned this issue Aug 23, 2024

Populate client side trace's local address via tcp kprobes #1989

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Populate local IP address from BPF space #1829

Populate local IP address from BPF space #1829

oazizi000 commented Jan 24, 2024

benkilimnik commented Jan 25, 2024 •

edited

Loading

Populate local IP address from BPF space #1829

Populate local IP address from BPF space #1829

Comments

oazizi000 commented Jan 24, 2024

benkilimnik commented Jan 25, 2024 • edited Loading

benkilimnik commented Jan 25, 2024 •

edited

Loading