Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Support for TCX Programs #1222

Merged
merged 2 commits into from
Oct 21, 2024
Merged

Add Support for TCX Programs #1222

merged 2 commits into from
Oct 21, 2024

Conversation

anfredette
Copy link
Collaborator

@anfredette anfredette commented Aug 12, 2024

This PR adds support for TCX programs using priority-based API for
TCX modeled after the existing TC API.

Both the Linux kernel and Aya use a relative positioning approach for
ordering TCX programs that requires new TCX programs to be positioned
relative to an existing TCX program. To improve the user experience
across multiple independent applications, we’ve retained a TC-like
priority API in bpfman, which internally maps to Aya’s relative
ordering API.

Additional Changes:

  • A TCX integration test is included.
  • A go-tcx-counter example was added, and the go-app-counter example has been
    updated to include a TCX program.
  • GitHub Actions are now configured to use ubuntu-24.04 to meet the kernel
    version requirement (6.6 or higher) for TCX.

Fixes: #1215

@anfredette anfredette changed the title bpfman TCX support DRAFT bpfman TCX support Aug 12, 2024
@anfredette anfredette changed the title DRAFT bpfman TCX support Draft bpfman TCX support Aug 12, 2024
@anfredette anfredette force-pushed the tcx-support branch 3 times, most recently from 81c8ff9 to ef40d84 Compare August 12, 2024 20:21
Copy link

codecov bot commented Aug 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 8.57%. Comparing base (0b69a8d) to head (82a6937).
Report is 6 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff            @@
##            main   #1222      +/-   ##
========================================
- Coverage   8.94%   8.57%   -0.37%     
========================================
  Files         24      24              
  Lines       6027    6285     +258     
========================================
  Hits         539     539              
- Misses      5488    5746     +258     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@anfredette anfredette force-pushed the tcx-support branch 2 times, most recently from ebc8772 to f093d59 Compare August 13, 2024 13:46
@anfredette anfredette force-pushed the tcx-support branch 10 times, most recently from a058491 to 6d03cd7 Compare August 26, 2024 22:10
@anfredette anfredette force-pushed the tcx-support branch 3 times, most recently from e254539 to 247f0f3 Compare August 28, 2024 17:29
@anfredette anfredette changed the title Draft bpfman TCX support Add Support for TCX Programs Aug 28, 2024
Copy link
Contributor

mergify bot commented Aug 30, 2024

@anfredette, this pull request is now in conflict and requires a rebase.

@mergify mergify bot added the needs-rebase label Oct 9, 2024
@mergify mergify bot removed the needs-rebase label Oct 11, 2024
@anfredette anfredette marked this pull request as ready for review October 11, 2024 14:44
@anfredette anfredette requested a review from a team as a code owner October 11, 2024 14:44
@anfredette
Copy link
Collaborator Author

This TCX pr is ready for review.

Info::TcxAttachInfo(TcxAttachInfo {
priority,
iface,
position: _,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u drop position if its not in use ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

position is set later when the program is attached.

#[prost(string, tag = "2")]
pub iface: ::prost::alloc::string::String,
#[prost(int32, tag = "3")]
pub position: i32,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

position is set later when the program is attached.

links::FdLink,
loaded_programs,
tc::{SchedClassifierLink, SchedClassifierLinkId, TcAttachOptions},
trace_point::TracePointLink,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is tcx in this ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A tcx and tc programs are considered to be the same type by Aya and the kernel. It's the attach type that's different.

@@ -737,6 +762,105 @@ fn filter(
})
}

fn get_txc_programs(root_db: &Db, if_index: u32, direction: Direction) -> Vec<TcxProgram> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo ? should be tcx ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

let id = bytes_to_string(&p)
.split('_')
.last()
.unwrap()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to remove the unwrap() calls. This pattern is in this file a few times. Move it to a separate function and re-code it without the unwrap() calls.

Copy link
Collaborator Author

@anfredette anfredette Oct 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a helper function to get the id and fixed this one instance. However, it requires returning a Result in this function and handling it several other places. If people are okay with the way I did it, then I can modify the existing code similarly.

if order.is_none() {
np.set_current_position(tcx_programs.len())?;
order = Some(AttachOrder::After(
tcx_programs.last().unwrap().get_data().get_id()?,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this unwrap() fail? Would prefer not to have it if possible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this one cannot fail because I've already checked tcx_programs.is_empty() at the top, and if it was it would have already returned. However, I plan to split this function into two separate functions to handle the cases with and without a new_program. I'd also like to see if I can make this whole function cleaner. I'll handle the unwrap() when I do that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-worked this function and updated the last commit.

let owned_link: SchedClassifierLink = tcx.take_link(link_id)?;
let fd_link: FdLink = owned_link
.try_into()
.expect("unable to get owned tcx attach link");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.expect() calls are scattered throughout the code, but if they are hit, they are a panic. I would like to avoid adding any new ones and slowly remove all the existing ones in the code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


pub fn get_direction(&self) -> Result<Direction, BpfmanError> {
sled_get(&self.data.db_tree, TCX_DIRECTION)
.map(|v| bytes_to_string(&v).to_string().try_into().unwrap())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid the unwrap().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// result, we use the following hack to figure out which one it
// really is.
ProgramType::Tc => {
if data.db_tree.get(TCX_IFACE).unwrap().is_some() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid the unwrap().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

RUN CGO_ENABLED=0 GOOS=${TARGETOS:-linux} GOARCH=${TARGETARCH} go build


FROM --platform=$TARGETPLATFORM registry.fedoraproject.org/fedora-minimal:latest
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
FROM --platform=$TARGETPLATFORM registry.fedoraproject.org/fedora-minimal:latest
FROM registry.fedoraproject.org/fedora-minimal:latest

Removed in all the files in the multi-arch PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

ARG BUILDPLATFORM

# The following ARGs are set internally by docker or podman on multiarch builds
ARG TARGETPLATFORM=linux/amd64
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ARG TARGETPLATFORM=linux/amd64
ARG TARGETARCH
ARG TARGETOS
ARG TARGETPLATFORM
RUN echo "TARGETOS=${TARGETOS} TARGETARCH=${TARGETARCH} BUILDPLATFORM=${BUILDPLATFORM} TARGETPLATFORM=${TARGETPLATFORM}"

Similar change made in all the files in the multi-arch PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


FROM --platform=$TARGETPLATFORM registry.fedoraproject.org/fedora-minimal:latest

ARG TARGETPLATFORM
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ARG TARGETPLATFORM

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

.unwrap()
.parse::<u32>()
.unwrap();
let tree = root_db.open_tree(p).expect("unable to open database tree");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.expect() calls are scattered throughout the code, but if they are hit, they are a panic. I would like to avoid adding any new ones and slowly remove all the existing ones in the code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -215,6 +224,10 @@ pub async fn add_program(mut program: Program) -> Result<Program, BpfmanError> {

add_multi_attach_program(root_db, &mut program, &mut image_manager, config).await
}
Program::Tcx(_) => {
program.set_if_index(get_ifindex(&program.if_name().unwrap())?)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this unwrap() fail? Would prefer not to have it if possible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

mergify bot commented Oct 16, 2024

@anfredette, this pull request is now in conflict and requires a rebase.

bpfman/src/utils.rs Fixed Show fixed Hide fixed
@anfredette anfredette force-pushed the tcx-support branch 2 times, most recently from 83c36cb to bfa64cf Compare October 21, 2024 12:36
Copy link
Contributor

mergify bot commented Oct 21, 2024

@anfredette, this pull request is now in conflict and requires a rebase.

@mergify mergify bot added the needs-rebase label Oct 21, 2024
This PR adds support for TCX programs using priority-based API for
TCX modeled after the existing TC API.

Both the Linux kernel and Aya use a relative positioning approach for
ordering TCX programs that requires new TCX programs to be positioned
relative to an existing TCX program. To improve the user experience
across multiple independent applications, we’ve retained a TC-like
priority API in bpfman, which internally maps to Aya’s relative
ordering API.

Additional Changes:

- A TCX integration test is included.
- A go-tcx-counter example was added, and the go-app-counter example has been
  updated to include a TCX program.
- GitHub Actions are now configured to use ubuntu-24.04 to meet the kernel
  version requirement (6.6 or higher) for TCX.

Signed-off-by: Andre Fredette <afredette@redhat.com>
Also split up and reworked the update_tcx_program_positions()
function.

Signed-off-by: Andre Fredette <afredette@redhat.com>
@msherif1234
Copy link
Contributor

/LGTM

Copy link
Collaborator

@Billy99 Billy99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Before I approve, quick question. How does bpfman handle it if TCX programs are already installed on the system outside of bpfman? For TC and XDP, it doesn't, but since the multi-prog is handled by the kernel, what is the expect behavior?

@anfredette
Copy link
Collaborator Author

Both bpfman and other applications can install tcx programs. We don't currently pay attention to TCX programs installed by other means. The first bpfman TCX program on a given attach point is placed in first position regardless of whether there are already other programs attached. Subsequent bpfman TCX programs on the same attach point are positioned relative to existing bpfman programs based on priority without regard to any other TCX programs that may have been attached by other means. We've talked about some options for cooperating better with non-bpfman loaders, but this is the first step.

@mergify mergify bot merged commit 6c1a8e6 into bpfman:main Oct 21, 2024
56 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support TCX in bpfman
3 participants