Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add minimal support for sampling #22

Closed
wants to merge 54 commits into from

Conversation

Phantomical
Copy link
Contributor

@Phantomical Phantomical commented Nov 9, 2022

This PR introduces an absolutely minimal skeleton of an API for reading sampled perf events.

Highlights

  • Adds a new Sampler type which is a counter that can also read Records from the kernel ring buffer.
  • Adds support for PERF_EVENT_MMAP records and nothing else.
  • A bunch of enum bindings and parsing code needed to get the whole thing off of the ground.

This is just enough to write a simple mmap end-to-end test to check that the whole thing is working. The rest of the features can be added backwards-compatibly in follow-up PRs. See the non-goals section for stuff that I have excluded from this PR.

API Summary

This is just a subset of the API prototypes but since this PR is so large I felt that it would be helpful to have it summarized.

API Prototypes
struct Sampler { .. };
impl Sampler {
  fn next(&mut self) -> Option<samples::Record>;
  fn next_blocking(&mut self, timeout: Option<Duration>) -> Option<samples::Record>;
}

impl Builder {
  // Note that this is additive
  fn sample(self, sample: Sample) -> Self;
  fn mmap2(self, bool) -> Self;

  fn build_sampler(self, buffer_len: usize) -> Sampler;
}

struct Sample {
   ... bitflags go here
}

mod samples {
  struct Record {
    pub ty: RecordType,
    pub misc: RecordMiscFlags,
    pub event: RecordEvent,
    pub sample_id: SampleId,
  }

  enum RecordEvent {
    Mmap(Mmap),
    .. new variants go here
    Unknown(Vec<u8>),
  }

  .. more binding types
}

How the API Works

  • Construct a Builder like normal, configure it, and call Builder::build_sampler to create a Sampler.
  • Call Sampler::next or Sampler::next_blocking to read the next sample::Record out of the ring buffer.
  • Use the Record for whatever profiling thing you wanted to do.

Future Compatibility

This is important part of any binding to perf_event_open. Here's how this API can be extended in a semver compatible manner.

  • Adding new record types: The RecordEvent enum is #[non_exhaustive]. When a new record type is added to the perf API then we can add support without having to break backwards compatibility. In the meantime, users can use RecordEvent::Unknown to handle records that perf-event does not support.
  • Adding new common fields to Record: I don't think this one will happen but nevertheless Record is also #[non_exhaustive]
  • Adding new fields to a record type: Not relevant to this PR since I haven't actually added support for those records either. However, this can be addressed by #[non_exhaustive] as well.
  • Supporting new values for kernel flags: In order to remain forward compatible, none of the enums used in the bindings are rust enums. Instead, they are newtypes around an integer with associated constants. That way if the kernel adds a new enum variant in the future it is still representable.

Non-goals for this specific PR

I'm trying to keep this PR as small as possible because it is already too big. Except for the last one, all of these can be added in follow-up PRs.

  • Support for any record types besides PERF_RECORD_MMAP.
  • Support for all the configuration options available within perf_event_attr.
  • Support for working with the aux buffer.
  • Accessors for reading some of the extra information available within the memory map.
  • Support for using the record bindings and parsing code in any context outside of reading events from a kernel ringbuffer. I think the requirements to do this well are too different from what perf-event needs.

Supersedes #1

@Phantomical Phantomical marked this pull request as ready for review November 10, 2022 06:58
@Phantomical
Copy link
Contributor Author

@jimblandy I think this is ready for review. It shouldn't be merged until a new version of bytes is available with my upstream PR but that doesn't affect the code.

@jimblandy
Copy link
Owner

@Phantomical This is very exciting. I will have time to review this on Friday.

@jimblandy
Copy link
Owner

Support for using the record bindings and parsing code in any context outside of reading events from a kernel ringbuffer. I think the requirements to do this well are too different from what perf-event needs.

I don't agree - but I think it's fine to proceed as you suggest anyway.

I do think we're going to need to be able to process data from files eventually. If we don't, then we basically force our users who also want to consume perf files to either use two separate crates, or switch entirely to some other crate that does support both use cases.

But I don't see the work here as likely to cut off future possibilities, in practice. I'm pretty sure we're going to end up with a lot of code that doesn't care where the data came from, which can serve both live and recorded sources. I think we can just proceed with the work here and think about files later.

@jimblandy
Copy link
Owner

All new features are behind an unstable feature gate - I can remove this if you think it's not worth it.

I think we can just leave out the feature. We'll be bumping the major version anyway.

Copy link
Owner

@jimblandy jimblandy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to get into the substance of this today, but here are a few minor comments I have so far:

perf-event/src/lib.rs Outdated Show resolved Hide resolved
Comment on lines 1225 to 1121
impl std::ops::Deref for Sampler {
type Target = Counter;

fn deref(&self) -> &Self::Target {
&self.counter
}
}

impl std::ops::DerefMut for Sampler {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.counter
}
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way of making Sampler support the operations of Counter is considered an anti-pattern in Rust:
https://rust-unofficial.github.io/patterns/anti_patterns/deref.html

Possible alternatives:

  • Add a counter method to Sampler, and let people call that to get the underlying Counter.
  • Create a trait with the methods, and have both Sampler and Counter implement that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, both of those alternatives aren't great :(

As a third alternative:

  • Copy the methods from Counter to Sampler and have them delegate to the real implementations in Counter

I think that either keeping the Deref impls (despite it being an antipattern) or copying the methods is the best choice. Ultimately, though, the choice is yours and I'll implement whichever one you prefer.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking this over: Some of the methods only make sense for counted events. Not all sampled events are counted events. But there are counters that record their overflows to the mmap area, so certainly Sampler needs the counter-related methods. So I guess there's nothing that can be omitted.

Let's just copy the methods for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the deref impls and used a macro to add the methods to both Counter and Sampler.

perf-event/src/lib.rs Outdated Show resolved Hide resolved
perf-event/src/lib.rs Outdated Show resolved Hide resolved
perf-event/src/lib.rs Outdated Show resolved Hide resolved
perf-event/Cargo.toml Outdated Show resolved Hide resolved
perf-event/src/lib.rs Show resolved Hide resolved
@janaknat
Copy link
Contributor

A small println type example would help in understanding the API usage.

@Phantomical
Copy link
Contributor Author

Phantomical commented Nov 16, 2022

@janaknat I agree. However, what's actually been added in this PR is almost too limited to do a proper example (beyond what is in the tests here). I will add that as one of the (many) follow-up PRs that I'll be making once this one gets merged.

@Phantomical
Copy link
Contributor Author

Phantomical commented Nov 21, 2022

Bytes v1.3.0 has now been released with the get_x_ne methods we depend on 🎉. This means that all the external blockers for this PR have now been resolved.

@jimblandy this is now ready for final review and/or merging, whichever you think is appropriate.

Clippy doesn't like having a method named `next` on a struct when it's
not an implementation of Iterator. This silences the clippy warning by
renaming the method.
The issue was a missing ! in the if guard on config.sample_id_all. This
would cause SampleId to parse fields when _not_ expected to and ignore
everything when it was expected to parse fields.

This commit also includes some test cases to exercise both options.
These are present within perf_event.h but aren't documented in the
manpage yet (not even the one for Linux 6.0).
This is also present within the source headers but undocumented in the
manpage.
@Phantomical
Copy link
Contributor Author

@jimblandy Do you think you'll have time to look at this anytime soon?

@janaknat
Copy link
Contributor

janaknat commented Feb 9, 2023

@Phantomical Is there a branch in your repo which has all the changes for full sampling support that I can fork off of? I'd love to give it a run.

@Phantomical
Copy link
Contributor Author

@janaknat I do! I have everything up in a branch here: https://github.com/Phantomical/perf-event/tree/sampling-work. I haven't touched it in a few months but it is more or less feature complete.

@janaknat
Copy link
Contributor

@Phantomical Thanks. I'll give it a go.

@janaknat
Copy link
Contributor

@Phantomical Is it possible to create a perf report compatible perf.data with the changes in your branch. Ideally, I'd like to use the perf.data that is generated to form flamegraphs through Rust.

Also, is there a source you used to fully understand the perf record functionality of perf_event_open()? I am looking at https://github.com/torvalds/linux/blob/master/tools/perf/Documentation/perf.data-file-format.txt . Curious to know what you are using as reference.

@Phantomical
Copy link
Contributor Author

@janaknat It is technically possible but it might not be the best approach. As far as I know, perf.data is basically a dump of the samples that were read from the perf_event_open ringbuffer. What you probably want is some sort of RawSampler that just gives the byte records without parsing them. That actually sounds pretty useful so I'll try and write it up this weekend and see what I get.

The best documentation on perf_event_open is the manpage. The arch linux one is more recent but is harder to read. The later sections of it cover how reading from the ringbuffer works. For anything not documented in the manpage I looked at the kernel source: everything relevant is defined in the perf-event.h header.

@Phantomical
Copy link
Contributor Author

Closing this in favour of #30 since I believe that to be a better approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants