Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for sampling #1

Closed
wants to merge 8 commits into from
Closed

Conversation

bobbobbio
Copy link

I've been working on some Rust code of my own to create cpu profiles using the linux perf-event sampling API. I saw that you already had a perf-event project going. I tried my hand at adding in some sampling code. Hopefully the PR is welcome. Let me know what needs to be fixed.

Remi Bernotavicius added 3 commits January 11, 2020 19:42
This patch only supports the PERF_RECORD_SAMPLE events, and is missing
support for certain things contained within the sample.

It is also missing support for samples other than PERF_RECORD_SAMPLE.
@bobbobbio bobbobbio requested a review from jimblandy January 30, 2020 04:29
@jimblandy
Copy link
Owner

Oh, this looks really interesting. I am in Germany this week for a Mozilla meeting, but I'll look this over early next week.

Copy link
Owner

@jimblandy jimblandy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for this! I ran out of time to finish the review; here's what I've got so far. More tomorrow.

src/lib.rs Outdated Show resolved Hide resolved
src/sample.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
@bobbobbio bobbobbio requested a review from jimblandy April 20, 2020 04:01
@jimblandy
Copy link
Owner

Fantastic. I am in a crunch right now with work and book, but I will try to look this over as soon as I can.

Comment on lines +980 to +987
mmap(
std::ptr::null_mut(),
SAMPLE_BUFFER_PAGES * page_size(),
PROT_READ | PROT_WRITE,
MAP_SHARED,
file.as_raw_fd(),
0,
) as isize
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've personally used the memmap crate for convenience, I wonder how much sense that would make here?

Comment on lines +944 to +948
/// head in the data section
data_head: AtomicU64,

/// user-space written tail
data_tail: AtomicU64,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if casting *const u64 pointers to &AtomicU64 (or even having a PerfEventDataBuffer struct with the data_* fields) would be preferable to copying the struct.

Or maybe there's a way to have bindgen use different types for some fields in the -sys crate?

(I'm interested in this because it's adjacent to the rdpmc usage, although all examples I've seen for that just compare the value of lock before/after rdpmc, and use barriers, without any otherwise-synchronized loads)

@jimblandy
Copy link
Owner

@eddyb If you're interested, I would love help getting this reviewed and merged. I'm overwhelmed at the moment, but if you need this and can give the API and code a look-through, I'd be happy to add you to the repo as a collaborator.

@eddyb
Copy link

eddyb commented Jul 9, 2020

@jimblandy Thanks for the offer! Although I've been overwhelmed as well, it just happens that I'm currently experimenting with fine-grained userspace instruction-counting for measureme (which powers rustc -Z self-profile), using rdpmc.

Thanks to perf-event-open-sys, I don't need too much to get started, and if I can make a nice API for it I'll try to upstream whichever design I end up with (I want to find the lowest-overhead method I can, as some of the profiled functions are tiny).

@jimblandy
Copy link
Owner

Thanks to perf-event-open-sys, I don't need too much to get started, and if I can make a nice API for it I'll try to upstream whichever design I end up with (I want to find the lowest-overhead method I can, as some of the profiled functions are tiny).

I was hoping this crate's API would be nice! :D

The enable and disable calls should compile to nothing more than an ioctl and some error checking, which seems like the best you can do with the perf_event_open API. It uses File internally, but an unbuffered File is nothing more than the file descriptor with the Drop impl you'll need anyway. The only way to streamline things that I can see would be to have Builder construct the perf_event_attr structure as it goes along.

In any case, I'll look forward to what you come up with!

@eddyb
Copy link

eddyb commented Jul 17, 2020

@jimblandy To expand a bit: I plan to use rdpmc to avoid incurring the cost of a syscall, and the remaining costs I want to golf have to do with the few memory reads from the perf mmap page, around the rdpmc instruction (search for "rdpmc" in the perf_event_open manpage for an example).

Oh and I guess because it's inline assembly, this low-overhead approach would be nightly-only for a while.

@blt
Copy link

blt commented Dec 7, 2021

@jimblandy this is something I'd be very interested in helping getting merged, if you'd be interested in having it resurrected.

@jimblandy
Copy link
Owner

Thanks! My first concern is that I haven't actually tried out the proposed API yet. Have you used this patch in your own work? What's it like?

@blt
Copy link

blt commented Dec 7, 2021

Oh yeah, sure thing. For context I'm in-progress on building a causal profiler for vector so that's where my interest in samples comes from. Here's the example program I built to satisfy myself that the patch was at least nominally functional, after having read it over:

use perf_event::events::Hardware;
use perf_event::sample::{PerfRecord, PerfSampleType};
use perf_event::Builder;
use std::num::NonZeroU64;
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};

fn collatz(n: NonZeroU64) -> u64 {
    let mut cur = n.get();
    let mut iters = 0;
    while cur != 1 {
        iters += 1;
        if cur % 2 == 0 {
            cur = cur / 2;
        } else {
            cur = (3 * cur) + 1;
        }
    }
    iters
}

fn main() -> std::io::Result<()> {
    let sampler = Builder::new()
        .kind(Hardware::CPU_CYCLES)
        .sample_frequency(1000)
        .sample(PerfSampleType::ADDR)
        .sample(PerfSampleType::CALLCHAIN)
        .sample(PerfSampleType::CPU)
        .sample(PerfSampleType::IP)
        .sample(PerfSampleType::PERIOD)
        .sample(PerfSampleType::STREAM_ID)
        .sample(PerfSampleType::TID)
        .sample(PerfSampleType::TIME)
        .sample_stream()?;

    sampler.enable()?;

    let do_sample = Arc::new(AtomicBool::new(true));

    // This is the area of the program that is doing the sampling. It runs until
    // signaled to stop from the main thread. The sampler is dropped to disable
    // it.
    let sampler_do_sample = Arc::clone(&do_sample);
    let handle = std::thread::spawn(move || {
        let do_sample = sampler_do_sample;
        while do_sample.load(Ordering::Relaxed) {
            if let Some(PerfRecord::Sample(sample)) = sampler.read(None).unwrap() {
                println!("{:?}", sample);
            } else {
                panic!();
            }
        }
        drop(sampler);
    });

    // This is the area of the program that is being sampled.
    for _ in 0..100_000 {
        for i in 1..100 {
            collatz(NonZeroU64::new(i).unwrap());
        }
    }
    do_sample.store(false, Ordering::Relaxed);

    handle.join().unwrap();

    Ok(())
}

I'm in-progress on reconstructing stack traces from samples but what I have is too much of a mess to be convincing one way or another. I'm happy to keep using this patch and report back on odd problems I hit, more elaborate success, if you'd like.

About the only awkward thing with the API as proposed is the lack of an explicit disable on the streamer. It flips off when dropped. Otherwise, absent any bugs I find in the next few days, it works well.

@bobbobbio
Copy link
Author

I'd be happy to make changes to this to get it merged.

I was interested in implementing some Rust on Linux tailored CPU profiling tools and was hoping to help create safe bindings for everything Linux offers, but I mostly just gave up waiting for someone else to be interested enough to help get stuff merged here or whatever

@blt
Copy link

blt commented Dec 7, 2021

I'd be happy to make changes to this to get it merged.

Rad, thank you!

I was interested in implementing some Rust on Linux tailored CPU profiling tools and was hoping to help create safe bindings for everything Linux offers, but I mostly just gave up waiting for someone else to be interested enough to help get stuff merged here or whatever

Totally makes sense. It's an... elaborate programming target. FWIW the current implementation works for my purposes and I'd be glad to see it landed, fiddle with the performance notes in the PR as my work progresses.

Comment on lines +135 to +142
/// This record is a sample of the current state of some process.
///
/// Corresponds to the anonymous struct under `PERF_RECORD_SAMPLE` in [`perf_event_open`][man] man
/// page.
///
/// [man]: http://man7.org/linux/man-pages/man2/perf_event_open.2.html
#[derive(Debug, Default, PartialEq, Eq)]
pub struct PerfRecordSample {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm currently working on a linux-perf-event-reader crate which has some complete code for parsing these records. It also supports "wrap-around" reading without copies, via RawData::Split.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's great!

One of the reasons I haven't merged this PR is that I am concerned about testability. I don't want to have long sections of detailed, technical code that I have no way to ensure coverage for. But this weekend I finally got around to adding an interface for mocking the underlying system calls, so we can write nice deterministic tests that cover all the edge cases. So hopefully we'll be able to make some progress here.

@jimblandy jimblandy force-pushed the master branch 2 times, most recently from a72f0dd to 9e26171 Compare July 3, 2022 06:18
@janaknat janaknat mentioned this pull request Nov 3, 2022
@Phantomical
Copy link
Contributor

@jimblandy @bobbobbio It doesn't look like there's any action here so is it alright if I revive this PR and try and bring it to a workable state?

@jimblandy
Copy link
Owner

Sure! We're looking for an API that is both safe, and close enough to what's described in the open_perf_event man page that we don't have to redocument the entire thing for it to be usable.

The thing I was most worried about here was that it would add a lot of complexity - and I want that to be covered by tests. The "hooks" feature was supposed to give me a way to write tests that could supply arbitrary data and force the use of every code path, without having to get the real kernel involved - because there's no way to ensure that's going to give you exactly the data you need to properly test the code.

But, nice clean code is always welcome, even if going that far with testing isn't your cup of tea. So yes, it'd be great to have something here.

@jimblandy
Copy link
Owner

This work has been taken up in some newer PRs. Thanks very much to @bobbobbio for this initial work!

@jimblandy jimblandy closed this Nov 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants