Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(profiling): rework internal types #214

Merged
merged 20 commits into from
Aug 16, 2023
Merged

Conversation

morrisonlevi
Copy link
Contributor

@morrisonlevi morrisonlevi commented Aug 12, 2023

What does this PR do?

This continues pulling out things from profiling/src/profiling/mod.rs and puts them into profiling/src/profile/internal/*. Additionally, it introduces traits for a few concepts that were already there, just unnamed (and not tidy):

  • Id, e.g. FunctionId, MappingId, LocationId, StackTraceId, etc.
  • Item, e.g. Function, Mapping, Location, StackTrace, etc.
  • PprofItem, e.g. Function, Mapping, Location, etc but not StackTrace.

The DedupExt trait was simplified, which is partly possible because I understand Rust better now, but also because of the new traits. It enforces the max-size and will panic if the containers get too big. Previously, only mappings checked for being full, which is a bit ironic because it's the least likely in practice to overflow.

Hopefully this explains why it's such a large diff: all these things are connected, although I probably snuck in a change or two that didn't strictly have to be in there (it's hard when it's this big to notice).

Motivation

The goals are to increase cleanliness and separation between the different representations of profiles. We now have api::*, internal::*, and pprof::*, which are fairly contained and separated. There should probably be more cleaning up of the internal::* parts, but this PR is big enough already.

Additional Notes

I would start the review by looking at profiling/src/profile/internal/mod.rs, and reading the 3 trait definitions first.

How to test the change?

Since this targets the internal representation, existing tests ought to be good enough.

danielsn and others added 6 commits August 11, 2023 15:45
This continues pulling out things from src/profiling/mod.rs and puts
them into `internal::*`. Additionally, it introduces traits for a few
concepts that were already there, just unnamed (and not tidy):

- Id, e.g. FunctionId, MappingId, LocationId, StackTraceId, etc.
- Item, e.g. Function, Mapping, Location, StackTrace, etc.
- PprofItem, e.g. Function, Mapping, Location, etc but not StackTrace.

The DedupExt trait was simplified, which is partly possible because I
understand Rust better now, but also because of the new traits. It
enforces the max-size and will panic if the containers get too big.
Previously, only mappings checked for being full, which is a bit
ironic because it's the least likely in practice to overflow.

Hopefully this explains why it's such a large diff: all these things
are connected, although I probably snuck in a change or two that didn't
strictly have to be in there (it's hard when it's this big to notice).
@github-actions github-actions bot added the profiling Relates to the profiling* modules. label Aug 12, 2023
@morrisonlevi morrisonlevi changed the base branch from dsn/smaller_structs to main August 12, 2023 22:14
@morrisonlevi morrisonlevi marked this pull request as ready for review August 12, 2023 22:16
@morrisonlevi morrisonlevi requested a review from a team as a code owner August 12, 2023 22:16
/// Address at which the binary (or DLL) is loaded into memory.
pub memory_start: u64,
/// The limit of the address range occupied by this mapping.
pub memory_limit: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it stands now, we're stuck with this; but if we change from a [start, end) formulation to a [start, start + size) formulation, we could realistically squeeze the size parameter into a u32. These mappings are actually ELF sections (otherwise including both start + file offset is kinda weird), and they probably need to be executable unless we're providing location information for non-code resources (why lol), soooo I personally think u32 would be reasonable there.

Obviously this would be inappropriate for this PR, just making a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about needing to be executable makes it reasonable to expect it to fit in a u32?

/// The limit of the address range occupied by this mapping.
pub memory_limit: u64,
/// Offset in the binary that corresponds to the first mapped address.
pub file_offset: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realistically, I think this can be bound to a u32. This combined with my other comment would shave a byte off of this, but it would require a refactor at serialization.

profiling/src/profile/internal/function.rs Outdated Show resolved Hide resolved
profiling/src/profile/internal/line.rs Show resolved Hide resolved
profiling/src/profile/internal/mod.rs Show resolved Hide resolved
pub struct StackTrace {
/// The ids recorded here correspond to a Profile.location.id.
/// The leaf is at location_id[0].
pub locations: Vec<LocationId>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be Box<[Location]> and save some bytes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's save this kind of thing for another PR, this one is big enough! But good idea.

profiling/src/profile/internal/stack_trace.rs Outdated Show resolved Hide resolved

#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub struct ValueType {
pub r#type: StringId,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're making our own struct, should we call this typ instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I do prefer type personally :P )

profiling/src/profile/mod.rs Show resolved Hide resolved
impl<T: Sized + Hash + Eq> DedupExt<T> for FxIndexSet<T> {
fn dedup(&mut self, item: T) -> usize {
impl<T: Item> Dedup<T> for FxIndexSet<T> {
fn dedup(&mut self, item: T) -> <T as Item>::Id {
let (id, _) = self.insert_full(item);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the same assertion as intern does

Copy link
Contributor Author

@morrisonlevi morrisonlevi Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it makes sense to in this case: if it already exists, that's fine. The string case is different because it can't do a simple insert_full because it has to deal with &str vs String differences.

But, it did make me realize we can move that assertion in intern to be a debug_assert.

profiling/src/profile/mod.rs Show resolved Hide resolved
profiling/src/profile/mod.rs Outdated Show resolved Hide resolved
@morrisonlevi morrisonlevi force-pushed the levi/smaller-structs branch 2 times, most recently from a068c5a to 81cbfb8 Compare August 15, 2023 10:22
danielsn
danielsn previously approved these changes Aug 15, 2023
Copy link
Contributor

@danielsn danielsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple minor ⛏️ and comments but LGTM

profiling/src/profile/internal/location.rs Outdated Show resolved Hide resolved
profiling/src/profile/internal/mod.rs Show resolved Hide resolved
@@ -686,20 +487,22 @@ impl Profile {
let mut labels: Vec<Label> = Vec::with_capacity(sample.labels.len());
let mut local_root_span_id_label_offset: Option<usize> = None;
for label in sample.labels.iter() {
anyhow::ensure!(
label.str.is_none() || label.num == 0,
"Invalid label: {:?}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to explain in the message why the label is invalid?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 yes please! I left some suggestions for this on PR #205

Copy link
Contributor Author

@morrisonlevi morrisonlevi Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can combine both of the ensures into a single one:

anyhow::ensure!(
    label.str.is_none() || (label.num == 0 && label.num_unit.is_none()),
    "Invalid label: used both str and num fields: {label:?}"
);

Reasoning:

If  `label.str` is none, then
    it doesn't matter what values are held in `label.num` and `label.num_unit`
else
    then both `label.num` and `label.num_unit` should be zero-representations

Right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some tests to double-check this ^_^ They seem equivalent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me! The negations tripped me a bit so I had to pause a bit to mentally parse it, but the logic looks very correct!

profiling/src/profile/mod.rs Show resolved Hide resolved
ivoanjo
ivoanjo previously approved these changes Aug 15, 2023
Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few notes, but overall LGTM 👍

profiling/src/profile/internal/function.rs Show resolved Hide resolved
Str(StringId),
Num {
num: i64,
num_unit: Option<StringId>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PHP seems to be the only profiler setting num_unit (#204). Do you think it's still useful? Otherwise it could be one more thing to get rid of :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be in favour of removing it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to, we could give an API allowing clients to specify a key, num_unit mapping, and then insert the num_unit to the pprof based on that mapping. Keeps the feature, but saves memory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave it to another PR so we can get this merged.

profiling/src/profile/internal/line.rs Show resolved Hide resolved

#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub struct ValueType {
pub r#type: StringId,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I do prefer type personally :P )

profiling/src/profile/mod.rs Show resolved Hide resolved
@@ -686,20 +487,22 @@ impl Profile {
let mut labels: Vec<Label> = Vec::with_capacity(sample.labels.len());
let mut local_root_span_id_label_offset: Option<usize> = None;
for label in sample.labels.iter() {
anyhow::ensure!(
label.str.is_none() || label.num == 0,
"Invalid label: {:?}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 yes please! I left some suggestions for this on PR #205

Co-authored-by: Daniel Schwartz-Narbonne <danielsn@users.noreply.github.com>
@morrisonlevi morrisonlevi dismissed stale reviews from ivoanjo and danielsn via c957808 August 15, 2023 22:00
Copy link
Contributor

@danielsn danielsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

@morrisonlevi morrisonlevi merged commit e65f300 into main Aug 16, 2023
@morrisonlevi morrisonlevi deleted the levi/smaller-structs branch August 16, 2023 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
profiling Relates to the profiling* modules.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants