-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(profiling): rework internal types #214
Conversation
This continues pulling out things from src/profiling/mod.rs and puts them into `internal::*`. Additionally, it introduces traits for a few concepts that were already there, just unnamed (and not tidy): - Id, e.g. FunctionId, MappingId, LocationId, StackTraceId, etc. - Item, e.g. Function, Mapping, Location, StackTrace, etc. - PprofItem, e.g. Function, Mapping, Location, etc but not StackTrace. The DedupExt trait was simplified, which is partly possible because I understand Rust better now, but also because of the new traits. It enforces the max-size and will panic if the containers get too big. Previously, only mappings checked for being full, which is a bit ironic because it's the least likely in practice to overflow. Hopefully this explains why it's such a large diff: all these things are connected, although I probably snuck in a change or two that didn't strictly have to be in there (it's hard when it's this big to notice).
/// Address at which the binary (or DLL) is loaded into memory. | ||
pub memory_start: u64, | ||
/// The limit of the address range occupied by this mapping. | ||
pub memory_limit: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it stands now, we're stuck with this; but if we change from a [start, end)
formulation to a [start, start + size)
formulation, we could realistically squeeze the size
parameter into a u32. These mappings are actually ELF sections (otherwise including both start + file offset is kinda weird), and they probably need to be executable unless we're providing location information for non-code resources (why lol), soooo I personally think u32 would be reasonable there.
Obviously this would be inappropriate for this PR, just making a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about needing to be executable makes it reasonable to expect it to fit in a u32?
/// The limit of the address range occupied by this mapping. | ||
pub memory_limit: u64, | ||
/// Offset in the binary that corresponds to the first mapped address. | ||
pub file_offset: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Realistically, I think this can be bound to a u32. This combined with my other comment would shave a byte off of this, but it would require a refactor at serialization.
pub struct StackTrace { | ||
/// The ids recorded here correspond to a Profile.location.id. | ||
/// The leaf is at location_id[0]. | ||
pub locations: Vec<LocationId>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be Box<[Location]>
and save some bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's save this kind of thing for another PR, this one is big enough! But good idea.
|
||
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)] | ||
pub struct ValueType { | ||
pub r#type: StringId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're making our own struct, should we call this typ
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I do prefer type
personally :P )
impl<T: Sized + Hash + Eq> DedupExt<T> for FxIndexSet<T> { | ||
fn dedup(&mut self, item: T) -> usize { | ||
impl<T: Item> Dedup<T> for FxIndexSet<T> { | ||
fn dedup(&mut self, item: T) -> <T as Item>::Id { | ||
let (id, _) = self.insert_full(item); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the same assertion as intern
does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to in this case: if it already exists, that's fine. The string case is different because it can't do a simple insert_full
because it has to deal with &str
vs String
differences.
But, it did make me realize we can move that assertion in intern
to be a debug_assert
.
a068c5a
to
81cbfb8
Compare
81cbfb8
to
041231e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple minor ⛏️ and comments but LGTM
profiling/src/profile/mod.rs
Outdated
@@ -686,20 +487,22 @@ impl Profile { | |||
let mut labels: Vec<Label> = Vec::with_capacity(sample.labels.len()); | |||
let mut local_root_span_id_label_offset: Option<usize> = None; | |||
for label in sample.labels.iter() { | |||
anyhow::ensure!( | |||
label.str.is_none() || label.num == 0, | |||
"Invalid label: {:?}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to explain in the message why the label is invalid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 yes please! I left some suggestions for this on PR #205
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can combine both of the ensures into a single one:
anyhow::ensure!(
label.str.is_none() || (label.num == 0 && label.num_unit.is_none()),
"Invalid label: used both str and num fields: {label:?}"
);
Reasoning:
If `label.str` is none, then
it doesn't matter what values are held in `label.num` and `label.num_unit`
else
then both `label.num` and `label.num_unit` should be zero-representations
Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some tests to double-check this ^_^ They seem equivalent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good to me! The negations tripped me a bit so I had to pause a bit to mentally parse it, but the logic looks very correct!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few notes, but overall LGTM 👍
Str(StringId), | ||
Num { | ||
num: i64, | ||
num_unit: Option<StringId>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PHP seems to be the only profiler setting num_unit
(#204). Do you think it's still useful? Otherwise it could be one more thing to get rid of :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be in favour of removing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we need to, we could give an API allowing clients to specify a key, num_unit
mapping, and then insert the num_unit
to the pprof based on that mapping. Keeps the feature, but saves memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave it to another PR so we can get this merged.
|
||
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)] | ||
pub struct ValueType { | ||
pub r#type: StringId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I do prefer type
personally :P )
profiling/src/profile/mod.rs
Outdated
@@ -686,20 +487,22 @@ impl Profile { | |||
let mut labels: Vec<Label> = Vec::with_capacity(sample.labels.len()); | |||
let mut local_root_span_id_label_offset: Option<usize> = None; | |||
for label in sample.labels.iter() { | |||
anyhow::ensure!( | |||
label.str.is_none() || label.num == 0, | |||
"Invalid label: {:?}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 yes please! I left some suggestions for this on PR #205
Co-authored-by: Daniel Schwartz-Narbonne <danielsn@users.noreply.github.com>
c5a685d
to
d746266
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢
What does this PR do?
This continues pulling out things from
profiling/src/profiling/mod.rs
and puts them intoprofiling/src/profile/internal/*
. Additionally, it introduces traits for a few concepts that were already there, just unnamed (and not tidy):Id
, e.g. FunctionId, MappingId, LocationId, StackTraceId, etc.Item
, e.g. Function, Mapping, Location, StackTrace, etc.PprofItem
, e.g. Function, Mapping, Location, etc but not StackTrace.The
DedupExt
trait was simplified, which is partly possible because I understand Rust better now, but also because of the new traits. It enforces the max-size and will panic if the containers get too big. Previously, only mappings checked for being full, which is a bit ironic because it's the least likely in practice to overflow.Hopefully this explains why it's such a large diff: all these things are connected, although I probably snuck in a change or two that didn't strictly have to be in there (it's hard when it's this big to notice).
Motivation
The goals are to increase cleanliness and separation between the different representations of profiles. We now have
api::*
,internal::*
, andpprof::*
, which are fairly contained and separated. There should probably be more cleaning up of theinternal::*
parts, but this PR is big enough already.Additional Notes
I would start the review by looking at
profiling/src/profile/internal/mod.rs
, and reading the 3 trait definitions first.How to test the change?
Since this targets the internal representation, existing tests ought to be good enough.