-
Notifications
You must be signed in to change notification settings - Fork 3
Conversation
Changing to "I am a...." based on maintainer request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a quick look 🙈
enum SymbolicInfo { | ||
SYMBOLIC_INFO_UNSPECIFIED = 0; | ||
SYMBOLIC_INFO_FULL = 1; | ||
SYMBOLIC_INFO_FUNCTIONS_ONLY = 2; | ||
SYMBOLIC_INFO_NO_INLINE_FRAMES = 3; | ||
} | ||
|
||
SymbolicInfo symbolic_info = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of SymbolicInfo
we maybe should use repeated Label
here. Similar to the original discussion open-telemetry#488 (comment)
// The object this entry is loaded from. This can be a filename on | ||
// disk for the main binary and shared libraries, or virtual | ||
// abstractions like "[vdso]". | ||
uint32 filename_index = 4; // Index into string table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add some
uint32 filename_index = 4; // Index into string table | |
File file = 4; |
and later on :
message File {
uint32 filename_index = 1; // Index into string table
// A unique identifier of a file. The ID is a 16-byte array.
bytes file_id = 2;
}
Filename are usually less unique. And to combine it, message Profile
could be extended to hold a repeated File file_tabe
, so that different locations can reference it.
// A unique identifier of a trace that this linked span is part of. The ID is a | ||
// 16-byte array. | ||
bytes trace_id = 1; | ||
// A unique identifier for the linked span. The ID is an 8-byte array. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// A unique identifier for the linked span. The ID is an 8-byte array. | |
// A unique identifier for the linked span. The ID is an 16-byte array. |
Similar to the trace_id
a 16 byte array should be preferred over a 8-byte array for global uniqueness.
uint64 sample_size = 2; | ||
|
||
// CPU / memory /etc | ||
string type = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For easier handling, maybe we could use an enum here?
string type = 3; | |
enum SamplingType { | |
SAMPLING_TYPE_UNSPECIFIED = 0; | |
SAMPLING_TYPE_CPU = 1; | |
[...] | |
} | |
[..] | |
SamplingType type = 3; |
SymbolicInfo symbolic_info = 6; | ||
|
||
// TODO(@petethepig): I wonder if we need something a little more specialized here | ||
repeated uint32 attribute_set_indices = 7; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could potentially be better if inlined
|
||
message Profile { | ||
repeated SampleType sample_types = 1; | ||
repeated Stacktrace stacktraces = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: perhaps stack_traces
? as two words
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not. "stacktrace" is fine as a term, I actually like it more. As long as we are consistent in using it as one word :)
// case the line information above represents one of the multiple | ||
// symbols. This field must be recomputed when the symbolization state of the | ||
// profile changes. | ||
bool is_folded = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field is fairly exotic. I'd drop it for now.
// it could be the contents of the .note.gnu.build-id field. | ||
uint32 build_id_index = 5; // Index into string table | ||
|
||
enum SymbolicInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in the last sync, it's unclear whether we want to start with tracking all these symbolization options from the start.
Aside, I also find "info" a bit too generic, so I'd name this enum SymbolFidelity
if we decide to keep it.
And then in order to simplify it, I also mentioned in the sync that maybe we should start with just
enum SymbolFidelity {
SYMBOL_FIDELITY_UNSPECIFIED = 0;
SYMBOL_FIDELITY_FULL = 1;
}
But one downside of this is that given this choices, producers will likely start to put the FULL
value when any symbolic info is present.
Given that, I wonder if we should start with a simple bool is_symbolized
field and once we encounter a real need for carrying fidelity information, we add it then. Possibly by adding the separate SymbolFidelity
enum field - I don't see much problem with having two fields where one is just a yes/no boolean and the other one "zooms" into the details.
open-telemetry#498) Co-authored-by: Carlos Alberto Cortez <calberto.cortez@gmail.com>
This PR was previously opened in opentelemetry-proto repo. Now it lives in this new
opentelemetry-proto-profile
repo.Changelog from previous PR:
function_id
tofunction_index
for clarity.has_functions
,has_filenames
, etc with an enum (considering removing altogether)Relevant documents / previous discussions:
The overall design is based on/heavily inspired by:
This PR consists of 2 parts:
profiles.proto
andprofiles_service.proto
which are mostly boilerplate and more or less finished from my perspectiveProfile
implementations, still very much work-in-progresspprof
— included as a reference and as a baseline for future benchmarksdenormalized
— most messages are embedded, no string table, a lot of potential for duplication => easier to work with, potentially larger payloadsnormalized
— most messages are linked using a foreign key, includes a string table, not much duplication => harder to work with, potentially smaller payloadsThings that are currently supported:
Profile
(similar to how it's implemented in pprof and used in go memory profiles via multiplevalue
values)AggregationTemporality
for eachSampleType
(e.g something likealloc_space
in go is cumulative)original_payload
field)Design TODOs / Areas where feedback would be most valuable:
The biggest thing is we need to decide which one of these approaches (denormalized
/normalized
/ mix of the two) we're going to take.I personally think that the optimal end product should be closer togoing withnormalized
implementation, especially considering that on-the-wire efficiency is a particularly important part of OTEL's profiling vision. But we still need to do some benchmarks to confirm that that would actually be significantly more efficient.normalized
.Figure out what to do with pprof messages likekeepMapping
— I'm not sure if we need thoseSampleType
message. CurrentlyType
andUnit
are strings. We might want to turn them into enums, or standardize them some other way, e.g making those a part of OTEL semantic conventionsImplementation TODOs / next steps
Reference Diagrams
For reference, here are graph representations of other signals:
metrics
logs
traces
And these are graph representations of alternatives formats under consideration:
pprof for reference
denormalized
normalized