Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduces Profiling Data Model v2 #239

Merged
merged 37 commits into from
Feb 23, 2024

Conversation

petethepig
Copy link
Member

@petethepig petethepig commented Nov 2, 2023

This is second version of the Profiling Data Model OTEP. After we've gotten feedback from the greater OTel community we went back to the drawing board and came up with a new version of the data model. The main difference between the two versions is that the new version is more similar to the original pprof format, which makes it easier to understand and implement. It also has better performance characteristics. We've also incorporated a lot of the feedback we've gotten on the first PR into this OTEP.

Some minor details about the data model are still being discussed and will be flushed out in the future OTEPs. We intend to finalize these details after doing experiments with early versions of working client + collector + backend implementations and getting feedback from the community. The goal of this OTEP is to provide a solid foundation for these experiments.

So far we've done a number of things to validate it:

  • we've written a new profiles proto described in this OTEP

  • we've documented decisions made along the way in a decision log

  • we've done benchmarking to refine the data representation (see Benchmarking section in a collector PR)

  • diff between original pprof and the new proto: link

We're seeking feedback and hoping to get this approved.


For (a lot) more details, see:

@petethepig petethepig marked this pull request as ready for review November 7, 2023 18:18
@petethepig petethepig requested a review from a team November 7, 2023 18:18
@mtwo
Copy link
Member

mtwo commented Nov 20, 2023

@carlosalberto @jack-berg can we mark this as triaged with priority p0?

@tigrannajaryan
Copy link
Member

tigrannajaryan commented Nov 20, 2023

The main difference between the two versions is that the new version is more similar to the original pprof format

@petethepig More similar, but not fully backwards compatible with pprof?

(I haven't had a chance to review the OTEP yet, just wanted to understand what to expect before I started revewing).

[UPDATE]: I have now reviewed the PR, which mostly answers this question. I still have a follow up comment/question on pprof compatiblity, see below.

Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much difference does AnyValue/KeyValue interning make? I would prefer not to have a different type just for profiles. If it is a significant difference we may want to consider adding interning support to existing AnyValue/KeyValue.

text/profiles/0239-profiles-data-model.md Show resolved Hide resolved
text/profiles/0239-profiles-data-model.md Outdated Show resolved Hide resolved
text/profiles/0239-profiles-data-model.md Show resolved Hide resolved
text/profiles/0239-profiles-data-model.md Show resolved Hide resolved
@aalexand
Copy link

How much difference does AnyValue/KeyValue interning make? I would prefer not to have a different type just for profiles. If it is a significant difference we may want to consider adding interning support to existing AnyValue/KeyValue.

One thing is interning, another thing that pprof labels have is unit support. E.g. int64 bytes can be distinguished from int64 microseconds or count of something. This is used by pprof for using appropriate unit suffixes and scaling the label values appropriately.

@petethepig
Copy link
Member Author

petethepig commented Nov 30, 2023

@tigrannajaryan

How much difference does AnyValue/KeyValue interning make? I would prefer not to have a different type just for profiles. If it is a significant difference we may want to consider adding interning support to existing AnyValue/KeyValue.

We did some benchmarking for that. Here are the results (it's also in this spreadsheet, "Attribute Represenations Summary" sheet):
Screenshot 2023-11-29 at 6 16 09 PM

And here's the difference between the 3 implementations:

// Extended
message Sample {
  repeated opentelemetry.proto.common.v1.KeyValue attributes = 9;
}


// ExtendedInterned
message Sample {
  repeated opentelemetry.proto.common.v1.KeyValueInterned attributes = 6;
}


// ExtendedLookup
message Profile {
  // lookup table
  repeated opentelemetry.proto.common.v1.KeyValue attributes = 9;
}

message Sample {
  repeated uint64 attributes = 10;
}

To be clear, in the OTEP we're using ExtendedLookup represenation. It does not do the same kind of interning as it's done in original pprof (that would be ExtendedInterned).

ExtendedInterned uses a string_table for string values and ExtendedLookup uses an attributes lookup table. Extended uses a schema that's more common for other OTEL signals where the attributes are embedded inside other messages (Sample in our case).

@aalexand
Copy link

How much difference does AnyValue/KeyValue interning make? I would prefer not to have a different type just for profiles. If it is a significant difference we may want to consider adding interning support to existing AnyValue/KeyValue.

One thing is interning, another thing that pprof labels have is unit support. E.g. int64 bytes can be distinguished from int64 microseconds or count of something. This is used by pprof for using appropriate unit suffixes and scaling the label values appropriately.

We still need units for labels I think.

@tigrannajaryan
Copy link
Member

If label/attribute units are critical I suggest that we start this discussion early.

I looked at other signal types and there is a few attribute conventions that would benefit from units (but not many):

Span attributes (all bytes):
message.compressed_size
message.uncompressed_size
messaging.message.body.size
messaging.message.envelope.size
faas.max_memory

No many metric attributes that can have units, I was only able to find one (WAh):
hw.battery.capacity

Similarly just one resource attribute with unit (bytes):
host.cpu.cache.l2.size

There may be others that I missed.

You can file an issue in https://github.com/open-telemetry/opentelemetry-proto to begin the discussion.

@jsuereth jsuereth merged commit dc619df into open-telemetry:main Feb 23, 2024
2 checks passed
Kielek added a commit to Kielek/opentelemetry-dotnet-instrumentation that referenced this pull request Feb 26, 2024
Kielek added a commit to Kielek/opentelemetry-dotnet-instrumentation that referenced this pull request Feb 26, 2024
Kielek added a commit to Kielek/opentelemetry-dotnet-instrumentation that referenced this pull request Mar 1, 2024
Kielek added a commit to Kielek/opentelemetry-dotnet-instrumentation that referenced this pull request Mar 4, 2024
florianl added a commit to florianl/oteps that referenced this pull request Mar 6, 2024
For `abc;def` the `locations_start_index` should be `4` as `2` points to `baz`.

Follow up of open-telemetry#239 (comment)
tigrannajaryan pushed a commit to open-telemetry/opentelemetry-proto that referenced this pull request Apr 23, 2024
This is a follow up to [OTEP 239: Introduces Profiling Data Model v2](open-telemetry/oteps#239)

The main motivation behind this PR is that this will allow us to start experimenting with the profiles proto in opentelemetry-collector. I marked the profiles part as `Experimental` to indicate that this is not a final version of the data model.

I copied the proto from the OTEP, and moved `pprofextended.proto` from `profiles/v1/alternatives/pprofextended.proto` to just `profiles/v1/pprofextended.proto`. I did this because I figured we no longer have alternative representations and this will reduce confusion for people outside of Profiling SIG. The rest of the proto stayed the same.

I tested this file with a collector fork and I it compiles properly.
carlosalberto pushed a commit that referenced this pull request May 28, 2024
For `abc;def` the `locations_start_index` should be `4` as `2` points to
`baz`.

Follow up of
#239 (comment)
VinozzZ pushed a commit to VinozzZ/opentelemetry-proto that referenced this pull request Jun 21, 2024
This is a follow up to [OTEP 239: Introduces Profiling Data Model v2](open-telemetry/oteps#239)

The main motivation behind this PR is that this will allow us to start experimenting with the profiles proto in opentelemetry-collector. I marked the profiles part as `Experimental` to indicate that this is not a final version of the data model.

I copied the proto from the OTEP, and moved `pprofextended.proto` from `profiles/v1/alternatives/pprofextended.proto` to just `profiles/v1/pprofextended.proto`. I did this because I figured we no longer have alternative representations and this will reduce confusion for people outside of Profiling SIG. The rest of the proto stayed the same.

I tested this file with a collector fork and I it compiles properly.
VinozzZ pushed a commit to VinozzZ/opentelemetry-proto that referenced this pull request Jun 21, 2024
This is a follow up to [OTEP 239: Introduces Profiling Data Model v2](open-telemetry/oteps#239)

The main motivation behind this PR is that this will allow us to start experimenting with the profiles proto in opentelemetry-collector. I marked the profiles part as `Experimental` to indicate that this is not a final version of the data model.

I copied the proto from the OTEP, and moved `pprofextended.proto` from `profiles/v1/alternatives/pprofextended.proto` to just `profiles/v1/pprofextended.proto`. I did this because I figured we no longer have alternative representations and this will reduce confusion for people outside of Profiling SIG. The rest of the proto stayed the same.

I tested this file with a collector fork and I it compiles properly.
VinozzZ pushed a commit to VinozzZ/opentelemetry-proto that referenced this pull request Jun 21, 2024
This is a follow up to [OTEP 239: Introduces Profiling Data Model v2](open-telemetry/oteps#239)

The main motivation behind this PR is that this will allow us to start experimenting with the profiles proto in opentelemetry-collector. I marked the profiles part as `Experimental` to indicate that this is not a final version of the data model.

I copied the proto from the OTEP, and moved `pprofextended.proto` from `profiles/v1/alternatives/pprofextended.proto` to just `profiles/v1/pprofextended.proto`. I did this because I figured we no longer have alternative representations and this will reduce confusion for people outside of Profiling SIG. The rest of the proto stayed the same.

I tested this file with a collector fork and I it compiles properly.
VinozzZ pushed a commit to VinozzZ/opentelemetry-proto that referenced this pull request Jun 21, 2024
This is a follow up to [OTEP 239: Introduces Profiling Data Model v2](open-telemetry/oteps#239)

The main motivation behind this PR is that this will allow us to start experimenting with the profiles proto in opentelemetry-collector. I marked the profiles part as `Experimental` to indicate that this is not a final version of the data model.

I copied the proto from the OTEP, and moved `pprofextended.proto` from `profiles/v1/alternatives/pprofextended.proto` to just `profiles/v1/pprofextended.proto`. I did this because I figured we no longer have alternative representations and this will reduce confusion for people outside of Profiling SIG. The rest of the proto stayed the same.

I tested this file with a collector fork and I it compiles properly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.