-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support maps and heterogeneous arrays as attribute values #2888
Support maps and heterogeneous arrays as attribute values #2888
Conversation
I have a couple of questions. Do we necessarily need nested maps for HTTP headers? Is there a reason why we couldn't use If we were to accept this change, would this allow attributes such as Lastly, I'm curious if any backends allow querying on nested attributes. Ones that I am familiar with will allow queries such as |
That is what the current PR does, #2841, but it requires splitting up each piece (filename, line number, etc) into a separate array of values which would then be put back together. It is hardly ideal even if it technically works with some deconstruction :) |
Yes, we can do that, but I don't think it is a great solution. It gets worse when the values are non-primitives since now you have to have parallel arrays (e.g. the stacktrace case). Is it doable? Yes. Is it a good way to express such data? I don't think so.
I think we will want to have a follow up PR that makes a proposal for that. This PR doesn't try to address that.
I believe there are backends which accept nested attributes. I remember looking at Collector exporters and IIRC I found a few which didn't do flattening but kept the original nesting. |
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
I heard several times that we need this. Reopening and moving from a draft to a proper PR to discuss more widely. |
@open-telemetry/specs-approvers PTAL. |
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
97f8247
to
1243c4b
Compare
34e5311
to
d8a488d
Compare
@jpkrohling Here is an example from Elastic's demo website readily available: See for example the "kibana.saved_objects" field. See how it is a map of arrays of strings: Another example to look at is this: https://demo.elastic.co/app/discover#/doc/filebeat-*/.ds-logs-apm.error-default-2023.12.22-000014?id=n7DyGI0BGTVq0q2e7pZw See _source and how it contains an "error" which is a map of array of map of array of maps of primitives or maps. Complex, nested data happens all the time in the logging world. |
@jpkrohling this is not a new proposal. It is already part of the spec, written in a Stable document. See here: https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/common#attribute I am quoting the spec:
|
@@ -236,12 +237,14 @@ Disclaimer: this list of features is still a work in progress, please refer to t | |||
| OTLP File exporter | | | - | | - | | | | | | - | | | |||
| Can plug custom LogRecordExporter | | | + | | | | | + | | + | | | | |||
| Trace Context Injection | | | + | | + | | | + | | + | + | | | |||
| Non-homogeneous arrays and maps (including nested) for attribute values| | | | | | | | | | | | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JavaScript has this as experimental as LogAttributes
https://github.com/open-telemetry/opentelemetry-js/blob/bf8714edcba8a2384db857696a1608215548d182/experimental/packages/api-logs/src/types/LogRecord.ts#L20-L22
I support this change in its current form, but prefer a slightly different option:
This design accomplishes symmetry while avoiding artificial constraints on how data can be represented for spans and logs where real use cases exist for complex types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also (currently) agree with #2888 (comment).
## Changes Change the type of `Resource` to reference the definition in the SDK so that the resource's scheme URL is included in logs data model. ## Why The Logs Data Model misses scheme URL of resource. All of the languages reuse a common type representing resource. - Java (stable logs): [reuses a common resource type](https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk/logs/src/main/java/io/opentelemetry/sdk/logs/data/LogRecordData.java) - .NET (stable logs): [reuses a common resource type](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Exporter.OpenTelemetryProtocol/Implementation/OtlpLogRecordTransformer.cs) - C++ (stable logs): [reuses a common resource type](https://github.com/open-telemetry/opentelemetry-cpp/blob/main/sdk/src/logs/read_write_log_record.cc) - Python (experimental logs): [reuses a common resource type](https://github.com/open-telemetry/opentelemetry-python/blob/main/opentelemetry-sdk/src/opentelemetry/sdk/_logs/_internal/__init__.py) These changes can be seen as breaking. However, there is 100% precedence of how log attributes and resource are currently handled in all languages. I see the current description as a bug because we would miss scheme URL of the resource. AFAIK the intention of the OTel Logs Specification authors was to have only nested values for Body and Attributes. Related PRs: - #2888 - open-telemetry/opentelemetry-go#4809
I found this comment to be particularly surprising, as it was not the definition of "vendor neutrality" I had in mind, and by some of the arguments on this issue, it seems that I'm not alone in my confusion. I wasn't able to find anything on the OpenTelemetry website that clearly laid these expectations out with the same clarity as this explanation. Should this be documented more explicitly (or if it's already documented, moved somewhere easier to find) to avoid reader misunderstanding? |
@jack-berg we have already defined attributes differently for logs than for all the other signals. The As was pointed out here, the purpose of this PR is not to change the logs definition. This change will only affect the existing signals and resources that already use these attributes. I don't think we can justify making a change to attributes used by the trace and metric signal and resources so that they can be "unified" with logs when logs already uses a different definition and will not be migrated. Footnotes
|
Have there been any asks by users to add these attribute types to tracing or metrics? |
@MrAlias Isn't the logs attribute type semantically the same as the common attribute proposed in this PR? If not then can you point out the difference? Take notice that the common attributes specification already references log attributes (which in my opinion suggests that this is the same type): opentelemetry-specification/specification/common/README.md Lines 121 to 128 in 65c7dff
|
@MrAlias Yes for traces, a couple issues are linked in the PR description - the stacktrace and the payload use cases. |
Semantically equivalent is not the same as identical. Implementations that have developed a logs attribute type to conform with the specification cannot simply change out that type to another without breaking the API. For example if in Go we build: type LogAttribute struct { .... }
func (Record) SetAttribute(...LogAttribute) We cannot simply redefine to this: func (Record) SetAttribute(...attribute.KeyValue) Any user of the API will break when they try to upgrade. |
I agree. This is why I do not want to release a stable OTel Go Logs API module until this issue/PR is resolved.
This will "negatively" (breaking changes) affect some non-stable implementations (AFAIK Rust and JS). But it could "positively" impact all stable implementations (and Python):
|
I don't believe that this will be an issue for JS as we previously had a |
If we are able to retroactively make breaking changes to a stable portion of the specification because no stable implementation is compliant I think we can just replace our versioning and stability guidelines with:
😆 In all seriousness though, that is going to have to be a separate issue than this one. The TC and likely GC (cc @open-telemetry/governance-committee @open-telemetry/technical-committee ) will need to weight our reputation loss in not doing what we say we will with the desire to correct this situation. I just wanted to point this out as @jack-berg had mentioned conformity and reuse of the existing attribute definition as motivation for this change, but it has been pointed out by @tigrannajaryan and others that that is not the intent of this PR. |
Thank you for your clarifications and patience, @tigrannajaryan. I also appreciate @jack-berg's options and I agree with them as well. We are also aligned that we shouldn't aim for the lowest common denominator, and I hope my previous messages didn't imply that. One of my goals with my questions is to ensure that open-source users can take advantage of the solutions we are giving to the problems they have. And if they can't, what does it mean? That the open-source ecosystem doesn't care about this feature, because users don't care about them? Or that there are other easier ways to solve the same problem? In my opinion, what we shouldn't be doing is creating a feature without taking the broader open-source ecosystem in consideration. My other major point was in having consistency among signals: if the use-case isn't strong enough, we probably shouldn't break the consistency we have now, even if we have inconsistencies elsewhere. I believe those inconsistencies are smaller compared to ruling out metrics for this proposal here. Your example of Elastic helps in showing one open-source solution for logs using complex attribute values. While it isn't open-source, I take that it works for OpenSearch as well, although a confirmation would be desirable. And given that we have @yurishkuro's approval of this issue here, I believe Jaeger users are represented as well, so, I'm happy with the tracing side of things. It's still not clear to me whether users can use that information for their observability scenarios ("show me how all traces containing stacktraces for the line 53 of KotlinExtensions.kt"), or if it's only for reference once the right data point (log or span) has been found. This is what I meant by an end-to-end example: how do we envision things happening to solve the problem at hand. We still have a situation regarding metrics, though, and the one thing I'd ask before moving forward is to sync with the Prometheus folks and come up with an agreement on how to handle this situation. Are they happy to just drop this data? Or to serialize it as JSON? Would they be open to implementing a different way to store this data and not keep it in TSDB? If we don't engage with them, I predict that they will be mad at us because we are specifying something we know they don't support. Why Prometheus? Because it's one of the most relevant, if not the most relevant, open-source metrics database out there, especially for our audience. If the Prometheus WG folks (@dashpole comes to mind) are OK with this change, I'm certainly OK with it as well. |
- A primitive type: string, boolean, double precision floating point (IEEE 754-1985) or signed 64 bit integer. | ||
- An array of primitive type values. The array MUST be homogeneous, | ||
i.e., it MUST NOT contain values of different types. | ||
- An array of values of primitive type [before 1.29.0]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also include a byte array (as for logs attributes)?
Prometheus only supports string label values, so we already need to deal with similar cases today for arrays. Thus far, we've followed https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/common/README.md#attribute, and serialized to json for all of those cases, rather than drop. TBH, I don't think I've ever seen a prometheus metric with an array-typed label from OTel, although I suspect it happens occasionally when we encode Resource as the I definitely agree with the sentiment that prometheus users don't want large, structured, json-encoded attribute values in their metrics. Even adding namespaces to metric labels is seen as too verbose by prometheus folks (although I understand the rationale). In practice, many prometheus users set If, like arrays, maps are rarely used for metric label values, I don't think this will be a huge issue. We can JSON-encode them when they do appear and let users deal with those strings. But I don't see this being useful for metrics, and would personally be against using these in semantic conventions for metrics. |
I spent a bit more time thinking about this issue and came to a conclusion that this change is non-trivial and likely warrants an OTEP. I don't think a PR is the right mechanism for a change like this. I am going to close the PR. I don't plan to create the OTEP myself right now since I don't have time. If someone else wishes to create the OTEP, I can review it. If such an OTEP is proposed I think it needs to at least:
To @pellared and @MrAlias, who are waiting on the resolution to this. I see 3 options:
There is probably some other alternative I am not seeing right now. Thank you everyone for patience and for helpful feedback and comments. |
If we aren't going to accept complex attribute types (#2888) we should explicitly rule them out of future designs. Doing so cements the idea that attributes are "metadata" instead of "data", since if attributes were data, we would not want to artificially limit their structure. Once its clear that attributes are metadata and restricted to a limited set of types, its easy to determine that use cases which require complex types (like event payloads) should seek to put the data elsewhere (like in a log record body). While I was in favor of supporting complex attribute types (#2888) I believe its more important that we commit one way or the other. The uncertainty around the question of whether this type of evolution will occur has muddied the waters of several related conversations. There was consensus on codifying this in the 1/30/24 spec SIG meeting. We should capitalize on this momentum and get this over the finish line. Stalling out just to revisit this same debate in the future is a bad use of time.
## Changes Change the type of `Resource` to reference the definition in the SDK so that the resource's scheme URL is included in logs data model. ## Why The Logs Data Model misses scheme URL of resource. All of the languages reuse a common type representing resource. - Java (stable logs): [reuses a common resource type](https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk/logs/src/main/java/io/opentelemetry/sdk/logs/data/LogRecordData.java) - .NET (stable logs): [reuses a common resource type](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Exporter.OpenTelemetryProtocol/Implementation/OtlpLogRecordTransformer.cs) - C++ (stable logs): [reuses a common resource type](https://github.com/open-telemetry/opentelemetry-cpp/blob/main/sdk/src/logs/read_write_log_record.cc) - Python (experimental logs): [reuses a common resource type](https://github.com/open-telemetry/opentelemetry-python/blob/main/opentelemetry-sdk/src/opentelemetry/sdk/_logs/_internal/__init__.py) These changes can be seen as breaking. However, there is 100% precedence of how log attributes and resource are currently handled in all languages. I see the current description as a bug because we would miss scheme URL of the resource. AFAIK the intention of the OTel Logs Specification authors was to have only nested values for Body and Attributes. Related PRs: - open-telemetry#2888 - open-telemetry/opentelemetry-go#4809
…-telemetry#3858) If we aren't going to accept complex attribute types (open-telemetry#2888) we should explicitly rule them out of future designs. Doing so cements the idea that attributes are "metadata" instead of "data", since if attributes were data, we would not want to artificially limit their structure. Once its clear that attributes are metadata and restricted to a limited set of types, its easy to determine that use cases which require complex types (like event payloads) should seek to put the data elsewhere (like in a log record body). While I was in favor of supporting complex attribute types (open-telemetry#2888) I believe its more important that we commit one way or the other. The uncertainty around the question of whether this type of evolution will occur has muddied the waters of several related conversations. There was consensus on codifying this in the 1/30/24 spec SIG meeting. We should capitalize on this momentum and get this over the finish line. Stalling out just to revisit this same debate in the future is a bad use of time.
Resolves #376
Use cases where this is necessary or useful:
This is a draft PR to see what the change looks like.
If this PR is merged it will be nice to follow it up with: