-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate that duplicate metrics across receivers are intentional #26499
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
My 2c:
|
Notes from the SIG meeting: This should act like a linter, where it alerts the user in the case of non unique metric names (by making the validation fail), to make sure that it is intentional. If it is intentional, the user can input the metric name in an allow list.
With the approach above, this is not disallowed, but more of a confirmation check.
Can you clarify what you mean here ?
Agreed 👍 |
@mackjmr @ "link this to the semantic conventions" I might be thinking too far ahead, and it's likely not a blocking prerequisite. That said, the ideal state/"north star"/eventual convergence(?) of metric and attribute namings is coupled to the opentelemetry semantic conventions. To my understanding the current plan is that only a subset of all metrics/attributes emitted would be tracked by semantic conventions, but I could see motivation for 100% coverage of all sdk instrumentation+collector metric/attribute emission being tracked by semantic conventions. Summary: Tl;Dr: if we see duplicate metric names between receivers, and said metric_name exists in semantic conventions spec, we probably wish to keep the names as-is (assuming they fulfill semantic intent). If the duplicates aren't in semconv, we may eidh to bring the collision to the semantic conventions working group. |
Tldr:tldr: we should additionally discuss this in the semantic conventions working group if we haven't already |
@hughesjj Thank you for clarifying. Agreed, I think that if semantic conventions on metric_name exist, validating whether the duplicate metrics are OK should be a matter of ensuring they both fulfill semantic intent (and perhaps also making sure there is a way to differentiate them with attributes, e.g. scope name). On the other hand, if semantic conventions on metric_name do not exist, it can become difficult to validate. I will bring this issue to the Semantic Conventions SIG to discuss 🙂 This issue is not specific to receivers, and some SDKs are also affected by this, so perhaps this can also be a broader discussion. |
Issue: open-telemetry#26499 If semantic conventions on the duplicate metric exist, validating is a matter of ensuring they both fulfill semantic intent. For Validating duplicate metrics for which there is no semantic conventions, additional discussion is needed with the semantic conventions WG as mentioned in the issue. For the time being, it will be best judgment.
Added this issue to the Semantic Convention WG on Monday: https://docs.google.com/document/d/10xG7DNKWRhxNmFGt3yYd3980a9uwS8lMl2LvQL3VNK8/edit. |
Notes from semantic conventions WG meeting
JVM metrics had same issue, and decision was that for duplicates it was opt-in on 1 side to avoid collisions (e.g. only 1 enabled by default). May not be a good default experience to have duplicate metrics, but should be opt in. This issue should be raised in the specification meeting. |
From the Specification SIG: There was a lot of discussion (between 4:50=>24:24), TLDR is: Two pieces of instrumentation providing the same data is allowed. That said, it can be a problem but there is not a good solution to solve this in a generic way. Should be a collector decision how this is addressed. We could have a way to detect two identical metrics flowing through the collector (this is not only a problem if there are duplicate metrics across receivers, but can also be an issue when there is a misconfiguration and the same receiver is pointed twice to the same instance). For the collector, I think that #26687 is a good step so that we are aware that there are duplicates. In the future, we may want to extend mdatagen to document duplicates in the Generated Readme so that users are aware, and if there are reports of duplicates being a bad user experience, perhaps we can consider having a rule that only 1 should be enabled at a time by default like it was done for JVM metrics in SDKs. cc @dmitryax and @hughesjj if there are any concerns for: #26687. |
The approach sounds good to me. And I like the receivers group concept like |
**Description:** This test alerts a user if a metric is duplicated across receivers, and instructs the user to add said metric to an allow list if this is intentional. If semantic conventions on the duplicate metric exist, validating is a matter of ensuring they both fulfill semantic intent. For validating duplicate metrics for which there is no semantic conventions, additional discussion is needed with the semantic conventions WG as mentioned in the issue. For the time being, it will be best judgment. **Link to tracking Issue:** #26499
…etry#26687) **Description:** This test alerts a user if a metric is duplicated across receivers, and instructs the user to add said metric to an allow list if this is intentional. If semantic conventions on the duplicate metric exist, validating is a matter of ensuring they both fulfill semantic intent. For validating duplicate metrics for which there is no semantic conventions, additional discussion is needed with the semantic conventions WG as mentioned in the issue. For the time being, it will be best judgment. **Link to tracking Issue:** open-telemetry#26499
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
Component(s)
cmd/mdatagen
Describe the issue you're reporting
This is a proposal to disallow having metrics with the same name across different receivers to avoid metric collisions. This will be enforced by having the mdatagen validation fail in the case of non-unique metric names defined in
metadata.yaml
across receivers.That said, there is likely to be exceptions to this rule. For example, there is some discussion in #24238 as to whether we should allow using the same metric name in receivers which collect metrics that represent the same underlying data.
There should be a way for to have exceptions, and allow different receivers to emit the same metric, but the general rule should be that this is not allowed.
I propose that this should be allowed for receivers that collect data from the same underlying technology (e.g. container runtime) but different implementations (e.g. docker, containerd, cri-o). Fundamentally, this should be for receivers that act as replacements for collecting the same data, and that cannot be used in conjunction.
To address this, I think we should have a field in the
metadata.yaml
, e.g.group
to indicate that receivers pull the same data. For instance, in the case of adockerstats
receiver +containerd
receiver, they would both be part of groupcontainer runtime
. For the validation, only receivers from a same group will be allowed to have non-unique metric names.The text was updated successfully, but these errors were encountered: