-
Notifications
You must be signed in to change notification settings - Fork 778
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust Cardinality Limit to Accommodate Internal Reserves #5382
Adjust Cardinality Limit to Accommodate Internal Reserves #5382
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #5382 +/- ##
==========================================
+ Coverage 83.38% 84.65% +1.27%
==========================================
Files 297 281 -16
Lines 12531 12098 -433
==========================================
- Hits 10449 10242 -207
+ Misses 2082 1856 -226
Flags with carried forward coverage won't be shown. Click here to find out more.
|
I feel the key here is less about dealing with off-by-one (or off-by-two), it is more about user experience/education/expectation. @rajkumar-rangaraj could you update opentelemetry-dotnet/src/OpenTelemetry/Metrics/MetricStreamConfiguration.cs Lines 103 to 114 in f5b7f9c
|
Could you please review the updated PR description? If you agree with the changes, I will proceed to update the code comments. |
The direction looks good to me 👍 |
@rajkumar-rangaraj The SDK MUST create an Aggregator with the overflow attribute set prior to reaching the cardinality limit and use it to aggregate events for which the correct Aggregator could not be created. The maximum number of distinct, non-overflow attributes is one less than the limit, as a result. If we were to stick with the spec wording (not yet stable, still..), then we should count the overflow attribute as contributing to the overall limit. Maybe just do the zero tag case in this PR, and wait for the spec to finalize. |
SDKs SHOULD support being configured with a cardinality limit. The number of unique combinations of attributes is called cardinality. If we were to stick with spec wording, then we should not treat zero tag as a special case either. That should also contribute to the overall limit. |
Spec is not saying anything about zero tags, but spec explicitly talks about overflow contributing to the overall limit. |
Spec says "The number of unique combinations of attributes is called cardinality." A measurement with zero tags qualifies as a unique combination. |
Got it. Our only option is then to break the spec compliance and/or influence the spec to soften the wording there, to allows us to special case. After all, what we are doing is aligned with the OTel mission. |
+1 to adjust specification. We should avoid any differences where possible. Especially if the part of the specification is still unstable and can be easily modified. |
open-telemetry/opentelemetry-specification#3904 (comment) should give us what we need. |
Hey FYI we have this block of code currently: opentelemetry-dotnet/src/OpenTelemetry/Metrics/MetricReaderExt.cs Lines 183 to 190 in 28ead76
If we are going to auto-expand whatever the user chooses for - if (isEmitOverflowAttributeKeySet)
- {
- // We need at least two metric points. One is reserved for zero tags and the other one for overflow attribute
- if (cardinalityLimit > 1)
- {
- this.emitOverflowAttribute = true;
- }
- }
+ this.emitOverflowAttribute = isEmitOverflowAttributeKeySet; |
@@ -1429,7 +1431,7 @@ int MetricPointCount() | |||
} | |||
|
|||
meterProvider.ForceFlush(MaxTimeToAllowForFlush); | |||
Assert.Equal(MeterProviderBuilderSdk.DefaultCardinalityLimit, MetricPointCount()); | |||
Assert.Equal(MeterProviderBuilderSdk.DefaultCardinalityLimit + additionalReserve, MetricPointCount()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should verify that we don't allow the user to exceed the cardinality cap. The right thing to test here now would be to check that count of metric points exported (excluding zero tags and overflow attribute) is equal to MeterProviderBuilderSdk.DefaultCardinalityLimit
. We probably need to update MetricPointCount()
method to return the count of unreserved MetricPoints instead of all the MetricPoints that were exported in that particular Collect cycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overflow is guarded with an experimental flag. In cases where the overflow experimental attribute is not set, the value will be the cardinality limit + 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you have now updated MetricPointCount()
method definition to account for that.
test/OpenTelemetry.Tests/Metrics/MetricOverflowAttributeTestsBase.cs
Outdated
Show resolved
Hide resolved
test/OpenTelemetry.Tests/Metrics/MetricOverflowAttributeTestsBase.cs
Outdated
Show resolved
Hide resolved
Thanks @utpilla for the very detailed review. I appreciate the time you spent on the quality review. I’ve sent an update, please take a look when you get a chance. |
Discussion #5349 (comment)
Changes
Currently, From the cardinality limit, we internally reserve one spot for zero tags and another spot for overflow if the emit overflow experimental flag is enabled. A few customers have raised concerns that they receive 1 or 2 dimensions less than the specified cardinality limit. For example, consider a scenario where a user sets the cardinality limit to 10. If the emit overflow experimental flag is enabled, we currently set aside 2 of those spots—one for zero tags and one for overflow cases. This means the user effectively gets to use only 8 of their 10 slots for their dimensions.
This proposed change aims to address concerns regarding the effective reduction of the cardinality limit available to users. The key concept here is that the cardinality limit should apply exclusively to metrics that include dimensions, treating metrics with zero dimensions and overflow metrics as special cases that are exempt from the user's specified limit. Currently, the emit overflow attribute functionality is behind an experimental flag, but it is planned for incorporation into the stable SDK. To simplify future adjustments and to provide users with the full extent of their specified limits, we propose adding two additional spots on top of the cardinality limit.
Example:
Imagine a user sets a cardinality limit of 10, expecting to track metrics across up to 10 unique dimension combinations. Under the proposed change, to ensure the user's limit is fully applicable to dimensioned metrics, we would internally adjust the limit to 12. This adjustment accounts for one spot reserved for metrics with zero dimensions and another for overflow metrics, ensuring that the user has a full quota of 10 slots available for their dimensioned metrics. This way, regardless of any special cases, the user effectively retains the full capacity of their specified cardinality limit for metrics that include dimensions.
Merge requirement checklist
CHANGELOG.md
files updated for non-trivial changes