[receiver/prometheusreceiver] Fix staleness issue for histograms and summaries #8561
Please add a CHANGELOG.md entry. LGTM conceptually, just a couple minor questions/suggestions.
}

point.SetExplicitBounds(bounds)
point.SetBucketCounts(bucketCounts)
Do we even need to set bucket counts? IIUC, we just needed the bounds to be able to manufacture stale bucket series with NaN values later.
That's true, we don't necessarily need the bucket counts to set the NaN values later. Dropping them would also require a change in the PRW exporter, though, at the line if index >= len(pt.BucketCounts()) {, which stops the loop before any bucket sample is emitted when the counts are missing:
opentelemetry-collector-contrib/pkg/translator/prometheusremotewrite/helper.go
Lines 324 to 335 in fa86bb0
for index, bound := range pt.ExplicitBounds() {
    if index >= len(pt.BucketCounts()) {
        break
    }
    cumulativeCount += pt.BucketCounts()[index]
    bucket := &prompb.Sample{
        Value:     float64(cumulativeCount),
        Timestamp: time,
    }
    if pt.Flags().HasFlag(pdata.MetricDataPointFlagNoRecordedValue) {
        bucket.Value = math.Float64frombits(value.StaleNaN)
    }
It can also be done as a follow-up. I wouldn't be surprised if many exporters don't handle that case correctly.
Description: Fixes a bug where the staleness NaN cannot be sent for histogram buckets and summary quantiles. The buckets and quantiles are not sent with the datapoint when it is marked stale, so components like the PRW exporter cannot mark these series as stale. The actual values of these buckets and quantiles would be the stale NaN, but this change sends the value as 0 instead because these need to be type uint64.
Link to tracking Issue: #8492
Testing: Tested with prometheus receiver -> OTLP exporter. Added test cases for otlp_metricfamily.go similar to the ones that already exist for histogram and quantiles, but instead with stale NaN values.