Define configurable max scale for exponential histogram aggregation #2987
For positive scales, there are significantly faster techniques using lookup tables (mentioned here) that avoid the logarithm. As a drawback, the size of the lookup table scales with 2^scale. There was a discussion about limiting the scale in open-telemetry/opentelemetry-proto#322 (comment), in which I favored an upper limit, not only because the lookup table approach is faster, but also because it would have the potential to standardize the mapping of values to bucket indices, which is not possible with the logarithm due to platform-dependent numerical errors. A sketch of the lookup-table technique is below.
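To make the technique concrete, here is a minimal sketch (class and method names are mine, not taken from NrSketch or DynaHist, and the spec's exact boundary-inclusivity rules are glossed over). The 2^scale log-spaced sub-buckets within each power of two are resolved through a table of 2^(scale+1) equal-width mantissa cells; each cell is narrower than the narrowest sub-bucket, so it straddles at most one boundary and a single comparison finishes the job:

```java
final class SubBucketLookupIndexer {
  private final int[] candidate;     // candidate sub-bucket for each linear mantissa cell
  private final double[] lowerBound; // lower boundary 2^(i / 2^scale) of sub-bucket i

  SubBucketLookupIndexer(int scale) { // scale > 0; table size grows as 2^scale
    int buckets = 1 << scale;
    int cells = buckets << 1; // cell width < narrowest sub-bucket width
    lowerBound = new double[buckets];
    for (int i = 0; i < buckets; i++) {
      lowerBound[i] = Math.pow(2.0, (double) i / buckets);
    }
    candidate = new int[cells];
    for (int c = 0; c < cells; c++) {
      double cellStart = 1.0 + (double) c / cells;
      int idx = buckets - 1;
      while (lowerBound[idx] > cellStart) {
        idx--;
      }
      candidate[c] = idx; // sub-bucket containing the start of cell c
    }
  }

  // mantissa must lie in [1, 2)
  int subBucketIndex(double mantissa) {
    int idx = candidate[(int) ((mantissa - 1.0) * candidate.length)];
    if (idx + 1 < lowerBound.length && mantissa >= lowerBound[idx + 1]) {
      idx++; // the cell straddled a boundary and the value fell past it
    }
    return idx;
  }
}
```

The full bucket index then combines the base-2 exponent with the sub-bucket index, e.g. `(Math.getExponent(value) << scale) + subBucketIndex(Math.scalb(value, -Math.getExponent(value)))` for normal positive values.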
The text mentions that the logarithm method is preferred because it's "nearly as fast and accurate as the lookup table approach". Is there a reference implementation with a performance comparison lying around anywhere? If the lookup table approach requires limiting the scale, that seems like another argument for defining a maximum scale.
@jack-berg The results showed that histograms using a logarithm for indexing (NrSketchSimpleSubBucketLogIndexerRecordingSpeedBenchmark, DynaHistDynamicLogOptimalRecordingSpeedBenchmark, DDSketchPaginatedLogarithmicRecordingSpeedBenchmark) need at least 30 ms to record 1M values, while those using the lookup table approach (NrSketchSimpleSubBucketLookupIndexerRecordingSpeedBenchmark, DynaHistDynamicOpenTelemetryExponentialBucketRecordingSpeedBenchmark) are significantly faster; DynaHistDynamicOpenTelemetryExponentialBucketRecordingSpeedBenchmark can even do it in less than 10 ms. Since adding a value to the histogram also involves range checks and reallocations of buckets, indexing is only part of the total cost, so I think these results are consistent with the 9x performance difference you reported. For devices without a floating-point unit (FPU), the difference could be even more significant.
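For anyone wanting to isolate just the indexing cost, a minimal JMH sketch might look like the following (illustrative only; it is not one of the benchmarks cited above, which also pay for range checks and bucket reallocation):

```java
import java.util.Random;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class IndexingBenchmark {
  double[] values;

  @Setup
  public void setup() {
    Random random = new Random(42);
    values = new double[1_000_000];
    for (int i = 0; i < values.length; i++) {
      values[i] = Math.exp(random.nextGaussian()); // positive, log-normal-ish inputs
    }
  }

  @Benchmark
  public long logarithmIndexing() { // positive scale (here: 10) forces the log path
    double scaleFactor = Math.scalb(1.0 / Math.log(2.0), 10); // 2^10 / ln(2)
    long acc = 0;
    for (double v : values) {
      acc += (long) Math.ceil(Math.log(v) * scaleFactor) - 1;
    }
    return acc; // returned to defeat dead-code elimination
  }

  @Benchmark
  public long bitShiftIndexing() { // scale -2; exact-power-of-two correction omitted
    long acc = 0;
    for (double v : values) {
      acc += Math.getExponent(v) >> 2;
    }
    return acc;
  }
}
```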
I propose adding a configurable MaxScale parameter to the exponential histogram aggregation. Implementations that care to optimize can use the log strategy for scales greater than 10, the lookup strategy for scales 10 >= x > 0, and the bit-shifting strategy for scales less than or equal to zero. Users that care about performance can choose a max scale less than or equal to zero, as in the sketch below. @open-telemetry/specs-metrics-approvers WDYT?
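As a sketch of what the user-facing knob could look like (names are illustrative and modeled on the Java SDK's view API; nothing here is a settled design):

```java
import io.opentelemetry.sdk.metrics.Aggregation;
import io.opentelemetry.sdk.metrics.InstrumentSelector;
import io.opentelemetry.sdk.metrics.InstrumentType;
import io.opentelemetry.sdk.metrics.SdkMeterProvider;
import io.opentelemetry.sdk.metrics.View;

// Register a view that caps the exponential histogram's scale at 0, trading
// resolution for the cheap bit-shifting index computation.
SdkMeterProvider meterProvider =
    SdkMeterProvider.builder()
        .registerView(
            InstrumentSelector.builder().setType(InstrumentType.HISTOGRAM).build(),
            View.builder()
                .setAggregation(
                    Aggregation.base2ExponentialBucketHistogram(
                        /* maxBuckets= */ 160, /* maxScale= */ 0))
                .build())
        .build();
```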
As a practical example, consider a case where OTLP histograms are being translated into OpenHistogram (Circllhist). Since that histogram has fixed resolution, some values of the OTLP scale parameter yield senselessly high resolution; in that case a user would prefer to limit memory growth, staying below MaxSize rather than filling space with higher-than-necessary precision.
The exponential histogram aggregation defines a max scale without giving concrete advice on what "reasonable" minimum and maximum scales might be.
The max scale is particularly important because, as I've recently found, computing the bucket index with the Logarithm Function is ~9x slower than the bit manipulation technique. The logarithm function is required for positive scales, while the bit manipulation version can be used at scales less than or equal to zero (see the sketch below).
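As a sketch of the two techniques, following the shape of the reference mapping functions in the spec's data model (subnormal inputs are glossed over for brevity):

```java
static int mapToIndexNonPositiveScale(double value, int scale) { // scale <= 0
  long significand = Double.doubleToRawLongBits(value) & 0x000FFFFFFFFFFFFFL;
  int exponent = Math.getExponent(value);
  // Buckets are upper-inclusive, so an exact power of two (all significand
  // bits zero) belongs to the bucket below.
  if (significand == 0) {
    exponent--;
  }
  return exponent >> -scale; // arithmetic shift = floor division by 2^(-scale)
}

static int mapToIndexPositiveScale(double value, int scale) { // scale > 0
  double scaleFactor = Math.scalb(1.0 / Math.log(2.0), scale); // 2^scale / ln(2)
  return (int) Math.ceil(Math.log(value) * scaleFactor) - 1;
}
```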
I suggest we:

- define a configurable max scale for the exponential histogram aggregation
- give concrete guidance on what reasonable minimum and maximum scales are
cc @jmacd @jsuereth