Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add capability to have configurable aggregation temporality for OTLP Registry #3625

Merged
merged 20 commits into from
Apr 10, 2023

Conversation

lenin-jaganathan
Copy link
Contributor

@lenin-jaganathan lenin-jaganathan commented Feb 9, 2023

OTLP format supports delta and cumulative temporality. But currently in micrometer, the support for OTLP Metrics with Delta Temporality is not available. This PR adds the capability to export meters with delta temporality(#3145). Below are some behaviors this PR includes

  • capability to configure aggregation Temporality via property otlp.aggregationTemporality
  • Delta Temporality is supported by using StepMeters wherever applicable
  • Capability to export max as part of the HistogramDataPoint for Delta Temporality

Implementation Information:
When Delta Temporality is enabled the following applies,

  • start time(StartTimeUnixNano) for the meter will be set to the start of the previous step.
  • end time(TimeUnixNano ) for the meter will be set to the end of the previous step or the step for which data is being reported.

(the previous step - indicates the last fully completed step.)

TODO:

  • Add tests for the new features added
  • Make the support for Delta Histogram using a StepHistogram(basically the count, and total should match with the value recorded in Histogram) as mentioned in the OTLP protocol
  • Possibly rename some of the classes and minimize code duplications.

fixes #3145

…registry.

OTLP format supports delta and cumulative temporality. But currently in micrometer, the support for OTLP Metrics with Delta Temporality is not available. This commit adds the capability to export meters with delta temporality. Below are some behaviours this PR includes
* capability to configure aggregation Temporality via property `otlp.aggregationTemporality`
* Delta Temporality is supported by using StepMeters wherever applicable
* Capability to export max as part of the HistogramDataPoint for Delta Temporality
@sonatype-lift
Copy link
Contributor

sonatype-lift bot commented Feb 9, 2023

🛠 Lift Auto-fix

Some of the Lift findings in this PR can be automatically fixed. You can download and apply these changes in your local project directory of your branch to review the suggestions before committing.1

# Download the patch
curl https://lift.sonatype.com/api/patch/github.com/micrometer-metrics/micrometer/3625.diff -o lift-autofixes.diff

# Apply the patch with git
git apply lift-autofixes.diff

# Review the changes
git diff

Want it all in a single command? Open a terminal in your project's directory and copy and paste the following command:

curl https://lift.sonatype.com/api/patch/github.com/micrometer-metrics/micrometer/3625.diff | git apply

Once you're satisfied, commit and push your changes in your project.

Footnotes

  1. You can preview the patch by opening the patch URL in the browser.

import io.micrometer.core.instrument.distribution.*;
import io.micrometer.core.instrument.step.StepDistributionSummary;

class OtlpStepDistributionSummary extends StepDistributionSummary {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure that this is a good idea but this class seem the same as OtlpCumulativeDistributionSummary except:

  1. TimeWindowFixedBoundaryHistogram/DistributionStatisticConfig/etc. are created differently
  2. Cumulative has getStartTimeNanos

So I was thinking if we should have the common logic in a common place, e.g.:

  1. Having only one OtlpDistributionSummary that can handle both (getStartTimeNanos will be there in the step case too).
  2. Having a common parent/super class that cumulative/step can extend and provide the TimeWindowFixedBoundaryHistogram/DistributionStatisticConfig/etc. for example in ctor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think OtlpStepTimer has the same situation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right see one of my TODOs in the PR description. This was one of my POC codes and it needs some re-designing and better code re-organization. Will start on that part and welcome any suggestions there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to ping us if you need some help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I might need some especially in restructuring some of these classes. Already, OtlpMeterRegisrty does what I believe was not How registries were meant to work doing 2 things - Step and Cumulative. And the same goes with the Timers and DistributionSummaries because there are different things inherently - Step and Cumulative. So, any help on this would be helpful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to throw in the idea of get-and-reset semantics for Histograms. Basically, you record the data, snapshot it, and then reset the histogram on every export. This circumvents the whole timer and rotation code, which I am not sure is required for a delta-based export. Just leaving this here as an idea, let me know if there is something I am missing

@lenin-jaganathan lenin-jaganathan force-pushed the otlp_stepregistry branch 2 times, most recently from cbe7b1b to daabc04 Compare February 18, 2023 17:53
- Well, I would prefer having something like a StepWindowHistogram for Step based registries or would improve the existing implementation to have improved accuracy. But adding tests for the existing histogram implementation.
}

@Test
void timerWithHistogram() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shakuzen / @jonatan-ivanov I believe you guys are already aware that the TImeWindowFixedBoundaryHistogram will have very poor accuracy for StepBased registries.

To Summarise:
The issue with that is it uses a sliding bucket which creates 3 buckets(assumption, step=1min, the default buffer length is 3). And these buckets are rotated every 20 seconds (step/buffer length) which makes the histogram always error-prone I,e it will always undercount the values. This is the reason we used DeltaHistogramCounts implementation in SignalFxMeterRegistry for our histogram use-cases. With OTLP coming into the picture, I think we should at least have a similar implementation (or basic StepBased Histogram) which will give more accurate information. And moreover, the percentiles don't align with Step Boundaries which also makes it difficult to correlate them.

Copy link
Member

@shakuzen shakuzen Apr 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to change this. SignalFx was the first step-based registry to support percentile histograms, so we hadn't taken close enough look at this in general, I think. The histogram support is indeed not right, which we can fix after merging this, but I don't usually like merging something I know has a bug.

The percentiles and max not aligning with step boundaries was intentional (there's a relevant old issue with the history #1993), but I don't think it makes sense in the OTLP context because the specification is clear about what the meaning of the max or bucket counts is supposed to represent, and it needs to align with the time interval.

Copy link
Contributor

@pirgeo pirgeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all: Thanks for taking this on! I think this is a great initiative. I have experience both in OTel and a little bit in Micrometer exporters, so feel free to ping me if you are stuck somewhere. I have left a few comments. In general, I agree with you that there might be an opportunity to improve the histogram reporting for step-based histograms. I will give this another go next week, and try it out too.

@@ -79,13 +79,9 @@ public HistogramSnapshot takeSnapshot() {
return snapshot;
}

CountAtBucket[] histogramCounts = this.monotonicBucketCountHistogram.takeSnapshot(0, 0, 0).histogramCounts();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily related to this PR, and might be addressed in a follow-up (also untested, I just happen to have fixed a similar bug recently): the fact that super.takeSnapshot() and monotonicBucketCountHistogram.takeSnapshot() are not synchronized might lead to discrepancies between the count and the bucket counts if some other thread records data in the time between the two method calls.

import io.micrometer.core.instrument.distribution.*;
import io.micrometer.core.instrument.step.StepDistributionSummary;

class OtlpStepDistributionSummary extends StepDistributionSummary {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to throw in the idea of get-and-reset semantics for Histograms. Basically, you record the data, snapshot it, and then reset the histogram on every export. This circumvents the whole timer and rotation code, which I am not sure is required for a delta-based export. Just leaving this here as an idea, let me know if there is something I am missing

@lenin-jaganathan
Copy link
Contributor Author

lenin-jaganathan commented Feb 24, 2023

@pirgeo Thanks for the review sure will reach out for any help. I believe one thing on top of our minds was what to do with AGGREGATION_TEMPORALITY_UNSPECIFIED.
Apart from that on the idea of a Step-based histogram, I do have an open comment on this PR related to that (see this comment). If you are interested, I am thinking something like this for a StepBased Histogram implementation. Basically, a meter that implements both Histogram and StepValue so that it exhibits the behavior of getting and resetting for the step.

@jonatan-ivanov
Copy link
Member

Small update: as I was looking into how to do this without code duplication, I thought having some integration tests could be timely, so: #3668

@bogdandrutu
Copy link
Contributor

@jonatan-ivanov what is the missing part to move forward with this PR?

@lenin-jaganathan
Copy link
Contributor Author

@jonatan-ivanov / @shakuzen Is there any update on this?

# Conflicts:
#	implementations/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpCumulativeDistributionSummary.java
#	implementations/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpCumulativeTimer.java
#	implementations/micrometer-registry-otlp/src/main/java/io/micrometer/registry/otlp/OtlpMeterRegistry.java
#	implementations/micrometer-registry-otlp/src/test/java/io/micrometer/registry/otlp/OtlpMeterRegistryTest.java
@jonatan-ivanov
Copy link
Member

jonatan-ivanov commented Mar 21, 2023

I resolved the conflicts (merged main to this and kept the history) and applied formatting changed. I'm planning to look more into this this week.

@jonatan-ivanov
Copy link
Member

jonatan-ivanov commented Mar 31, 2023

I did some further refactoring and added integration tests, most of the things look good to me, but I have two concerns:

  1. We are changing the publishing timestamp for cumulative from
.setTimeUnixNano(TimeUnit.MILLISECONDS.toNanos(this.clock.wallTime()))

to

.setTimeUnixNano(publishTimeNanos)

In the first case, meters published together can have different timestamps, in the second they can't. E.g.: let's say we have two gages and pulling them can take 5 seconds, in the first case their timestamps will be ~5 seconds apart, in the second case they will be off. I think the original approach is closer to the truth, what do you think?

  1. @lenin-jaganathan Did you try this with a real backend? I wrote an integration test with the OTel collector and it seems only Counter and Gauge works, Timer and DistributionSummary does not; what am I missing?
We are sending the data to the collector (click to reveal an example request)
resource_metrics {
  resource {...}
  scope_metrics {
    metrics {
      name: "test.gauge"
      gauge {
        data_points {
          time_unix_nano: 1680302436000000000
          as_double: 12.0
        }
      }
    }
    metrics {
      name: "test.timer"
      unit: "milliseconds"
      histogram {
        data_points {
          start_time_unix_nano: 1680302434000000000
          time_unix_nano: 1680302436000000000
          count: 1
          sum: 123.0
          max: 123.0
        }
        aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
      }
    }
    metrics {
      name: "test.distributionsummary"
      histogram {
        data_points {
          start_time_unix_nano: 1680302434000000000
          time_unix_nano: 1680302436000000000
          count: 1
          sum: 24.0
          max: 24.0
        }
        aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
      }
    }
    metrics {
      name: "test.counter"
      sum {
        data_points {
          start_time_unix_nano: 1680302434000000000
          time_unix_nano: 1680302436000000000
          as_double: 42.0
        }
        aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
        is_monotonic: true
      }
    }
  }
}

But the Prometheus response does not have them:

# HELP test_counter 
# TYPE test_counter unknown
test_counter{job="test",service_name="test",telemetry_sdk_language="java",telemetry_sdk_name="io.micrometer"} 42.0
# HELP test_gauge 
# TYPE test_gauge gauge
test_gauge{job="test",service_name="test",telemetry_sdk_language="java",telemetry_sdk_name="io.micrometer"} 12.0
# EOF

Additionally, these tests are timing out on CircleCI and I cannot re-trigger them since for some reason they are in @lenin-jaganathan's org. :o
EDIT: higher timeout fixed it but this should not take this long.

@lenin-jaganathan
Copy link
Contributor Author

  1. We are changing the publishing timestamp for cumulative. - I get your point, especially for the cumulative registry. I will have a go at it and get back.

  2. I did try my initial changes with a back-end that supports Delta Aggregation. I will have a look at it again, but I don't expect Prometheus to support the Delta Aggregation(https://opentelemetry.io/docs/reference/specification/metrics/sdk_exporters/prometheus/). If what I assume is right, it should only process cumulative aggregation.

@jonatan-ivanov
Copy link
Member

Yeah, Prometheus not supporting delta is ok (it can't really), but I would have assumed that the OTel collector can accept OTLP with delta and export it in the Prometheus format (as cumulative). :(
Because of this, could you please also revert my last two commits?

@lenin-jaganathan
Copy link
Contributor Author

lenin-jaganathan commented Apr 1, 2023

@jonatan-ivanov I did verify the Histogram datapoint after your comment and I don't see any issues with them. I see the data points in our back end and it just works fine.

Do you want me to revert your last 2 commits?

Edit: I am not sure what Prometheus does exactly, the spec say the Dleta Histograms should be DROPPED or CONVERTED for Prometheus. (https://opentelemetry.io/docs/reference/specification/compatibility/prometheus_and_openmetrics/#histograms-1)

@jonatan-ivanov
Copy link
Member

👍🏼
Yes please if you are restoring the timestamp for cumulative too.
I believe with those the PR should be ready for merging.

…Metrics will use the StepWindow

As Per (Otlp Spec)[https://opentelemetry.io/docs/reference/specification/metrics/data-model/#metric-points], the following seems to hold true
-  When the aggregation temporality is “delta”, we expect to have no overlap in time windows for metric streams - So, we will be using the step window for start and endtime
- For cumulative, we will be using the start time of the meter and the time when the meter is read.
This reverts commit 8f51093.

Revert "Set higher timeout for OTLP integration tests"

This reverts commit 7d1cf36.
Copy link
Contributor

@pirgeo pirgeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it another pass and tried running it with a collector that simply logs what it receives. Looks like it all works to me. I haven't done extensive tests with a backend yet though.

@jonatan-ivanov
Copy link
Member

@lenin-jaganathan Thank you for the changes!

@shakuzen I think this is in a good enough shape to merge it, what do you think?

timer.record(111, TimeUnit.MILLISECONDS);

// This is where TimeWindowMax can be painful and make no sense at all. Need to
// re-visit this. Probably StepMax is what might fit good for OTLP.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a different Max implementation as you say. TimeWindowMax doesn't fit the interval as it should for OTLP.

Copy link
Member

@shakuzen shakuzen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some remaining known issues, but if everyone thinks it would still be beneficial to get this in the RC1 (scheduled for Monday) for wider/easier testing while we fix the remaining issues before GA, I'm okay with that.

@lenin-jaganathan has already pointed out the issues in comments throughout the pull request:

  • partial publish on close fix is not applied to this registry with delta temporality (maybe we can do something in the TCK to catch this proactively in registry implementations)
  • time window max is not right for OTLP delta
  • time window histogram is not right for OTLP delta

Depending on what we do about #2818 that may also cause issues here.

@lenin-jaganathan
Copy link
Contributor Author

lenin-jaganathan commented Apr 7, 2023

@shakuzen I do have a couple of POCs for StepBased Histogram and Max. But I just don't want this to look like a huge PR with too many new things. I would have a PR for StepHistogram and StepMax immediately after this PR. But If you think differently, I can just pile more commit on this. No preferences there, just a matter of getting Delta Registry support in to start with.

@shakuzen
Copy link
Member

Sounds good. Please do follow-up with PRs for the other things. On the partial step issue, is there any reason we can't make OtlpMeterRegistry extend StepMeterRegistry to solve that?

@shakuzen shakuzen merged commit b828fe3 into micrometer-metrics:main Apr 10, 2023
@lenin-jaganathan
Copy link
Contributor Author

@shakuzen Follow-up PR for to address Max and histogram - #3749

Let me just get back on partial publishing. I see an issue with partial publishing in some other Step Registries too (see SignalFxTimer which was made an AbstractTimer just to have a custom Histogram implementation and now since it is no more Step the issue still exists.) And the same may be the case for OTLP as my new PR makes the OTLP Timer extend AbstarctTimer.

@shakuzen
Copy link
Member

Yeah, it's on my list of things to do post-RC1 to more thoroughly check specific-registry implementations for this kind of thing. But it's expected if you make something not extend a Step meter, you're indicating you want to be responsible for any of the behavior that the step meters provide, such as the behavior on registry close. We can't really do anything about that. It depends on each registry implementation what's best for it, but I have to imagine it would be easier to extend and customize StepTimer than not in most cases.

@lenin-jaganathan
Copy link
Contributor Author

lenin-jaganathan commented Apr 10, 2023

I completely agree with the flexibility offered here. But since StepMeter is the interface that defines that we can perform a report with a different value on close, and this being package-private makes it someone cannot implement this behavior. I would think about something that lets implementers give an implementation for this last-minute publish but still let the StepRegistry close that.

is there any reason we can't make OtlpMeterRegistry extend StepMeterRegistry to solve that

Technically, we do that since we almost override everything in the StepRegistry in OTLP. But, doing something like that will make me assume that OTLP is a StepRegistry which it is not as it can be either Step or Cumulative.

jonatan-ivanov added a commit to jonatan-ivanov/spring-boot that referenced this pull request Apr 12, 2023
In Micrometer 1.11.0-RC1, a new property was introduced in OtlpConfig
to define aggregation temporality.
See micrometer-metrics/micrometer#3625
jonatan-ivanov added a commit to jonatan-ivanov/spring-boot that referenced this pull request Apr 13, 2023
In Micrometer 1.11.0-RC1, a new property was introduced in OtlpConfig
to define aggregation temporality.
See micrometer-metrics/micrometer#3625
jonatan-ivanov added a commit to jonatan-ivanov/spring-boot that referenced this pull request Apr 13, 2023
In Micrometer 1.11.0-RC1, a new property was introduced in OtlpConfig
to define aggregation temporality.
See micrometer-metrics/micrometer#3625
@lenin-jaganathan lenin-jaganathan deleted the otlp_stepregistry branch April 29, 2023 06:41
izeye added a commit to izeye/micrometer that referenced this pull request May 27, 2023
@izeye izeye mentioned this pull request May 27, 2023
shakuzen pushed a commit that referenced this pull request May 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Configurable aggregation temporality for OTLP registry
5 participants