Add logic to translate metric descriptors and initial flow #247

jsuereth · 2021-12-20T13:31:36Z

If you enable any of the end-to-end integration tests, you'll see metric descriptors sent to the dummy service.
Adds "known domains" configuration, so in the event we add new metric domain types (or system metrics define them), the metric name mapping logic can be configured.
Adds "CreateDefaults" method to metric config structure. This allows us to write methods that use the configuration without repeated nil checks. Note: There's probably a better "go" way to do this, let me know.
Updates createTimeSeries to call CreateTimeSeries. We'll need to figure out CreateServiceTimeSeries later.
Adds metric name/type/display name mappings (for updated version).
Add simple label-mapping (no label-descriptions possible)
Add constants for Summary mapping.
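
For readers following along, here is a minimal sketch of how a "known domains" setting and a defaults method could interact. The `metricConfig` type, its fields, the default prefix, and the domain list are all illustrative assumptions, not the exporter's actual API.

```go
package main

import (
	"fmt"
	"strings"
)

// metricConfig is a hypothetical stand-in for the exporter's metric config.
type metricConfig struct {
	Prefix       string   // prepended to metrics outside the known domains
	KnownDomains []string // domains whose metric names pass through unchanged
}

// setDefaults fills unset fields so callers need no nil or empty checks.
// The default values here are placeholders chosen for illustration.
func (c *metricConfig) setDefaults() {
	if c.Prefix == "" {
		c.Prefix = "workload.googleapis.com"
	}
	if c.KnownDomains == nil {
		c.KnownDomains = []string{"googleapis.com", "kubernetes.io"}
	}
}

// metricNameToType keeps names whose domain (the text before the first "/")
// is a known domain or a subdomain of one; everything else gets the prefix.
func (c *metricConfig) metricNameToType(name string) string {
	if i := strings.Index(name, "/"); i > 0 {
		domain := name[:i]
		for _, known := range c.KnownDomains {
			if domain == known || strings.HasSuffix(domain, "."+known) {
				return name
			}
		}
	}
	return c.Prefix + "/" + name
}

func main() {
	cfg := &metricConfig{}
	cfg.setDefaults()
	fmt.Println(cfg.metricNameToType("agent.googleapis.com/cpu"))
	fmt.Println(cfg.metricNameToType("my.metric"))
}
```

Because the domain list lives in configuration, a new domain can be added without touching the mapping function.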

Not in this PR:

Add "legacy" flag for metric naming conventions that:
- uses external.googleapis.com/OpenCensus/
- sets display name to original metric name (or last part of the path).

exporter/collector/metricsexporter.go

aabmass · 2021-12-20T19:27:40Z

exporter/collector/metricsexporter.go

+// Updates config object to include all defaults for metric export.
+func (cfg *Config) SetMetricDefaults() {


You can add the defaults here

opentelemetry-operations-go/exporter/collector/factory.go

Line 52 in d2378ce

func createDefaultConfig() config.Exporter {

and then update the unit tests to use this method instead of &Config{}

So I wasn't sure how to go from config.Exporter back to Config. PTAL.

I think what you did is fine.

The other option is to leave the createDefaultConfig returning a config.Exporter and type assert metricMapper{cfg: createDefaultConfig().(*Config)} in the tests.
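
A rough sketch of that type-assertion option follows. The `Exporter` interface here is only a stand-in for `config.Exporter`, and the `MetricPrefix` field and `newTestMapper` helper are hypothetical names used for illustration.

```go
package main

import "fmt"

// Exporter stands in for the collector's config.Exporter interface;
// its real method set is not reproduced here.
type Exporter interface {
	Validate() error
}

// Config is a simplified exporter config. MetricPrefix is a hypothetical field.
type Config struct {
	MetricPrefix string
}

// Validate satisfies the stand-in Exporter interface.
func (c *Config) Validate() error { return nil }

// createDefaultConfig returns the interface type, as in the factory.
func createDefaultConfig() Exporter {
	return &Config{MetricPrefix: "example-prefix"}
}

type metricMapper struct {
	cfg *Config
}

// newTestMapper recovers the concrete *Config with a checked type assertion,
// so a wrong type yields an error instead of a panic.
func newTestMapper() (metricMapper, error) {
	exp := createDefaultConfig()
	cfg, ok := exp.(*Config)
	if !ok {
		return metricMapper{}, fmt.Errorf("unexpected config type %T", exp)
	}
	return metricMapper{cfg: cfg}, nil
}

func main() {
	m, err := newTestMapper()
	if err != nil {
		panic(err)
	}
	fmt.Println(m.cfg.MetricPrefix)
}
```

The unchecked form `createDefaultConfig().(*Config)` is shorter and is usually acceptable in tests, where a panic on a wrong type is a reasonable failure mode.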

exporter/collector/metricsexporter.go

exporter/collector/metricsexporter_test.go

exporter/collector/metricsexporter.go

- Update default config method - Simplify some of my lack-of-go expertise.

aabmass

nits but LGTM

aabmass · 2021-12-21T17:08:23Z

exporter/collector/metricsexporter.go

@@ -69,8 +91,12 @@ func newGoogleCloudMetricsExporter(
 		cfg:    cfg,
 		client: client,
 		mapper: metricMapper{cfg},
+		mds:    make(chan *metricpb.MetricDescriptor),


Do we want any buffer for this channel? As is, one slow CreateMetricDescriptor call will block me.mds <- md.

In my mind, CMD (CreateMetricDescriptor) is optimistic. I want to give it the minimal amount of resources, and I don't care if any specific call fails. As things stand, we basically have to include logic around it, but it's dubious whether we want it in the long run.

I'm happy having it on its own "thread" churning away slowly.

To rephrase Aaron's point, the current implementation isn't really better than having the CMD call serialized with the CreateTimeSeries call. Either way, a CreateMetricDescriptor call can block the CreateTimeSeries calls. If you want CMD to be optimistic, you could have a buffered channel, but drop CMD calls if the buffer fills up. Otherwise, you probably don't need to bother with the extra goroutine.

That makes sense, but without a buffer (or non-blocking write), L126 in pushMetrics() will block until the background goroutine reads from the channel.

Decided to go with a buffer of a few, and I do some pre-filtering of MDs before shoveling in the buffer. I think what we have now matches requirements.
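
To make the "buffered channel, drop when full" idea from this thread concrete, here is a minimal sketch. The `exporter` and `metricDescriptor` types and the `offer` method are simplified, hypothetical stand-ins, and the buffer size of 2 is arbitrary.

```go
package main

import "fmt"

// metricDescriptor is a simplified stand-in for *metricpb.MetricDescriptor.
type metricDescriptor struct {
	Type string
}

type exporter struct {
	mds chan *metricDescriptor
}

// offer enqueues md without blocking. If the buffer is full, the descriptor
// is dropped and false is returned, so the caller is never stalled.
func (e *exporter) offer(md *metricDescriptor) bool {
	select {
	case e.mds <- md:
		return true
	default:
		return false
	}
}

func main() {
	// No goroutine is reading here, so the third send finds the buffer full.
	e := &exporter{mds: make(chan *metricDescriptor, 2)}
	for _, t := range []string{"a", "b", "c"} {
		fmt.Println(t, e.offer(&metricDescriptor{Type: t}))
	}
}
```

With this shape, a slow descriptor call can only cause descriptors to be dropped; it cannot block the time series path.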

exporter/collector/metricsexporter.go

aabmass · 2021-12-21T17:45:59Z

@jsuereth can you update https://github.com/GoogleCloudPlatform/opentelemetry-operations-go/blob/col-exporter-rewrite/exporter/collector/breaking-changes.md if there are any breaking changes vs the old exporter?

jsuereth · 2021-12-21T18:02:06Z

exporter/collector/metricsexporter.go

 				timeSeries = append(timeSeries, me.mapper.metricToTimeSeries(monitoredResource, extraLabels, metric)...)
 			}
 		}
 	}

 	// TODO: self observability
+	// TODO: Figure out how to configure service time series calls.
+	if false {


@aabmass @dashpole I added this as a reminder that I don't think we have an open bug to support service timeseries calls in this rework. Would one of you mind confirming/opening appropriately? I think David has the most context on what we need here.

You can link to #225.

You can probably just copy https://github.com/census-ecosystem/opencensus-go-exporter-stackdriver/pull/294/files#diff-9f6cddeabc26d57e3837fe66df90a6288f56e8a0e898cb6004d098da521583aeR656 and use it here.

I assigned the bug to @jsuereth and pulled into the sprint

sounds good, I'll work on that next.

jsuereth · 2021-12-21T18:03:51Z

exporter/collector/metricsexporter.go

+	// prior to shutdown.
+	for md := range me.mds {
+		// Not yet sent, now we sent it.
+		// TODO - check to see if this is a service/system metric and doesn't send descriptors.


FYI - OpenCensus had logic to NOT send metric descriptors on certain domains when we suspect they are "system" (or service) metrics. I can add that in this CL or a follow on. It's related to the design of how we want to handle CreateServiceTimeSeries vs. CreateTimeSeries calls.

cc @dashpole

That makes sense. It would be nice if CreateServiceTimeSeries == don't create MD so we can simplify the implementation

I'd rather not implement that logic if we don't have to

dashpole · 2021-12-21T18:16:27Z

exporter/collector/metricsexporter.go

@@ -58,9 +76,14 @@ func newGoogleCloudMetricsExporter(
 ) (component.MetricsExporter, error) {
 	setVersionInUserAgent(cfg, set.BuildInfo.Version)

-	// TODO: map cfg options into metric service client configuration with
+	// map cfg options into metric service client configuration with


nit: this comment can probably be removed entirely.


aabmass · 2021-12-21T19:07:13Z

exporter/collector/metricsexporter.go

+		// Not yet sent, now we sent it.
+		// TODO - check to see if this is a service/system metric and doesn't send descriptors.
+		if !me.cfg.MetricConfig.SkipCreateMetricDescriptor && md != nil && mdCache[md.Type] == nil {
+			err := me.exportMetricDescriptor(context.Background(), md)


You probably want a timeout for this ctx, or is this handled automatically by the client library?

I believe we should be able to configure that on gRPC clients and it'll attach to an existing timeout if one exists, or create one. However, I'm rather weak at Go, as you know :)

I wasn't sure, so I looked around: https://pkg.go.dev/cloud.google.com/go#hdr-Timeouts_and_Cancellation TL;DR: the client lib will set a default timeout if the context doesn't have one already.

@dashpole probably knows better, is the default reasonable for this or should we hardcode something?

I think there is a timeout in the googlecloud exporter's config. I'd expect that to apply to the CMD call as well. Otherwise, I'd expect it to use the default timeout.

https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/ebea5bc82e5a146db11d4b496a7a62e2151fe89a/exporter/googlecloudexporter/googlecloud.go#L141

Discussed in a meeting: it would be better to send the context through the channel from the other goroutine. Fine to do in a follow-up PR.

* Skip all fixture tests (#239)
* Initial structure for new pdata metrics exporter (#238)
* [Metrics Rewrite] add outline with todos for fragmenting work (#240)
* [Metrics Rewrite] attribute to label mapping (#243)
* [Metrics Rewrite] attribute to label mapping
* [Metrics Rewrite] support for pdata Sum points (#242)
* [Metrics Rewrite] support for pdata Sum points
* update breaking-changes.md
* use concatenation instead of sprintf
* [Metrics Rewrite] support for pdata Gauge points (#244)
* Add logic to translate metric descriptors and initial flow (#247)
* Fixes from merge.
* Fix tests.
* Clean up test cases, re-disable integration tests.
* Add summary descriptors and label descriptors.
* Fix lint issues.
* Some fixes from review.
* Remove metric import.
* Fixes from review.
  - Update default config method
  - Simplify some of my lack-of-go expertise.
* Add unit test for metric domains.
* Fixes from review.
* Add breaking changes.
* Fixes from review.
* Update context to be TODO.
* Add support for exponential histograms and exemplars. (#251)
* Add support for exponential histograms and exemplars.
* Fixes from review.
* Fixes from review.
* Fixes from discussion.
* [Metrics Rewrite] implement monitored resource mapping (#252)
* [Metrics Rewrite] implement monitored resource mapping
* review fixes
* [Metrics Rewrite] update breaking-changes.md for monitored resource (#255)
* Add summary mapping to exporter. (#249)
* Add config to call `CreateServiceTimeSeries` (#259)
* Initial implementation of create service time series.
* Add a test case for create service timeseries.
* Add logic to auto-detect project id if not configured.
* Fix from code review
* Fix resource to be one that has retention policy for integration tests.
* Add support for histogram to metrics exporter. (#258) BUG=210164184
* Re-enable ops-agent self-metric integration test. (#260)
* [Metrics Rewrite] add ExponentialHistogram fixture (#257)
* [Metrics Rewrite] add ExponentialHistogram fixture
* make tests deterministic
* few last changes
* close channel instead of sending a message
* Enable ops agent host metric integration test. (#264)
  - There is a bug in upstream agent-metric-processor that sets incorrect units on usage metrics (GoogleCloudPlatform/opentelemetry-operations-collector#72)
  - We update the expectations for inclusion of units in CreateTimeSeries
  - We disable metric descriptors (for now). Given the bug in agent-metric-processor, ops-agent will likely need an upstream fix for this first.
* add a feature gate, which defaults to false, for using the re-written exporter (#267)
* Enable Basic integration tests (#266)
* Enable basic counter test.
* Enable delta counter metrics.
  - Note: Delta counters are NOW fake-delta (i.e. cumulatives with limited time windows)
* Enable non-monotonic-sum integration test.
* Re-enable summary integration test and fix design issues in summary translation.
  - Summary exports percentiles, not quantiles
  - Percentiles should include similar double precision in the string.
* Fix recordfixtures script to use featuregate (#270)
* Skip already seen attribute keys when creating LabelDescriptors (#272)
* Reenable GKE metrics agent fixtures (#271)
* Update breaking-changes.md for googlecloudmonitoring/point_count self observability (#277)
* Move logging to use zap-logger and set up self-observability to match collector expectations. (#275)
* Enable metric prefix integration tests. (#274)
* enable workloadapis prefix integration test.
* update unknown domain metrics expect.
* Add instrumentationLibraryToLabels method to metrics exporter. (#253)
* Add instrumentationLibraryToLabels method to metrics exporter. BUG=https://b.corp.google.com/issues/210164355
* Remove custom_metrics_domains behaviour from metrics-exporter.
* Remove dependency on go.opentelemetry.io/collector (#279)
* remove dependency on go.opentelemetry.io/collector
* add ocgrpc metrics to exporters' self-obs metrics (#280)
* Use OC stackdriver exporter to capture self observability metrics as GCM protos (#282)
* Capture ocgrpc self observability metrics (#283)
* make integrationtest not internal (#285)
* Remove internal/ prefix for integrationtest (#288)
* Add batching support to metrics-exporter. (#286)
* Add batching support to metrics-exporter.
* Retry when we fail to write metric descriptors.
* Re-enable workload metrics integration tests (#278)
* update header year for new files (#296)
* Document new CreateMetricDescriptor behavior (#294)
* reenable disabled metrics test (#299)

Co-authored-by: Aaron Abbott <aaronabbott@google.com>
Co-authored-by: Josh Suereth <Joshua.Suereth@gmail.com>
Co-authored-by: Thomas Barker <tbarker25@gmail.com>
Co-authored-by: Punya Biswal <punya@google.com>

jsuereth added 5 commits December 18, 2021 09:12

Fixes from merge.

17e3529

Fix tests.

bfa7302

Clean up test cases, re-disable integration tests.

72e67a5

Add summary descriptors and label descriptors.

d7bc0ad

Fix lint issues.

0c1cfd2

jsuereth requested review from punya and aabmass December 20, 2021 15:36

jsuereth marked this pull request as ready for review December 20, 2021 15:36

aabmass self-assigned this Dec 20, 2021

jsuereth mentioned this pull request Dec 20, 2021

Wip suereth/summary timeseries #249

Merged

aabmass reviewed Dec 20, 2021

jsuereth added 4 commits December 20, 2021 19:57

Some fixes from review.

968adf3

Remove metric import.

455ec16

Fixes from review.

f3229ee

- Update default config method - Simplify some of my lack-of-go expertise.

Add unit test for metric domains.

a652528

aabmass approved these changes Dec 21, 2021

Fixes from review.

5e20b7d

jsuereth commented Dec 21, 2021

dashpole reviewed Dec 21, 2021

Add breaking changes.

3055b17

aabmass reviewed Dec 21, 2021

jsuereth added 2 commits December 21, 2021 14:40

Fixes from review.

d9c14ee

Update context to be TODO.

e4da6ee

jsuereth merged commit 20a5d3e into col-exporter-rewrite Dec 22, 2021

jsuereth deleted the wip-suereth/metric-descriptors branch December 22, 2021 15:51

aabmass mentioned this pull request Jan 19, 2022

Add ocgrpc self-obs metrics in test fixtures #281

Closed

dashpole mentioned this pull request Feb 1, 2022

Remove Known Domains #300

Closed

dashpole mentioned this pull request Apr 15, 2022

Send CreateMetricDescriptorRequest asynchronously #26

Closed
