Collector 0.42.0 - googlecloudexporter - One or more points were written more frequently than the maximum sampling period configured for the metric #18039
Pinging code owners: See Adding Labels via Comments if you do not have permissions to add labels yourself.
If you can update to a more recent version of the collector, that would be helpful. v0.70.0 is the latest version, and you are on v0.42.0.

Otherwise, the error means you are writing a metric with the same name and set of labels more than once every 5 seconds: https://cloud.google.com/monitoring/quotas#custom_metrics_quotas. This often occurs because you are missing labels required to distinguish points. For example, if I am collecting the same metric from 2 kubernetes pods, but don't distinguish them via a label (e.g. the pod name), they would both be written with the same set of labels, and would "collide".

I notice you are writing to the "global" monitored resource. Unless you are collecting metrics from a single entity, that usually indicates that you aren't differentiating between the sources of metrics using the monitored resource. Depending on where you are running, a resource detector may be helpful for getting information about the environment you are running in.
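For illustration, a minimal sketch of such a pipeline (the otlp receiver, project ID, and layout are assumptions for this sketch, not taken from this issue):

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  # Queries the GCP metadata server and attaches resource attributes
  # (project, zone, instance ID, ...) so that metrics from different hosts
  # no longer map to the bare "global" monitored resource.
  resourcedetection:
    detectors: [gcp]  # older contrib releases split this into gce/gke detectors

exporters:
  googlecloud:
    project: my-project-id  # illustrative project ID

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [resourcedetection]
      exporters: [googlecloud]
```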
The quick response is greatly appreciated, @dashpole; thank you for the insight into what is going on here 🙏

So I implemented a resourcedetection processor into my configuration, and the result changed slightly, where I was not seeing the
Config Used to Implement the
After reading about the

I then took to playing around with the metricstransform processor to see if I could have a unique label added to each ingested metric, but I have not figured out a way to generate a randomized value for a new label I want added to each ingested metric. After reading the documentation and thinking about it some, my takeaway is that this is just not possible, at least within this version of the processor. Is that correct?

If it is correct that there is no way to generate a random value for a label added to each ingested metric through the collector, would you know if there is a way within the instrumentation to generate a unique identifier of some sort for each metric? I've dug through the source code trying to uncover this, but I am not well versed in Golang, so at this point I have not found that ability myself; a confirmation as to whether that is possible would be appreciated, if known. I also attempted to see if there was a way to configure the Sample Period, but have yet to find that ability as well.

If the issue we're encountering is known to be something that a later version of the OpenTelemetry Collector might handle better, it would be great if you wouldn't mind providing links to source code or release notes that show this, so I have the data I need to get the internal team responsible for this application to upgrade the instrumentation libraries, which in turn will give us the ability to use later versions of the OpenTelemetry Collector.

Config Used to Implement the
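For reference, since the configs above were not captured in this thread: the metricstransform processor's add_label operation only takes a static new_value, which is why a randomized per-metric value isn't achievable there. A sketch with illustrative names:

```yaml
processors:
  metricstransform:
    transforms:
      - include: .*          # apply to all ingested metrics
        match_type: regexp
        action: update
        operations:
          - action: add_label
            new_label: collector.source  # illustrative label name
            new_value: tfc-agent-1       # static only; no random/generated values
```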
Yes, I don't think that is possible. Would you mind sharing which GCP platform you are running on? You just have to make sure:

1. Metrics coming from different sources are distinguishable from one another (e.g. via resource attributes).
2. A single source isn't writing the same metric (same name and set of labels) more than once every 5 seconds.
I'm assuming 2 is already solved, as that would be quite odd... To solve 1, one option is to add resource attributes to whatever is sending metrics to the collector. For example, you could use OTEL_RESOURCE_ATTRIBUTES on each of the applications sending to the collector to set an attribute that makes each one unique. However, the googlecloud exporter doesn't attach all resource attributes by default, so you would also need to use metric.resource_filters in our config to add the resource attributes to each metric as a label. Note that that config option requires a newer version of our exporter.
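A minimal sketch of that combination (the attribute name, prefix, and project ID are illustrative assumptions, not from this thread):

```yaml
# Each application sending to the collector would set something like:
#   OTEL_RESOURCE_ATTRIBUTES=service.instance.id=agent-1   (hypothetical value)
exporters:
  googlecloud:
    project: my-project-id  # illustrative
    metric:
      resource_filters:
        # Promote matching resource attributes to metric labels so that
        # points from different senders no longer collide.
        - prefix: service.instance
```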
Would you mind sharing which GCP platform you are running on?

I'm running a docker container of the collector within a VM instance I installed docker on, through the GCE service, within Google Cloud Platform. Let me know if that didn't answer your question.

I wasn't the person who put together the instrumentation for our company's application (Terraform Cloud Agent), but my hunch is that we actually do not have #2 solved: the attributes of the metrics are not unique enough, and all the metrics are coming from a single source (a single Terraform Cloud Agent, run as a docker container within the same VM instance that the collector's docker container runs in).

Would you happen to know of a way, at the instrumentation level, that a unique attribute could be created for every metric, or possibly a way to modify the sampling period the error talks about to something greater than the 5 seconds it desires? If not, that's okay; I figure these questions are probably better asked within the instrumentation repo, and at the very least you helped me deduce what level this problem lies in.

I'm not positive whether this is at all related, but I noticed while looking around the web that this google doc mentioned a
For otel instrumentation, you should be able to add any attribute (including one with a UUID as a value). But that would be done by hand. You should also be able to modify the interval of the periodic reader (see https://opentelemetry.io/docs/reference/specification/metrics/sdk/#periodic-exporting-metricreader). The add_unique_identifier option could be used if it is written in python, and is recommended if the duplicate metrics are coming from different threads, but it isn't available in other languages. Another possibility is that the duplicate metrics have different instrumentation scopes. In newer versions of the exporter, we add instrumentation_library and instrumentation_version labels to differentiate these (see instrumentation_library_labels in the README).
Hey @dashpole, as well as any other lurkers of this issue: our team successfully upgraded the otel instrumentation libraries in use by our application to

UPDATE: neither of these two settings changed the result; the error still occurred
Through some more testing, we were able to determine that upgrading the instrumentation libraries from

Essentially, changing the original:
To this:
which determines the configuration of:
The insight provided by you, @dashpole, was absolutely appreciated, so thank you again for the assistance in helping us understand this issue.
I am experiencing something very similar:
My problem here is that I have 2 Cloud Run instances using otel-collector's gcp resourcedetector, but the Resource label of

I have verified in metrics explorer that I have a correct metric label named

I read that metric.resource_filters can be used to promote a resource label to a metric label. I would need to either specify the metric label as a resource label, or have the gcp resourcedetector set the resource label correctly. I have:
@AkselAllas you can delete the
@dashpole I can confirm. Adding the following processor to the end of my processors fixed my case 🙇 ❤️
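The snippet itself did not survive in this thread; a plausible sketch, assuming the fix was deleting a colliding resource attribute with the resource processor (the attribute key below is a placeholder, not the confirmed original):

```yaml
processors:
  resource:
    attributes:
      - key: example.colliding.attribute  # placeholder key; actual key not preserved here
        action: delete                    # drop the attribute before export

service:
  pipelines:
    metrics:
      receivers: [otlp]
      # per the comment above, placed at the end of the processors list
      processors: [resourcedetection, resource]
      exporters: [googlecloud]
```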
Maybe it makes sense to write something about this in the GCP exporter README examples?
Component(s)
exporter/googlecloud
What happened?
Description
Errors are seen when the collector is configured to export metrics through the googlecloudexporter. I'm not sure whether the errors can be resolved through configuration of the collector itself, or whether the issue lies with the instrumentation of the telemetry within the TFC Agent (the application emitting the telemetry). Insight into how to resolve this is greatly appreciated.

Why is version v0.42.0 of the collector being used? This version of the collector has been deemed internally, by the authors of the TFC Agent, to be most compatible with the instrumentation libraries currently used by the TFC Agent; the version of the otel libraries used is 0.19.0.
Steps to Reproduce

1. Create a collector-gcp.yml, using the one I provided, within your working directory.
2. Run version 0.42.0 of the opentelemetry-collector-contrib, using the command below, within your working directory:

Expected Result
Metrics ingested by the collector are successfully exported to the Google Cloud Monitoring service
Actual Result
Errors seen for multiple metrics ingested by the collector from a TFC agent:
Collector version
v0.42.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
OpenTelemetry Collector configuration
Log output
Additional context
No response