[exporter/prometheusremotewrite] - out of order sample #11438
Comments
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping |
I really think it does. Are you on EKS using Fargate for compute? Have you not set CPU requests or limits? We have recently seen this configuration result in dropped metrics because 0.25 vCPU is allocated by default. Are you on ECS running the collector as a sidecar and shipping it metrics that don't have sufficiently identifying resource attributes, without using the ECS resource detector plugin to ensure they're available? We've also seen that result in this error. It's also possible that you are sending OTLP metrics with sufficiently identifying resource attributes, but your PRW exporter configuration doesn't include
TL;DR: there are many ways to get "out of order sample" errors, and without more information about your deployment environment we probably can't help you.
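To illustrate the point about identifying attributes, here is a minimal sketch of how samples from two sources can look out of order once the label that distinguishes them is lost. It is a simplified model, not the actual Prometheus ingestion code: the metric name, the `instance` label, the timestamps, and the helper functions are all hypothetical, and it assumes the receiver rejects any sample older than the last one it accepted for the same label set.

```python
def identity(labels, keep):
    """Series identity as the receiver sees it: only label names in `keep` count."""
    return tuple(sorted((k, v) for k, v in labels.items() if k in keep))


def count_out_of_order(samples, keep):
    """Count samples older than the last accepted sample of the same identity."""
    last_ts = {}
    rejected = 0
    for labels, ts in samples:
        key = identity(labels, keep)
        if key in last_ts and ts < last_ts[key]:
            rejected += 1  # simplified stand-in for an "out of order sample" rejection
        else:
            last_ts[key] = ts
    return rejected


# Two hypothetical application instances report the same metric.
# Each instance's own timestamps increase, but arrivals are interleaved.
samples = [
    ({"__name__": "http_requests", "instance": "app-a"}, 100),
    ({"__name__": "http_requests", "instance": "app-b"}, 90),
    ({"__name__": "http_requests", "instance": "app-a"}, 110),
    ({"__name__": "http_requests", "instance": "app-b"}, 105),
]

# With the identifying label kept, each instance is its own series.
with_identity = count_out_of_order(samples, keep={"__name__", "instance"})

# With the identifying label dropped, both instances collapse into one series.
without_identity = count_out_of_order(samples, keep={"__name__"})

print(with_identity, without_identity)
```

In this model nothing is rejected while the sources stay distinguishable, and rejections appear only once both sources collapse into a single series. That matches the scenarios described above, where the identifying attributes never reach the remote write endpoint.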
I am using AWS Beanstalk for both the collector and the applications being traced. The collector configuration is pasted above. I am not aware of any explicit CPU limit. Are you suggesting that a vCPU that is too slow may cause this issue? Thank you for your support.
FWIW I'm seeing this as well on
@Aneurysm9 |
@amoscatelli I don't see any resource identifiers on the metrics in your error message. Can you try enabling
This issue has been closed as inactive because it has been stale for 120 days with no activity. |
Describe the bug
Under load, prometheusremotewrite (in the otel/opentelemetry-collector-contrib:0.53.0 Docker image) occasionally drops metrics from different source applications because of an "out of order sample" error.
This seems similar to:
open-telemetry/opentelemetry-collector#2315
Steps to reproduce
Run the otel collector Docker image.
Configure the otel collector to send metrics to a remote Prometheus using prometheusremotewrite.
Send metrics to the otel collector from applications.
What did you expect to see?
No errors and no dropped metrics.
What did you see instead?
Errors with dropped metrics.
What version did you use?
opentelemetry-collector-contrib:0.53.0
What config did you use?
Environment
I really think it doesn't matter.
Additional context
Some logs: