Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Prometheus Compliance Tests] TestRemoteWrite/otelcollector/Counter failed on main #35119

Closed
songy23 opened this issue Sep 10, 2024 · 13 comments · Fixed by prometheus/compliance#106
Assignees
Labels
ci-cd CI, CD, testing, build issues comp:prometheus Prometheus related issues priority:p1 High release:blocker The issue must be resolved before cutting the next release

Comments

@songy23
Copy link
Member

songy23 commented Sep 10, 2024

Component(s)

No response

Describe the issue you're reporting

E.g. https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10791551495/job/29929199424?pr=35112

=== NAME  TestRemoteWrite/otelcollector/Counter
    counter.go:35: 
        	Error Trace:	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/compliance/remotewrite/sender/cases/counter.go:35
        	            				/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/compliance/remotewrite/sender/main_test.go:108
        	            				/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/compliance/remotewrite/sender/main_test.go:72
        	Error:      	Should be true
        	Test:       	TestRemoteWrite/otelcollector/Counter
        	Messages:   	found zero samples for {__name__="counter"}
@songy23 songy23 added ci-cd CI, CD, testing, build issues comp:prometheus Prometheus related issues needs triage New item requiring triage labels Sep 10, 2024
@VihasMakwana
Copy link
Contributor

VihasMakwana commented Sep 10, 2024

Not able to reproduce this on my darwin arm64...

Same for linux amd64

With some effort, I'm able to reproduce this. Will try to track down and fix it.

@atoulme
Copy link
Contributor

atoulme commented Sep 12, 2024

This is breaking on main and all PRs.

@TylerHelmuth
Copy link
Member

@dashpole can you take a look at this?

@dashpole
Copy link
Contributor

I'll try to take a look. cc @jmichalek132 @ArthurSens since this is related to the PRW exporter.

@dmitryax
Copy link
Member

dmitryax commented Sep 19, 2024

Marking as a release blocker so we don't release with broken CI

@dashpole
Copy link
Contributor

Tested against older versions of the prometheus/compliance repo and confirmed it was not a recent regression there

@dashpole
Copy link
Contributor

In this run, the test actually failed, but the compliance test step passed: https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/10495822874/job/29075112350

@dashpole
Copy link
Contributor

So prior to #35071, the test was failing, but the step itself was not failing. The PR added set -o pipefail && \, which seems to have made the failure fail the test.

@dashpole
Copy link
Contributor

I took a wild guess and was correct! The compliance test needs to set add_metric_suffixes: false on the prometheusremotewrite exporter for tests to pass. I assume the counter was being renamed to counter_total...

@dashpole
Copy link
Contributor

prometheus/compliance#106 fixes tests for me locally

@ArthurSens
Copy link
Member

Merged, could we try again?

@djaglowski
Copy link
Member

I updated #35026 and it worked there! Thanks @dashpole for getting to the bottom of this.

@crobert-1
Copy link
Member

Thanks again @dashpole, really appreciate the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-cd CI, CD, testing, build issues comp:prometheus Prometheus related issues priority:p1 High release:blocker The issue must be resolved before cutting the next release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants