[receiver/hostmetrics] ARM github action runner fails to run tests #32136

Closed
atoulme opened this issue Apr 3, 2024 · 2 comments · Fixed by #32135


atoulme commented Apr 3, 2024

Component(s)

receiver/hostmetrics

Describe the issue you're reporting

See run here: https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/8531494580/job/23371252002?pr=32135

hostmetricsreceiver tests fail on ARM github actions runner with:

=== Failed
ERROR rerun aborted because previous run had a suspected panic and some test may not have run
=== FAIL: . TestGatherMetrics_EndToEnd (0.26s)
    hostmetrics_receiver_test.go:175: 
        	Error Trace:	/home/runner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/hostmetrics_receiver_test.go:175
        	            				/home/runner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/hostmetrics_receiver_test.go:148
        	            				/home/runner/actions-runner/_work/_tool/go/1.22.0/arm64/src/runtime/asm_arm64.s:1222
        	Error:      	Not equal: 
        	            	expected: 24
        	            	actual  : 23
        	Test:       	TestGatherMetrics_EndToEnd
    hostmetrics_receiver_test.go:177: 
        	Error Trace:	/home/runner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/hostmetrics_receiver_test.go:177
        	            				/home/runner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/hostmetrics_receiver_test.go:148
        	            				/home/runner/actions-runner/_work/_tool/go/1.22.0/arm64/src/runtime/asm_arm64.s:1222
        	Error:      	map[string]struct {}{"system.cpu.load_average.15m":struct {}{}, "system.cpu.load_average.1m":struct {}{}, "system.cpu.load_average.5m":struct {}{}, "system.cpu.time":struct {}{}, "system.disk.io":struct {}{}, "system.disk.io_time":struct {}{}, "system.disk.merged":struct {}{}, "system.disk.operation_time":struct {}{}, "system.disk.operations":struct {}{}, "system.disk.pending_operations":struct {}{}, "system.disk.weighted_io_time":struct {}{}, "system.filesystem.inodes.usage":struct {}{}, "system.filesystem.usage":struct {}{}, "system.memory.usage":struct {}{}, "system.network.connections":struct {}{}, "system.network.dropped":struct {}{}, "system.network.errors":struct {}{}, "system.network.io":struct {}{}, "system.network.packets":struct {}{}, "system.paging.faults":struct {}{}, "system.paging.operations":struct {}{}, "system.processes.count":struct {}{}, "system.processes.created":struct {}{}} does not contain "system.paging.usage"
        	Test:       	TestGatherMetrics_EndToEnd

=== FAIL: internal/scraper/pagingscraper TestScrape/Standard (0.00s)
    paging_scraper_test.go:86: 
        	Error Trace:	/home/runner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/pagingscraper/paging_scraper_test.go:86
        	Error:      	Not equal: 
        	            	expected: 4
        	            	actual  : 2
        	Test:       	TestScrape/Standard

=== FAIL: internal/scraper/pagingscraper TestScrape (0.00s)
panic: runtime error: index out of range [2] with length 2 [recovered]
	panic: runtime error: index out of range [2] with length 2

goroutine 7 [running]:
testing.tRunner.func1.2({0xad00a0, 0xc000038930})
	/home/runner/actions-runner/_work/_tool/go/1.22.0/arm64/src/testing/testing.go:1631 +0x2c8
testing.tRunner.func1()
	/home/runner/actions-runner/_work/_tool/go/1.22.0/arm64/src/testing/testing.go:1634 +0x47c
panic({0xad00a0?, 0xc000038930?})
	/home/runner/actions-runner/_work/_tool/go/1.22.0/arm64/src/runtime/panic.go:770 +0x124
go.opentelemetry.io/collector/pdata/pmetric.MetricSlice.At(...)
	/home/runner/go/pkg/mod/go.opentelemetry.io/collector/pdata@v1.4.1-0.20240327181407-1038b67c85a0/pmetric/generated_metricslice.go:56
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/pagingscraper.TestScrape.func3(0xc0000a8340)
	/home/runner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/hostmetricsreceiver/internal/scraper/pagingscraper/paging_scraper_test.go:100 +0x644
testing.tRunner(0xc0000a8340, 0xc000048220)
	/home/runner/actions-runner/_work/_tool/go/1.22.0/arm64/src/testing/testing.go:1689 +0x184
created by testing.(*T).Run in goroutine 6
	/home/runner/actions-runner/_work/_tool/go/1.22.0/arm64/src/testing/testing.go:1742 +0x5e8

DONE 258 tests, 3 failures in 32.097s
make[2]: *** [../../Makefile.Common:126: test] Error 3
make[1]: *** [Makefile:165: receiver/hostmetricsreceiver] Error 2
make: *** [Makefile:113: gotest] Error 2
make[1]: Leaving directory '/home/runner/actions-runner/_work/opentelemetry-collector-contrib/opentelemetry-collector-contrib'
Error: Process completed with exit code 2.
atoulme added the "needs triage" label Apr 3, 2024
atoulme changed the title from "[receiver/hostmetrics]" to "[receiver/hostmetrics] ARM github action runner fails to run tests" Apr 3, 2024

github-actions bot commented Apr 3, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@andrzej-stencel
Member

This issue occurs because the TestGatherMetrics_EndToEnd test is tightly coupled to the specifics of the underlying operating system. The receiver creates the system.paging.usage metric only when the system has a swap partition, so the test passes or fails depending on the host's swap configuration.

The failure reported in this issue comes from the ARM64 runners from Actuated (introduced in #32135), which apparently do not have a swap partition. The system.paging.usage metric is therefore not created on them, and the test fails.
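
For anyone who wants to confirm whether a given Linux runner has swap, here is a minimal sketch that parses the `SwapTotal` line of `/proc/meminfo`. The function names `swapTotalKB` and `hasSwap` are hypothetical, and this is not how the receiver itself detects swap; it is only a quick diagnostic under the assumption that the host exposes `/proc/meminfo`.

```go
package main

import (
	"bufio"
	"strconv"
	"strings"
)

// swapTotalKB parses the text of /proc/meminfo and returns the SwapTotal
// value in kB. ok is false when the line is missing or malformed.
// (Hypothetical helper, not part of the hostmetrics receiver.)
func swapTotalKB(meminfo string) (kb uint64, ok bool) {
	sc := bufio.NewScanner(strings.NewReader(meminfo))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Expected shape: "SwapTotal:       2097148 kB"
		if len(fields) >= 2 && fields[0] == "SwapTotal:" {
			v, err := strconv.ParseUint(fields[1], 10, 64)
			if err != nil {
				return 0, false
			}
			return v, true
		}
	}
	return 0, false
}

// hasSwap reports whether the meminfo text describes a host with swap configured.
func hasSwap(meminfo string) bool {
	kb, ok := swapTotalKB(meminfo)
	return ok && kb > 0
}
```

On a runner, the input would be the contents of `/proc/meminfo`, for example read with `os.ReadFile`.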

This should probably be fixed by making TestGatherMetrics_EndToEnd pass regardless of whether the underlying system has a swap partition.
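
One possible shape for such a check, as a rough sketch only: compare the emitted metric names against a required set and a separate optional set, so that system.paging.usage may be present or absent. The helper `checkMetricNames` is hypothetical and is not the change that was actually made in the repository.

```go
package main

import "sort"

// checkMetricNames returns the required metric names missing from got, plus
// any names in got that are neither required nor optional.
// (Hypothetical helper for illustration; not taken from the receiver's tests.)
func checkMetricNames(got map[string]struct{}, required, optional []string) (missing, unexpected []string) {
	allowed := make(map[string]struct{}, len(required)+len(optional))
	for _, name := range required {
		allowed[name] = struct{}{}
		if _, ok := got[name]; !ok {
			missing = append(missing, name)
		}
	}
	// Optional names (e.g. metrics that depend on swap) are allowed but not demanded.
	for _, name := range optional {
		allowed[name] = struct{}{}
	}
	for name := range got {
		if _, ok := allowed[name]; !ok {
			unexpected = append(unexpected, name)
		}
	}
	// Sort for deterministic output, since map iteration order is random.
	sort.Strings(missing)
	sort.Strings(unexpected)
	return missing, unexpected
}
```

A test could then assert that both returned slices are empty, passing system.paging.usage in the optional list, instead of asserting an exact metric count such as 24.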

andrzej-stencel removed the "needs triage" label Apr 4, 2024
andrzej-stencel pushed a commit that referenced this issue Apr 8, 2024
**Description:**
Add an ARM build using Actuated, which will run an ARM github action
runner on CNCF allocated Equinix colocated ARM servers.

**Link to tracking Issue:**
Fixes #12920
Fixes #32136