Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[receiver/carbon] Flaky Test - Test_Server_ListenAndServe/udp #10916

Closed
djaglowski opened this issue Jun 13, 2022 · 16 comments
Closed

[receiver/carbon] Flaky Test - Test_Server_ListenAndServe/udp #10916

djaglowski opened this issue Jun 13, 2022 · 16 comments
Assignees

Comments

@djaglowski
Copy link
Member

=== RUN   Test_Server_ListenAndServe/udp
    server_test.go:63: 
        	Error Trace:	server_test.go:63
        	Error:      	Received unexpected error:
        	            	listen udp 127.0.0.1:60411: bind: An attempt was made to access a socket in a way forbidden by its access permissions.
        	Test:       	Test_Server_ListenAndServe/udp
--- FAIL: Test_Server_ListenAndServe (0.01s)
    --- PASS: Test_Server_ListenAndServe/tcp (0.01s)
    --- FAIL: Test_Server_ListenAndServe/udp (0.00s)
FAIL
FAIL	github.com/open-telemetry/opentelemetry-collector-contrib/receiver/carbonreceiver/transport	0.062s

Observed here: https://github.com/open-telemetry/opentelemetry-collector-contrib/runs/6862350950?check_suite_focus=true#step:7:951

@djaglowski djaglowski added bug Something isn't working flaky test a test is flaky labels Jun 13, 2022
@djaglowski
Copy link
Member Author

Possibly related to #1426, #10151, #10872. cc @bogdandrutu

@jspaleta
Copy link
Contributor

pardon,
Looking at the testutil function being used to determine available ports.... it looks like the testutil function has tcp baked in. But this test is meant to be a udp service.

Does the testutil function findAvailableAddress appears to have tcp hardcoded in the net.listen call.

There doesn't appear to be a way to instruct the testutil functions to search for a free udp port.

ln, err := net.Listen("tcp", "localhost:0")

@djaglowski
Copy link
Member Author

@jspaleta, thanks for looking into this. Are you willing to submit a PR for this? If so, I will assign the issue to you.

@jspaleta
Copy link
Contributor

@jspaleta, thanks for looking into this. Are you willing to submit a PR for this? If so, I will assign the issue to you.

Hey,
I'm sort of new in the otel space. My preference would be to fix up the common testing utilities so it can work with udp(or any network the net.listen supports) as well, but that's a potential breaking change across a lot of tests as I would need to extend the logic of the testing utility so you can/must specify the network you want to listen on. I guess I could have it fallback to tcp if unspecified.

@jspaleta
Copy link
Contributor

jspaleta commented Aug 1, 2022

@djaglowski you can assign me, I think I'm up to speed enough know to do a PR to extend the test suite.

@jspaleta
Copy link
Contributor

jspaleta commented Aug 2, 2022

@djaglowski
PR #12888 is up now with refactored testutil.
The udp listening components like carbon and statsd tests will need to be modified after this PR is merged to make use of the new testutil function provided in the PR.

@jspaleta
Copy link
Contributor

jspaleta commented Aug 2, 2022

okay with this merged, I can now open the PRs to fix up the disabled tests for carbon an statsd

@jspaleta
Copy link
Contributor

jspaleta commented Aug 2, 2022

@djaglowski
PR #12901 is up now with the changes to re-enable the udp listen service test for carbon and statsd receivers

@bogdandrutu
Copy link
Member

@jspaleta PR does not have the effect we want, reverting now.

@jspaleta
Copy link
Contributor

jspaleta commented Aug 3, 2022

@bogdandrutu,
oh no that's unfortunate.
Is there a test log with error output I can look at with go file line context for where the failure is happening. The UDP issue was definitely a real race issue..but there maybe another race.

@jspaleta
Copy link
Contributor

jspaleta commented Aug 3, 2022

@bogdandrutu
is the problem happening for both carbon and statsd receiver?

@jspaleta
Copy link
Contributor

jspaleta commented Aug 3, 2022

Okay I've identified a bug in the reciever tests. It has noth ing to do with the udp port binding, which was a potential other problem.

The new test failure happens when the UDPserver reads in a 0 byte length message from its UDP packet connection. The deferred waitGroup.Done() fires, but there is not msg copied into the msg channel because the zero length buffer read doesnt result in a msg being copied to the channel.

I'll work on refactoring the tests now to catch this.

@jspaleta
Copy link
Contributor

jspaleta commented Aug 3, 2022

@djaglowski
new PR up #12919 with additional race condition protection in place.

@djaglowski
Copy link
Member Author

Thanks for digging into this @jspaleta.

@github-actions
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Nov 10, 2022
@github-actions
Copy link
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants