-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flaky test: TestPeersTotal #8135
Comments
Looking at CircleCi config, it looks like Running that job on |
The test on which this one is based has been fixed for the same error in libp2p/go-libp2p-swarm@0538806. @aschmahmann I'm not sure how to fix this as I don't know the original design rationale behind the test nor the function we are testing. The original fix changes its test to track peers instead of open connections (because of what's explained in the commit message) but in our function being tested we track connections and I imagine changing that to tracking peers instead would have unintended side effects in the callers consuming this information. |
Next step is for @petar to look to identify next step. We may not need this if this test is a copy from libp2p. |
I've added more logging to the test for now, so that next time it breaks we can understand why the connections are established over different transports. |
This is blocked until we find another occurrence of this so we can look at the more verbose logs. @petar : I assume you haven't seen another instance? |
Nope. The logging code is checked into master. Waiting for the flakiness to
occur again.
…On Fri, 8 Oct 2021 at 08:16, Steve Loeppky ***@***.***> wrote:
This is blocked until we find another occurrence of this so we can look at
the more verbose logs.
@petar <https://github.com/petar> : I assume you haven't seen another
instance?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#8135 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACFTS7YYZW37HGORK6Y77TUF4DLPANCNFSM442WZGXQ>
.
|
@petar Following up on this: On the extra logging, we will normally dial both TCP and QUIC/UDP simultaneously and add both connections to the list, see the swarm dial worker logic for more details. As mentioned above this is already acknowledged upstream and that is why libp2p/go-libp2p-swarm@0538806 no longer checks the list of connections but explicitly the list of peers in its own test (because one peer might have more than one active connection). I think we should port this fix here as well, consulting the peer list instead of the connection list where appropriate. My main doubt is understanding the actual purpose of |
@schomatis I agree. The ipfs test should use peer count, and not be dependent on the underlying implementation of swarm (which can vary the connection count per peer). |
Ok, will push something for this next week. |
Per 2021-11-12 verbal and slack: the test should be updated to assert on the number of peers (not the number of connections). @galargh: if you have other comments/context from Slack, feel free to add. |
Nothing to add, #8135 (comment) says it all. |
Picking this up again. |
https://app.circleci.com/pipelines/github/ipfs/go-ipfs/4610/workflows/d1451a93-9637-482c-b9e1-f1914dfaa687/jobs/51152/tests#failed-test-0
After re-running it is green, so filing this issue so we don't forget to fix this).
The text was updated successfully, but these errors were encountered: