swarm: flaky TestDialSimultaneousJoin #1421

Open
marten-seemann opened this issue Apr 22, 2022 · 11 comments
Assignees
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@marten-seemann
Contributor

=== RUN   TestDialSimultaneousJoin
      dial_test.go:578: third dial succedded; conn: <swarm.Conn[*tcp.TcpTransport] /ip4/127.0.0.1/tcp/50014 (12D3KooWRuYVGEsecrJJhZsSoKf1UNdBVYKFCmFLNj9ucZiSQCYj) <-> /ip4/127.0.0.1/tcp/50015 (12D3KooWGEcD5sW5osB6LajkHGqiGc3W8eKfYwnJVVqfujkpLWX2)>
      dial_test.go:560: second dial succedded; conn: <swarm.Conn[*tcp.TcpTransport] /ip4/127.0.0.1/tcp/50014 (12D3KooWRuYVGEsecrJJhZsSoKf1UNdBVYKFCmFLNj9ucZiSQCYj) <-> /ip4/127.0.0.1/tcp/50015 (12D3KooWGEcD5sW5osB6LajkHGqiGc3W8eKfYwnJVVqfujkpLWX2)>
      dial_test.go:588: 
          	Error Trace:	dial_test.go:588
          	Error:      	Received unexpected error:
          	            	failed to dial 12D3KooWGEcD5sW5osB6LajkHGqiGc3W8eKfYwnJVVqfujkpLWX2:
          	            	  * [/ip4/127.0.0.1/tcp/50016] failed to negotiate security protocol: context deadline exceeded
          	Test:       	TestDialSimultaneousJoin
  --- FAIL: TestDialSimultaneousJoin (0.26s)
@marten-seemann added the kind/bug label on Apr 22, 2022
@marten-seemann changed the title from "flaky TestDialSimultaneousJoin" to "swarm: flaky TestDialSimultaneousJoin" on Apr 22, 2022
@schomatis

Assigning myself.

@schomatis

@vyzo Looking at the code related to TestDialSimultaneousJoin, is it correct that the line we're trying to trigger is:

// but first do a last one check in case an acceptable connection has landed from
// a simultaneous dial that started later and added new acceptable addrs
c, _ := w.s.bestAcceptableConnToPeer(pr.req.ctx, w.peer)

@vyzo
Contributor

vyzo commented Jul 12, 2022

I don't recall targeting a specific line, just making sure we have a test for joined dials.

@schomatis

@vyzo Ok, could you point me to 'joined dials' in the code so I can better understand what we are trying to test, please?

@schomatis

And, in particular, how we are enforcing (or approaching) the "simultaneous" part of the test.

@vyzo
Contributor

vyzo commented Jul 12, 2022

It's the invariant that two concurrent dials to the same addresses are joined.
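
To make the invariant concrete, here is a minimal sketch of what "joining" concurrent dials to the same peer could look like. It is not the swarm's actual implementation: `joiner`, `call`, `dialResult`, the `onJoin` hook, and `runJoinedDials` are all hypothetical names introduced for illustration, and peers are modeled as plain strings.

```go
package main

import (
	"fmt"
	"sync"
)

// dialResult is a hypothetical stand-in for a connection.
type dialResult struct{ connID int }

// call tracks one in-flight dial attempt that other dials can join.
type call struct {
	done chan struct{} // closed when the attempt finishes
	res  dialResult
}

// joiner is a simplified, hypothetical model of dial joining:
// concurrent dials to the same peer share one in-flight attempt.
type joiner struct {
	mu       sync.Mutex
	inflight map[string]*call
	attempts int    // number of real dial attempts started
	onJoin   func() // test hook, called when a dial joins an existing attempt
}

func newJoiner() *joiner { return &joiner{inflight: make(map[string]*call)} }

// dial joins the in-flight attempt for peer if one exists;
// otherwise it starts a new attempt by calling doDial.
func (j *joiner) dial(peer string, doDial func() dialResult) dialResult {
	j.mu.Lock()
	if c, ok := j.inflight[peer]; ok {
		j.mu.Unlock()
		if j.onJoin != nil {
			j.onJoin()
		}
		<-c.done // wait for the attempt we joined
		return c.res
	}
	c := &call{done: make(chan struct{})}
	j.inflight[peer] = c
	j.attempts++
	j.mu.Unlock()

	c.res = doDial()

	j.mu.Lock()
	delete(j.inflight, peer)
	j.mu.Unlock()
	close(c.done)
	return c.res
}

// runJoinedDials starts n dials to one peer and holds the first attempt
// open until the other n-1 dials have joined it.
func runJoinedDials(n int) (attempts int, sameConn bool) {
	j := newJoiner()
	started := make(chan struct{})
	release := make(chan struct{})
	var joined, all sync.WaitGroup
	joined.Add(n - 1)
	j.onJoin = joined.Done
	results := make([]dialResult, n)

	all.Add(n)
	go func() {
		defer all.Done()
		results[0] = j.dial("peerA", func() dialResult {
			close(started)
			<-release // keep the first attempt in flight
			return dialResult{connID: 1}
		})
	}()
	<-started
	for i := 1; i < n; i++ {
		go func(i int) {
			defer all.Done()
			// This dial function is never called if the dial joins.
			results[i] = j.dial("peerA", func() dialResult { return dialResult{connID: 100 + i} })
		}(i)
	}
	joined.Wait()
	close(release)
	all.Wait()

	sameConn = true
	for _, r := range results {
		if r != results[0] {
			sameConn = false
		}
	}
	return j.attempts, sameConn
}

func main() {
	attempts, same := runJoinedDials(4)
	fmt.Println("attempts:", attempts, "same connection:", same)
}
```

In this model the later dials only join because the first attempt is still in flight when they arrive, which is exactly the timing the flaky test depends on.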

@schomatis

Ok, but how do you define concurrent in practice?

@schomatis

What I'm seeing here is the first dial timing out before the second one has a chance to hit, and I'm trying to figure out how to better guarantee that simultaneity.

@schomatis

It's the invariant that two concurrent dials to the same addresses are joined.

This extends to 'same peer' as well, right? (This might be implicit in what you just stated; I'm just double-checking because I'm new to libp2p.)

@vyzo
Contributor

vyzo commented Jul 12, 2022

This extends to 'same peer' as well, right? (This might be implicit in what you just stated; I'm just double-checking because I'm new to libp2p.)

yes, of course -- the dials are peer specific.

@vyzo
Contributor

vyzo commented Jul 12, 2022

What I'm seeing here is the first dial timing out before the second one has a chance to hit, and I'm trying to figure out how to better guarantee that simultaneity.

Uhm, maybe somehow delay the first dial until the second one happens (with a channel probably).
Might need to add some test scaffolding in the code.
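
As a rough illustration of the channel idea, the sketch below holds whichever dial arrives first at a gate until a second dial shows up, or until the context expires. Everything here is an assumption made for illustration: `gate`, `enter`, and `runGatedDials` are hypothetical names, and nothing like them is known to exist in the swarm code; real scaffolding would need a hook inside the dial path.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// gate is hypothetical test scaffolding: the first dial to arrive is held
// until a second dial arrives, so the two are guaranteed to overlap.
type gate struct {
	mu      sync.Mutex
	arrived int
	second  chan struct{} // closed when the second dial arrives
}

func newGate() *gate { return &gate{second: make(chan struct{})} }

// enter would be called at the start of every dial. The first caller blocks
// until the second caller arrives or ctx is done; later callers pass through.
func (g *gate) enter(ctx context.Context) error {
	g.mu.Lock()
	g.arrived++
	n := g.arrived
	g.mu.Unlock()

	switch n {
	case 1:
		select {
		case <-g.second:
			return nil
		case <-ctx.Done():
			return ctx.Err() // the second dial never showed up
		}
	case 2:
		close(g.second) // release the dial that is waiting
	}
	return nil
}

// runGatedDials starts one dial immediately and another after lateBy.
// It reports whether the earlier dial was still waiting at the gate
// when the later dial started.
func runGatedDials(lateBy time.Duration) (overlapped bool, err error) {
	g := newGate()
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	type outcome struct {
		overlapped bool
		err        error
	}
	secondStarted := make(chan struct{})
	firstDone := make(chan outcome, 1)

	go func() {
		err := g.enter(ctx) // first dial
		select {
		case <-secondStarted:
			firstDone <- outcome{true, err}
		default:
			firstDone <- outcome{false, err}
		}
	}()

	time.Sleep(lateBy) // the second dial starts late
	close(secondStarted)
	if err := g.enter(ctx); err != nil { // second dial
		return false, err
	}
	out := <-firstDone
	return out.overlapped, out.err
}

func main() {
	overlapped, err := runGatedDials(20 * time.Millisecond)
	fmt.Println("overlapped:", overlapped, "err:", err)
}
```

The point of the design is that the test no longer relies on the scheduler to make the dials overlap: however late the second dial starts, the first one cannot finish (or time out on its own work) before it.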

Projects
Status: 🥞 Todo
Development

No branches or pull requests

4 participants