Hole-punching tutorial appears to trigger panic in listening client, preventing completion #2621
Comments
In retrospect, I did a pretty bad job of skimming through the existing GitHub issues, as my issue is clearly just a duplicate of this pre-existing issue: #2601. At least it's not just me. I'll have another poke around this weekend, though I can't imagine I'm going to uncover anything more productive than the information already in that issue. |
Out of curiosity I recompiled the I even tried compiling the older Sorry to bother you @mxinden, but could you let me know what OS these examples were originally tested on? (If you know, that is.) If it's just a matter of me having a different OS on my relay, then maybe I can use that and keep on learning libp2p despite this issue. |
Sorry, late reply here. Thanks @Foemass for the detailed report! I am running with Would you mind running once more with the change below to understand which listener is failing?
modified protocols/dcutr/examples/client.rs
@@ -172,8 +172,8 @@ fn main() -> Result<(), Box<dyn Error>> {
         futures::select! {
             event = swarm.next() => {
                 match event.unwrap() {
-                    SwarmEvent::NewListenAddr { address, .. } => {
-                        info!("Listening on {:?}", address);
+                    SwarmEvent::NewListenAddr { address, listener_id } => {
+                        info!("Listening on {:?}, listener ID {:?}.", address, listener_id);
                     }
                     event => panic!("{:?}", event),
                 }
In case you ignore the listener closing event, what is the behavior of the code after that?
modified protocols/dcutr/examples/client.rs
@@ -214,6 +214,7 @@ fn main() -> Result<(), Box<dyn Error>> {
                     info!("Observed address: {:?}", observed_addr);
                     break;
                 }
+                e @ SwarmEvent::ListenerClosed { .. } => info!("{:?}", e),
                 event => panic!("{:?}", event),
             }
         }
Do you see the same behavior when running in a different environment, e.g. your laptop? |
Seems like @mathiversen has fixed this particular issue on their side. Mind sharing what the problem was? |
Hi, thanks for the help! I've run it with both your code edits, here's the client log:
Debug Output
And here's another run of the same code with a "debug" log:
Debug Output
And here's the relay log (no code edits since last time):
Debug Output
My laptop is nearly identical in terms of environment, but I'll try the client on it tomorrow and get back to you. I might also be able to try the client on a Windows install at some point to see if that makes a difference. |
Sorry for the delay, I had a crazy busy workweek. I've now compiled and run the client in a bunch of different environments, with some curious results.
It should be noted that (just as mathiversen observed) only the listener is touchy about connecting; the dialer should connect fine in any of the above scenarios. I'm not really sure what to make of the listener working fine through an iPhone hotspot, because I have no idea how strict the firewall is on an iOS mobile hotspot. But I suppose this could be an indicator that my generic, unconfigured, ISP-provided router is somehow too strict for NAT punch-through? Any thoughts, @mxinden? |
Thanks for providing all the details above @Foemass! I think the relevant log line is the below from the dialing client:
In other words, the relay tells the dialing client that it has not (yet) received a reservation request from the listening client. Just to double check, does the listening client print |
When in listening mode, wait for the relay to accept our reservation request. Only then can a client in dialing mode establish a relayed connection to us via the relay. See also libp2p#2621 (comment)
See #2642 which helps to see whether the relay accepted our reservation request. |
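To illustrate the ordering described above, here is a minimal sketch of a listening client that keeps consuming events until the relay accepts its reservation, and treats a closed listener as a distinct outcome rather than panicking. It is not the libp2p API: `Event`, `WaitOutcome`, and `wait_for_reservation` are hypothetical stand-ins, and a plain iterator replaces the real swarm event stream.

```rust
/// Hypothetical, simplified stand-ins for swarm events (not the libp2p types).
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    NewListenAddr(String),
    ReservationReqAccepted,
    ListenerClosed,
    Other(String),
}

/// What the listening client learned while waiting (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum WaitOutcome {
    /// The relay accepted the reservation; dialers can now reach us via the relay.
    Accepted { events_seen: usize },
    /// The listener closed before any reservation was accepted.
    ListenerClosed,
    /// The event source ended without an acceptance.
    StreamEnded,
}

/// Consume events until the reservation is accepted or the listener closes.
/// Unrelated events are skipped instead of triggering a panic.
pub fn wait_for_reservation<I>(events: I) -> WaitOutcome
where
    I: IntoIterator<Item = Event>,
{
    let mut seen = 0;
    for event in events {
        seen += 1;
        match event {
            Event::ReservationReqAccepted => {
                return WaitOutcome::Accepted { events_seen: seen };
            }
            Event::ListenerClosed => return WaitOutcome::ListenerClosed,
            // Address announcements and anything else do not end the wait.
            Event::NewListenAddr(_) | Event::Other(_) => {}
        }
    }
    WaitOutcome::StreamEnded
}
```

In this model, a dialing client should only be started once the listener has reported `WaitOutcome::Accepted`; a `ListenerClosed` outcome corresponds to the situation reported in this issue.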
Thanks for that, mxinden. Sorry, I only just realized I must have had some kind of brain fart 8 days ago. My brain thought I was running and logging the listening client, but apparently I copied and pasted the line for the dialing client instead and never ran the listener that night (facepalm). Well, that's late-night debugging for you. I've now actually run the listening client with your previous changes from 9 days ago (not the more recent ones directly above) and it doesn't print "ReservationReqAccepted". Here are the logs I -should- have posted 8 days ago.
Debug Output
Debug Output
Regarding my latest layman theory of "my router is somehow too strict for NAT punch-through": I managed to successfully do NAT punch-through on my network earlier today with a C# library, so that can't be the case. Anyway, I'll give your new pull request a go sometime this week and get back to you. It's late over here, so I'm going to have another brain fart if I try it now. |
Hi @mxinden, I pulled the master branch and (after dealing with the new cmake requirement) added your altered client.rs (protocols/dcutr/examples/client.rs). In most circumstances your changes don’t seem to make much difference. See the resulting logs below. (Since it was a new pull I haven’t included your changes from #2621 (comment), so you might find the logs at #2621 (comment) more useful.)
Client (Listener):
Debug Output
Relay:
Debug Output
The listener -doesn't- panic if I act fast and run the listener while the relay is only displaying its initial batch of logs. (So in the example above I'd need to run it while only the logs marked with 2022-05-15T22:07:29Z were visible.) This weird timing trick also only works when using RUST_LOG=debug; without debug logging the listener still panics (no matter how fast I connect). When I use this trick and the listener doesn't panic, I can’t seem to connect to it with a dialing client, and I get no log saying the relay actually accepted the reservation request, so I've probably just discovered a different, unrelated way to upset the relay. Incidentally, I’m not having any luck using webrtc.rs to perform hole-punching either. Looks like fate (or more likely my own ignorance) is determined to force me back to .NET where hole punching, unaccountably, works on my network. |
Thanks for the logs! Very helpful.
This is the relevant part. More specifically The Relay needs to return a set of addresses, via which the Relay is reachable, to the listening client. In case the Relay does not yet know under which addresses it is reachable, it will provide none. Providing none will make the listening client error: rust-libp2p/protocols/relay/src/v2/protocol/outbound_hop.rs Lines 133 to 135 in d21cd5f
The listening client announces itself as "listening for incoming connections via the relay". If it doesn't know the public addresses of the relay, it cannot announce itself. How to make sure the relay knows its public addresses:
I will propose a patch similar to #2642 but on the relay side, to make this more prominent in the Relay logs, and likely to be able to pass the address as a flag. |
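The client-side failure described above can be sketched as a small validation step: a reservation that comes back with no relay addresses is rejected, because the listening client would have nothing to announce. This is only an illustration under assumed names; `Reservation`, `ReservationError::NoRelayAddresses`, and `check_reservation` are hypothetical and are not copied from outbound_hop.rs.

```rust
/// Hypothetical model of an accepted reservation: the addresses under which
/// the relay says it is reachable.
#[derive(Debug, PartialEq)]
pub struct Reservation {
    pub addrs: Vec<String>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ReservationError {
    /// The relay did not report any address under which it is reachable.
    NoRelayAddresses,
}

/// Accept the relay's response only if it carries at least one address.
pub fn check_reservation(relay_addrs: Vec<String>) -> Result<Reservation, ReservationError> {
    if relay_addrs.is_empty() {
        // Without a relay address the client cannot build an address of its
        // own that routes through the relay, so the reservation is unusable.
        return Err(ReservationError::NoRelayAddresses);
    }
    Ok(Reservation { addrs: relay_addrs })
}
```

A relay that has not yet learned any of its public addresses corresponds to calling `check_reservation(vec![])`, which yields the error path.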
Ah-ha! Well, that makes perfect sense... now that somebody has explained it to me ;). Thanks @mxinden, you've made my day. This also explains mathiversen's observation in the other issue (that doing a failed dial first causes everything to work). Given that I should ideally run the AutoNAT protocol on my client first to see if using a relay is even necessary... perhaps having the relay serve as the other end for that initial check could "hit two birds with one stone"? Would connecting to perform AutoNAT first have the side effect of speaking the identify protocol and thus telling the relay its own address? |
Thanks for bearing with us!
Pinging @mathiversen directly, in case the above is helpful.
Yes, that would be a good option. Will see what I can do in the example code. |
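As a rough model of why an earlier connection helps, the sketch below treats the relay as a set of known external addresses that starts empty and grows whenever a connected peer reports the address it observed for the relay (which is what the identify exchange provides). `RelayModel` and its methods are hypothetical and only mirror the behavior described in this thread, not the real relay implementation.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of the relay's view of its own public addresses.
#[derive(Debug, Default)]
pub struct RelayModel {
    external_addrs: BTreeSet<String>,
}

impl RelayModel {
    /// Record an address that a remote peer observed for this relay,
    /// e.g. as reported during an identify exchange.
    pub fn on_observed_addr(&mut self, addr: &str) {
        self.external_addrs.insert(addr.to_string());
    }

    /// Addresses the relay would hand to a client making a reservation.
    /// This is empty until at least one observed address has been recorded.
    pub fn reservation_addrs(&self) -> Vec<String> {
        self.external_addrs.iter().cloned().collect()
    }
}
```

In this model, a reservation requested before any peer has connected gets an empty address list, while the same request made after one identify exchange gets the observed address. That matches the observation that a failed dial first makes everything work.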
Thanks again for the help @mxinden, I've tested the latest client.rs with libp2p 0.44.0 and it's working wonderfully. Just thought I'd confirm it so you know I'm happy for this issue to be closed whenever (be that now or when v0.45.0 is released). |
@mxinden, Sorry for the delay. Yes I can confirm all my issues above are resolved, they all ultimately shared the same cause. |
Great. Closing then. Happy to hear what you are building in case you want to share @Foemass. |
Summary
Hello, I'm new to your wonderful library, apologies if I've made some rookie mistake. I'm encountering a frustrating issue with the hole-punching tutorial here wherein the listening client panics with the following message:
thread 'main' panicked at 'ListenerClosed { listener_id: ListenerId(2), addresses: [], reason: Ok(()) }', protocols/dcutr/examples/client.rs:217:26
To the best of my knowledge I've followed the tutorial word for word, with the exception of completing the address used for libp2p-lookup.
I've compiled both the "client" and "relay_v2" with:
and deployed relay_v2 to this kind of AWS Lightsail virtual server:
And opened the following in the AWS firewall:
So that the suggested ping, telnet, libp2p-lookup tests can complete. (And I've checked them, they all complete successfully).
Skimming through the GitHub issues, it's evident other people are getting further in the tutorial, so I'm hoping somebody can help me figure out what's different in my case.
Expected behavior
Presumably the listening client should continue to run instead of panicking.
Actual behavior
The listening client panics with the aforementioned message. Full logs for both "client" and "relay_v2" follow:
RUST_LOG=info ./relay_v2 --port 4001 --secret-key-seed 0
Debug Output
RUST_LOG=info RUST_BACKTRACE=full ./client --secret-key-seed 1 --mode listen --relay-address /ip4/[relay ip removed by me only here on github]/tcp/4001/p2p/12D3KooWDpJ7As7BWAwRMfu1VU2WCqNjvq387JEYKDBj4kx6nXTN
Debug Output
Version
Would you like to work on fixing this bug?
Maybe. (Assuming there actually is a bug and this isn't all some rookie mistake on my part, I'm completely willing to assist in any way I can, but I haven't got nearly enough experience in Rust or libp2p to take point on the matter.)