Fix/findpeer #708
Changes from 2 commits
```diff
@@ -265,10 +265,10 @@ func (dht *IpfsDHT) handleFindPeer(ctx context.Context, from peer.ID, pmes *pb.M
 	if targetPid == dht.self {
 		closest = []peer.ID{dht.self}
 	} else {
-		closest = dht.betterPeersToQuery(pmes, from, dht.bucketSize)
+		// Never tell a peer about itself.
+		if targetPid != from {
+			closest = dht.betterPeersToQuery(pmes, from, dht.bucketSize)
+		}
 		// Add the target peer to the set of closest peers if
 		// not already present in our routing table.
 		//
```

---

Comment on lines -268 to +271:

This might be OK, and save some bandwidth in the event a peer does look itself up, but we already don't send a peer back to itself, even when calling the code at line 694 in 03d4b62. This change prevents us from sending back the closest peers, which is part of the general query contract. It doesn't mean we can't do this optimization, but it just might not be worthwhile.

---

OK, you can ignore it.
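To make the shape of the change easier to discuss, here is a hedged, self-contained sketch of the patched handler logic. It uses plain Go with no libp2p imports: `peerID`, the `dht` struct, and this `betterPeersToQuery` are simplified stand-ins for the real types, not the actual implementation.

```go
package main

import "fmt"

// peerID is a stand-in for libp2p's peer.ID.
type peerID string

// dht is a minimal stand-in for IpfsDHT, holding only what this sketch needs.
type dht struct {
	self       peerID
	bucketSize int
	table      []peerID // simplified routing table
}

// betterPeersToQuery models the real method's relevant behavior:
// return up to count peers from the table, excluding the requester.
func (d *dht) betterPeersToQuery(from peerID, count int) []peerID {
	var out []peerID
	for _, p := range d.table {
		if p == from {
			continue // the requester is never returned to itself
		}
		out = append(out, p)
		if len(out) == count {
			break
		}
	}
	return out
}

// handleFindPeer mirrors the control flow of the patched handler.
func (d *dht) handleFindPeer(from, targetPid peerID) []peerID {
	var closest []peerID
	if targetPid == d.self {
		closest = []peerID{d.self}
	} else {
		// Never tell a peer about itself (the guard added by this PR).
		if targetPid != from {
			closest = d.betterPeersToQuery(from, d.bucketSize)
		}
	}
	return closest
}

func main() {
	d := &dht{self: "self", bucketSize: 2, table: []peerID{"a", "b", "c"}}
	fmt.Println(d.handleFindPeer("a", "a")) // peer looking itself up: empty
	fmt.Println(d.handleFindPeer("a", "b")) // normal lookup: up to bucketSize peers
}
```

Note that with the guard, a peer that looks itself up gets nothing back at all, which is the behavior the reviewer questions below: it saves bandwidth but also omits the closest peers the query contract would normally return.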
---
I'd need to poke around a bit more to be sure, but it does look like there may be a bug when running FindPeers against peers behind relays.
However, this change is quite bad for the network, since it means we may end up dialing relayed nodes as part of the query, which can increase lookup times.
---
My usage scenario is as follows:
Because of PublicQueryFilter, a peer based on the 3. conditions cannot find peers based on 4., even though they all connect to the bootstrap (relay) (ipfs dht peer).
If you add something on a peer based on 4., it can be found from a peer based on 3. (ipfs dht findprovs), but findpeer cannot work. This is not logical.
---
By the way, we should first ensure the correctness of the query logic, and only then consider how to reduce overhead (which may increase lookup times).
---
There's a difference between kademlia query logic and the logic of the top-layer queries (i.e. FindPeer, PutValue, FindProvidersAsync, etc.). Sacrificing kademlia logic to fix bugs in the top-layer logic is problematic.
The bug that we have to live with for the time being (until we do a DHT version bump/full network upgrade) is that a peer doing a query cannot distinguish, at the protocol level, between peerinfos sent back because they correspond to DHT servers and peerinfos sent back because we just happen to be connected to them and they exactly match the query. We may need to add some extra logic to deal with this case.
A further illustration of why this fix is wrong: if we had a query filter that removed all peers whose IP addresses were in the 4.*.*.* range, but it then turned out that the peer I did a FindPeer query for had the address 4.3.2.1, I should probably still get a result returned, since that result is a value returned from the DHT, not a part of the query itself.
This means the fix would need to live either in the query logic, the find peer logic, or somewhere along that code path. We should have a Go test illustrating the problem, so we don't have a regression in the future and can test the fix.
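The failure mode under discussion can be reproduced in miniature. The sketch below is a toy model, not the real go-libp2p-kad-dht code: `peerInfo`, `queryFilter`, `publicOnly`, and `findPeer` are simplified stand-ins invented for illustration. It shows how a filter that rejects relayed addresses also silently drops the lookup target itself when that target is only reachable via a relay.

```go
package main

import (
	"fmt"
	"strings"
)

// peerInfo is a toy stand-in for a libp2p peer record.
type peerInfo struct {
	id   string
	addr string // simplified multiaddr
}

// queryFilter models a QueryFilter: report whether a candidate
// is worth considering during the query.
type queryFilter func(peerInfo) bool

// publicOnly is a toy stand-in for PublicQueryFilter: reject relayed addrs.
func publicOnly(p peerInfo) bool {
	return !strings.Contains(p.addr, "/p2p-circuit")
}

// findPeer models the buggy layering: every candidate, including the
// target itself, passes through the query filter before being considered.
func findPeer(target string, candidates []peerInfo, f queryFilter) (peerInfo, bool) {
	for _, c := range candidates {
		if !f(c) {
			continue // a target behind a relay is lost here
		}
		if c.id == target {
			return c, true
		}
	}
	return peerInfo{}, false
}

func main() {
	candidates := []peerInfo{
		{id: "relay", addr: "/ip4/1.2.3.4/tcp/4001"},
		{id: "behind-relay", addr: "/ip4/1.2.3.4/tcp/4001/p2p-circuit"},
	}
	_, found := findPeer("behind-relay", candidates, publicOnly)
	fmt.Println("found:", found) // false: the relayed target is never returned
}
```

The model makes the reviewer's point concrete: the bug is in how filter results feed the lookup, not in any one filter implementation, so fixing PublicQueryFilter alone would not fix an IP-range filter with the same layering.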
---
I don't think the fix sacrifices kademlia's query logic.
The result returned by the kademlia query is incomplete (peers are missing), so the upper layer (i.e. FindPeer, PutValue, FindProvidersAsync, etc.) cannot do further filtering.
In summary: if the bottom-layer result set is incomplete, the upper layer cannot be correct.
---
Sure, but as mentioned above ("A further illustration ..."), fixing only a single QueryFilter implementation is insufficient. The fix needs to live somewhere else, because the problem may affect more QueryFilter implementations. An example solution that might be better (although still not great) is to skip the query filter entirely if `target == peerID`.
Again, though, having a Go test demonstrating the issue makes this much more concrete to talk about (e.g. you can set up a test that uses an easier-to-work-with QueryFilter).
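The suggested mitigation (skip the filter when the candidate is the lookup target) could look roughly like the wrapper below. This is a hedged sketch over the same toy types as the model above (`peerInfo` and `queryFilter` are invented stand-ins, and `exemptTarget` is a hypothetical name, not an existing API):

```go
package main

import (
	"fmt"
	"strings"
)

// peerInfo is a toy stand-in for a libp2p peer record.
type peerInfo struct {
	id   string
	addr string // simplified multiaddr
}

// queryFilter models a QueryFilter.
type queryFilter func(peerInfo) bool

// exemptTarget wraps a filter so the lookup target is never filtered
// out: the target's record is a query *result*, not a routing step,
// so it should not be subject to dialing-policy filters.
func exemptTarget(target string, f queryFilter) queryFilter {
	return func(p peerInfo) bool {
		if p.id == target {
			return true // always keep the peer we are looking for
		}
		return f(p)
	}
}

// publicOnly is a toy stand-in for PublicQueryFilter.
func publicOnly(p peerInfo) bool {
	return !strings.Contains(p.addr, "/p2p-circuit")
}

func main() {
	f := exemptTarget("behind-relay", publicOnly)
	fmt.Println(f(peerInfo{id: "behind-relay", addr: "/p2p-circuit"})) // true: target kept
	fmt.Println(f(peerInfo{id: "other", addr: "/p2p-circuit"}))        // false: still filtered
}
```

Because the exemption lives in the query path rather than in any one filter, it would apply uniformly to every QueryFilter implementation, which is the reviewer's requirement.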
---
The fix is correct for PublicQueryFilter (a relay address can be publicly accessible); further optimization should be considered later.
Can you be more specific?
---
Hey, sorry it's taken me a little while to respond. I just added a failing test to show that this bug occurs at a different layer than the PublicQueryFilter code (the test doesn't even call it). Dealing with this anywhere other than PublicQueryFilter itself is about correctness, not optimization.
---
Thanks for your point of view!
At present, in my use case there are many situations where nodes have no public IP; these nodes need to provide public access through relay nodes. Given that, the current fix is temporarily workable for me. Please consider it again; I look forward to a better solution!