Fix/findpeer #708

Closed · wants to merge 3 commits
dht_filters.go — 1 addition & 1 deletion
@@ -66,7 +66,7 @@ func PublicQueryFilter(_ *IpfsDHT, ai peer.AddrInfo) bool {

	var hasPublicAddr bool
	for _, a := range ai.Addrs {
-		if !isRelayAddr(a) && isPublicAddr(a) {
+		if isRelayAddr(a) || isPublicAddr(a) {
Contributor:
I'd need to poke around a bit more to be sure, but it does look like there may be a bug in running FindPeers against peers behind relays.

However, this change is quite bad for the network, since it means we may end up dialing relayed nodes as part of the query, which can increase lookup times.
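For concreteness, here is a minimal, self-contained sketch of why the change admits relayed peers. The `isRelayAddr`/`isPublicAddr` functions below are stand-ins written against go-multiaddr, not the unexported helpers in dht_filters.go; the point is only the boolean difference between the two conditions.

```go
package main

import (
	"fmt"

	ma "github.com/multiformats/go-multiaddr"
	manet "github.com/multiformats/go-multiaddr/net"
)

// Stand-ins for the unexported helpers in dht_filters.go.
func isRelayAddr(a ma.Multiaddr) bool {
	_, err := a.ValueForProtocol(ma.P_CIRCUIT)
	return err == nil
}

func isPublicAddr(a ma.Multiaddr) bool {
	return manet.IsPublicAddr(a)
}

func main() {
	// A peer whose only address is a circuit through a relay on a public IP.
	relayOnly := ma.StringCast("/ip4/1.2.3.4/tcp/4001/p2p-circuit")

	// Original condition: relay addresses never count as public.
	original := !isRelayAddr(relayOnly) && isPublicAddr(relayOnly) // false
	// Proposed condition: relay (or public) addresses count.
	proposed := isRelayAddr(relayOnly) || isPublicAddr(relayOnly) // true

	fmt.Println("original:", original, "proposed:", proposed)
}
```

With the proposed condition the relayed peer passes the query filter, which is exactly what allows it to be dialed during a lookup.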

Author:

My usage scenario is as follows:

  1. A private network.
  2. A bootstrap node that has a public IP and also acts as a relay node.
  3. Some peers run on cloud servers reachable through a load balancer (LB); they announce the LB address (the autonat-observed address).
  4. Some peers run on home terminals and are reachable only via a static relay address, so they announce a relay address. (They are started with Docker and cannot be reached publicly unless a relay address is found for them.)

Because of PublicQueryFilter, a peer in case 3 cannot find peers in case 4, even though they are all connected to the bootstrap (relay) node (ipfs dht peer).

If something is added on a peer in case 4, it can be found from a peer in case 3 (ipfs dht findprovs), but findpeer does not work; this is not logical.

Author:

By the way, we should first ensure the correctness of the query logic, and then consider how to reduce the overhead (the possible increase in lookup times).

Contributor (@aschmahmann, Mar 3, 2021):

> By the way, we should first ensure the correctness of the query logic, and then consider how to reduce the overhead (the possible increase in lookup times).

There's a difference between kademlia query logic and the logic of the top layer queries (i.e. FindPeer, PutValue, FindProvidersAsync, etc.). Sacrificing kademlia logic to fix bugs in the top layer logic is problematic.

The bug that we have to live with for the time being (until we do a DHT version bump/full network upgrade) is that a peer doing a query cannot distinguish at the protocol level between peerinfos that are sent back corresponding to DHT servers vs ones that we just happen to be connected to and exactly match the query. We may need to add some extra logic to deal with this case.

A further illustration of why this fix is wrong: suppose we had a query filter that removed all peers whose IP addresses were in the range 4.*.*.*, but it then turned out that the peer I did a FindPeer query for actually had the address 4.3.2.1. I should probably still get a result returned, since that result is a value returned from the DHT, not a part of the query itself.

This means the fix would need to live either in the query logic, the find peer logic, or somewhere along that code path. We should have a Go test illustrating the problem so we don't have a regression in the future and can test the fix.

Author:

I don't think the fix sacrifices kademlia's query logic.
The result returned by the kademlia query is incomplete (entries are missing), and the upper layer (i.e. FindPeer, PutValue, FindProvidersAsync, etc.) cannot recover them by further filtering.

Summary: if the bottom-layer result set is incomplete, the upper layer cannot be correct.

Contributor (@aschmahmann, Mar 3, 2021):

> the bottom-layer result set is incomplete

Sure, but as mentioned above ("A further illustration ...") fixing only a single QueryFilter implementation is insufficient. The fix needs to live somewhere else, because the problem may affect more QueryFilter implementations. An example solution that might be better, although still not great, is to skip the query filter entirely when target == peerID.

Again, though, having a Go test demonstrating the issue makes this much more concrete to talk about (e.g. you can set up a test that uses an easier-to-work-with QueryFilter).
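To make the suggestion above concrete, here is a minimal sketch of "skip the query filter when the candidate is the lookup target". It assumes a hypothetical filtering site in the lookup code; `queryFilter` and `acceptCandidate` are illustrative names, not part of the go-libp2p-kad-dht API.

```go
package main

import (
	"fmt"

	"github.com/libp2p/go-libp2p-core/peer"
)

// queryFilter stands in for the DHT's configured query filter function.
type queryFilter func(ai peer.AddrInfo) bool

// acceptCandidate applies the filter to every candidate except the lookup
// target itself: the target's peerinfo is a query *result*, not a peer the
// query needs to dial, so it bypasses the filter.
func acceptCandidate(target peer.ID, ai peer.AddrInfo, filter queryFilter) bool {
	if ai.ID == target {
		return true
	}
	return filter(ai)
}

func main() {
	target := peer.ID("the-peer-we-are-looking-for")
	rejectEverything := func(peer.AddrInfo) bool { return false }

	// The target survives even a filter that rejects everything...
	fmt.Println(acceptCandidate(target, peer.AddrInfo{ID: target}, rejectEverything)) // true
	// ...while other candidates are still filtered as usual.
	fmt.Println(acceptCandidate(target, peer.AddrInfo{ID: peer.ID("someone-else")}, rejectEverything)) // false
}
```

Whether such a check belongs in the query logic or in FindPeer itself is exactly the open question discussed in this thread.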

Author:

The fix is correct for PublicQueryFilter (a relay address is publicly reachable); further optimization should be considered later.

> An example solution that might be better, although still not great, is to skip the query filter entirely when target == peerID.

Can you be more specific?

Contributor (@aschmahmann, Mar 7, 2021):

Hey, sorry it's taken me a little while to respond. I just added a test that fails in order to show that this bug occurs at a different layer than the PublicQueryFilter code (e.g. the test doesn't even call it). Dealing with this outside of PublicQueryFilter itself is about correctness, not optimization.

Author:

Thanks for your point of view!
In my use case there are many situations where nodes have no external network IP; those nodes need to be reachable from the public network through relay nodes. For that reason the current fix is workable for me for now. Please consider it again; I look forward to a better solution!

			hasPublicAddr = true
		}
	}
handlers.go — 2 additions & 2 deletions
@@ -265,10 +265,10 @@ func (dht *IpfsDHT) handleFindPeer(ctx context.Context, from peer.ID, pmes *pb.M
	if targetPid == dht.self {
		closest = []peer.ID{dht.self}
	} else {
-		closest = dht.betterPeersToQuery(pmes, from, dht.bucketSize)
-
		// Never tell a peer about itself.
		if targetPid != from {
+			closest = dht.betterPeersToQuery(pmes, from, dht.bucketSize)
+
Comment on lines -268 to +271
Contributor:

This might be OK and save some bandwidth in the event a peer does look up itself, but we already don't send a peer back to itself even when calling dht.betterPeersToQuery:

// Dont send a peer back themselves

This prevents us from sending back the closest peers, which is part of the general query contract. It doesn't mean we can't do this optimization, but it just might not be worthwhile.
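A rough sketch of the exclusion being referred to (paraphrasing the behaviour, not quoting betterPeersToQuery): the requesting peer is already dropped from the closest-peers list before the response is built, so gating the whole call on targetPid != from mainly changes whether the other closest peers get sent back.

```go
package main

import (
	"fmt"

	"github.com/libp2p/go-libp2p-core/peer"
)

// dropRequester mirrors the "Dont send a peer back themselves" behaviour:
// whatever set of closest peers is computed, the requester is filtered out.
func dropRequester(closest []peer.ID, from peer.ID) []peer.ID {
	out := make([]peer.ID, 0, len(closest))
	for _, p := range closest {
		if p == from {
			continue
		}
		out = append(out, p)
	}
	return out
}

func main() {
	from := peer.ID("requester")
	closest := []peer.ID{peer.ID("a"), from, peer.ID("b")}

	filtered := dropRequester(closest, from)
	fmt.Println(len(filtered)) // 2: only "a" and "b" remain; the requester never sees itself
}
```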

Author:

OK, you can ignore it.
The number of peers in my test network is less than K (the number of closest peers), so dht.routingTable.NearestPeers may contain the peer itself.

// Add the target peer to the set of closest peers if
// not already present in our routing table.
//