Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/findpeer #708

Closed
wants to merge 3 commits into from
Closed

Fix/findpeer #708

wants to merge 3 commits into from

Conversation

sandmanhome
Copy link

  1. peer may recv findpeer request which the from peer query itself by ipfs dht findpeer

  2. PublicQueryFilter should allow peer which is publicly accessible by RelayAddr

  • peer behind nat can be publicly accessible by RelayAddr
  • dht findpeer whill use PublicQueryFilter to filter peer on queryPeer

Copy link
Contributor

@aschmahmann aschmahmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the submission and bug report.

Are you able to replicate the issue with not finding peers whose only public addresses are relay addresses? If you'd be willing to write a Go test for that it would be much appreciated and make it easier to evaluate a fix that won't cause network problems (as the submitted one does).

Comment on lines -268 to +271
closest = dht.betterPeersToQuery(pmes, from, dht.bucketSize)

// Never tell a peer about itself.
if targetPid != from {
closest = dht.betterPeersToQuery(pmes, from, dht.bucketSize)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be ok and save on some bandwidth in the even a peer does look up itself, but we already don't send a peer back to itself even when calling dht.betterPeersToQuery.

// Dont send a peer back themselves

This prevents us from sending back the closest peers which is part of the general query contract. It doesn't mean we can't do this optimization, but it just might not be worthwhile.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, you can ignore it.
The number of peers in my test network is less than K (closest peeers), so dht.routingTable.NearestPeers may contains itself.

if !isRelayAddr(a) && isPublicAddr(a) {
if isRelayAddr(a) || isPublicAddr(a) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd need to poke around a bit more to be sure, but it does look like there may be a bug in running FindPeers against peers behind relays.

However, this change is quite bad for the network since it means we may end up dialing relayed nodes as part of the query which can increase lookup times.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My usage scenario is as follows:

  1. private network
  2. there is a bootstrap node which has a publick ip, and then it's a relay node.
  3. some peer on cloud server which can accessible by LB will announce LB addr autonat addr
  4. some peer on home terminal can be accessible by static relayAddr will announce a RelayAddr. (use docker to startup some node which cannot be publicly accessible unless use found a relay addr)

for PublicQueryFilter reason, peer based on 3. conditions cannot find peers based on 4. but they all connect to bootstrap(relay). (ipfs dht peer)

if you add something on peer based on 4. it can be find on peer based 3. (ipfs dht findprovs), but findpeer can not work, this is not logical.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, we should first ensure the correctness of the query logic, and then consider how to reduce overhead(may increase lookup times).

Copy link
Contributor

@aschmahmann aschmahmann Mar 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, we should first ensure the correctness of the query logic, and then consider how to reduce overhead(may increase lookup times).

There's a difference between kademlia query logic and the logic of the top layer queries (i.e. FindPeer, PutValue, FindProvidersAsync, etc.). Sacrificing kademlia logic to fix bugs in the top layer logic is problematic.

The bug that we have to live with for the time being (until we do a DHT version bump/full network upgrade) is that a peer doing a query cannot distinguish at the protocol level between peerinfos that are sent back corresponding to DHT servers vs ones that we just happen to be connected to and exactly match the query. We may need to add some extra logic to deal with this case.

A further illustration as to why this fix is wrong is that if we had a query filter that removed all peers whose IP addresses were in the range 4...* but then it turned out that when I did a FindPeer query for that peer that it's address was in fact 4.3.2.1 I should probably still get a result returned since this result is a value returned from the DHT not a part of the query itself.

This means the fix would need to live either in the query logic, the find peer logic, or somewhere along that code path. We should have a Go test illustrating the problem so we don't have a regression in the future and can test the fix.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the fix cause kademlia’s query sacrificed.
the result returned by kademlia’s query is incomplete (with missing), and the upper layer (i.e. FindPeer, PutValue, FindProvidersAsync, etc.) cannot do further filtering.

summary: the bottom layer result set is incomplete, the upper level cannot be correct.

Copy link
Contributor

@aschmahmann aschmahmann Mar 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the bottom layer result set is incomplete

Sure, but as mentioned above ("A further illustration ...") only fixing a single QueryFilter implementation is insufficient. The fix needs to live somewhere else because the problem may effect more QueryFilter implementations. An example solution that might be better although is still not great is if the query filter was skipped entirely if target == peerID.

Again though having a Go test demonstrating the issue makes this much more concrete to talk about (e.g. you can setup a test that uses an easier to work with QueryFilter)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix is corret for PublicQueryFilter(relayAddr can publicly accessible), further optimization should be considered later.

An example solution that might be better although is still not great is if the query filter was skipped entirely if target == peerID.

Can you be more specific?

Copy link
Contributor

@aschmahmann aschmahmann Mar 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, sorry it's taken me a little while to respond. I just added a test that fails in order to show that this bug occurs at a different layer than just the PublicQueryFilter code (e.g. the test doesn't even call it). Dealing with this aside from PublicQueryFilter itself is about correctness not optimization.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your point of view!!!
At present, there will be many situations where there is no external network IP in my use scene. These nodes need to provide public network access services through relay nodes. Based on this, the current repair is temporarily friendly to me. You can consider it again and look forward to more Good solution!!!

@BigLep BigLep added the need/analysis Needs further analysis before proceeding label Mar 22, 2021
@BigLep
Copy link

BigLep commented Mar 22, 2021

@aschmahmann will close/convert to an issue so further analysis can be done.

@aschmahmann
Copy link
Contributor

closed by #711

@aschmahmann aschmahmann closed this May 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need/analysis Needs further analysis before proceeding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants