
Saving previously seen nodes for later bootstrapping #3926

Closed
dsvi opened this issue May 16, 2017 · 17 comments · Fixed by #8856
Labels
effort/hours Estimated to take one or several hours P2 Medium: Good to have, but can wait until someone steps up

Comments

@dsvi

dsvi commented May 16, 2017

Version information:

N/A

Type:

Enhancement

Severity:

Medium

Description:

The bootstrap nodes built into the ipfs client don't seem to work for me at all; I had to find and add some manually.
Generally, having such a centralized bootstrap system for an otherwise decentralized network is a weakness:
#3908
Should ipfs automatically save all the nodes it has seen, along with the time each was last seen, so stale entries can be cleaned out of the list later?
That would make bootstrapping more reliable and truly decentralized.
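
For illustration only, such a record might look like this in Go (hypothetical names, not an existing ipfs API):

```go
package bootstrapcache

import (
	"time"

	"github.com/libp2p/go-libp2p/core/peer"
)

// seenNode is a hypothetical record of a peer this node has seen,
// together with the last time it was seen online.
type seenNode struct {
	ID       peer.ID
	Addrs    []string // multiaddrs as strings, easy to persist as JSON
	LastSeen time.Time
}

// pruneStale drops entries not seen for longer than maxAge, so the saved
// list is periodically cleaned out and does not accumulate dead nodes.
func pruneStale(nodes []seenNode, maxAge time.Duration) []seenNode {
	kept := nodes[:0]
	for _, n := range nodes {
		if time.Since(n.LastSeen) <= maxAge {
			kept = append(kept, n)
		}
	}
	return kept
}
```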

@whyrusleeping
Member

Yes! definitely. The blocker here is making sure we store peerstore data to a persistent datastore (right now all peer info is kept in memory).

@djdv
Contributor

djdv commented May 19, 2017

Related anecdote: Perfect Dark relies on users to manually add nodes, and the client retains peers the way you mentioned; it seems to work well for their network. You add a peer to connect to, the client exchanges peer lists with it and adds everyone to a ledger, and on the next bootstrap the client uses that ledger, attempting to connect to everyone in it. After enough connection failures a peer is removed from the ledger, while successful ones remain, which keeps the ledger from filling up with dead nodes.

Not sure what approach would be good for ipfs in terms of storing peers and managing stored peers for bootstrap, or what other systems have tried.
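
Not suggesting this is how ipfs should do it, but a rough sketch of that ledger behaviour (hypothetical names and failure threshold):

```go
package bootstrapledger

import "github.com/libp2p/go-libp2p/core/peer"

// maxFailures is a hypothetical threshold: after this many consecutive
// failed connection attempts, a peer is dropped from the ledger.
const maxFailures = 5

// entry tracks one known peer and how many bootstrap attempts in a row
// have failed to reach it.
type entry struct {
	ID       peer.ID
	Addrs    []string
	Failures int
}

// recordAttempt updates the ledger after one bootstrap round: peers that
// were reached have their failure count reset, peers that failed are
// incremented and eventually removed, mirroring the behaviour described above.
func recordAttempt(ledger []entry, succeeded func(peer.ID) bool) []entry {
	kept := ledger[:0]
	for _, e := range ledger {
		if succeeded(e.ID) {
			e.Failures = 0
		} else {
			e.Failures++
		}
		if e.Failures < maxFailures {
			kept = append(kept, e)
		}
	}
	return kept
}
```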

@whyrusleeping whyrusleeping added the help wanted Seeking public contribution on this issue label Sep 2, 2017
@whyrusleeping
Member

@djdv that sounds like a great approach. Especially tracking failures for removal.

@whyrusleeping whyrusleeping added this to the Ipfs 0.4.12 milestone Sep 2, 2017
@Kubuxu Kubuxu modified the milestones: Ipfs 0.4.12, go-ipfs 0.4.13 Nov 6, 2017
@makew0rld
Contributor

So did this happen? Thanks.

@makew0rld
Contributor

@Kubuxu @whyrusleeping
What's the state of this?

@Stebalien
Member

@bigs is working on storing the peerstore on disk (libp2p/go-libp2p-peerstore#28). After that, it'll be a matter of remembering which peers tend to reliably be online.

@bigs
Contributor

bigs commented Jul 12, 2018

yup. we'll also need to consider how we re-initialize our TTLs after rebooting, but this is coming down the pike.

@lidel
Member

lidel commented Feb 15, 2019

Ideas being discussed in libp2p/go-libp2p-kad-dht#254

@blurHY

blurHY commented Feb 18, 2019

The GFW has blocked all of the default bootstrap nodes.

(GFW is the Great Firewall, China's national firewall.)

@Geo25rey

@whyrusleeping Do you know the status of this issue?

@aschmahmann
Contributor

aschmahmann commented Oct 30, 2020

@Geo25rey there are a number of issues (and some PRs) in the DHT repos around this that you can check out.

The short version is that there's interest in doing this and some good proposals, but the plan is not to do it until we've landed higher priority work on performing smooth upgrades to the DHT protocol (e.g. libp2p/go-libp2p-kad-dht#616).

@Geo25rey

> we've landed higher priority work on performing smooth upgrades to the DHT protocol

@aschmahmann What do you mean by that?

@lidel
Member

lidel commented Mar 14, 2022

Two years later, neither libp2p/go-libp2p-kad-dht#616 nor libp2p/go-libp2p-kad-dht#254 happened.

Meanwhile, various countries and companies can cripple connectivity by blocking the well-known list of hardcoded bootstrappers.

I propose we do something rather than keep the current broken state.
It does not have to be perfect; an MVP could live purely in go-ipfs code.
When upstream libs support persistence and fancy logic, we can switch to them.

Help wanted

If someone opens a PR that preserves currently connected peers (ipfs swarm connect) across restarts, I'm happy to review it.

MVP:

  • persist the list of connected peers every 15 minutes, whenever swarm peers is enumerated, and on shutdown
  • extend bootstrap.go logic to re-connect to persisted peers (see the sketch after this list)
    • grab some number of peers that are in your routing table and that you're connected to, and use them as additional temporary bootstrappers, but don't add them with peerstore.PermanentAddrTTL - these should not be permanent bootstrappers
      • instead, use peerstore.TempAddrTTL or peerstore.ProviderAddrTTL so we only keep them around for a short period of time
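
A minimal sketch of that MVP against the go-libp2p host API; the function names, the persistence format, and the 15-second dial timeout are assumptions, while the peerstore TTL constants and host methods are existing go-libp2p APIs:

```go
package bootstrapcache

import (
	"context"
	"time"

	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/peer"
	"github.com/libp2p/go-libp2p/core/peerstore"
)

// snapshotPeers collects the currently connected peers with their known
// addresses so they can be persisted (e.g. as JSON under the ipfs repo).
func snapshotPeers(h host.Host) []peer.AddrInfo {
	var out []peer.AddrInfo
	for _, p := range h.Network().Peers() {
		out = append(out, peer.AddrInfo{
			ID:    p,
			Addrs: h.Peerstore().Addrs(p),
		})
	}
	return out
}

// reconnect dials previously persisted peers as temporary bootstrappers.
// Their addresses are added with peerstore.TempAddrTTL rather than
// PermanentAddrTTL, so they never become permanent bootstrappers.
func reconnect(ctx context.Context, h host.Host, saved []peer.AddrInfo) {
	for _, ai := range saved {
		h.Peerstore().AddAddrs(ai.ID, ai.Addrs, peerstore.TempAddrTTL)
		go func(ai peer.AddrInfo) {
			// Connection failures are fine here; stale entries will simply
			// be dropped the next time the list is persisted.
			ctx, cancel := context.WithTimeout(ctx, 15*time.Second)
			defer cancel()
			_ = h.Connect(ctx, ai)
		}(ai)
	}
}
```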

@lidel lidel added P2 Medium: Good to have, but can wait until someone steps up effort/hours Estimated to take one or several hours labels Mar 14, 2022
@schomatis schomatis self-assigned this Mar 28, 2022
@schomatis schomatis removed the help wanted Seeking public contribution on this issue label Mar 28, 2022
@schomatis
Contributor

Per @lidel, taking this one since no one in the community has tackled it yet (if someone is still interested in doing it, I can be the guide and reviewer, but they should say so now).

@schomatis
Contributor

Actively working on this. It's taking some time to do it right, will push something by EOW.

@schomatis
Contributor

Beautiful people of the ipfsphere, I have a very WIP PR with an initial implementation of this in #8856. Any feedback will be very useful to better understand the use cases we should support.

@ElianaTroper

ElianaTroper commented Jan 31, 2023

One thing that might be useful to consider for this (and potentially other cases) is generalizing a bit and grouping bootstrap peers. If I'm shipping an application it might make sense to have 3 groups of bootstrapping nodes: the default nodes that connect to the broader decentralized IPFS ecosystem, a set of nodes that the application shipper runs to enable closer connections to other users of the app (and to take some load off of the default nodes), and previously seen nodes. I could see this being useful with a combo of priority levels and a percentage of nodes to connect to within each group (e.g. 100% of application shipper nodes at priority 1, 25% of default nodes at priority 1 to reduce the burden on those, and 33% of previously connected nodes at priority 2). I wouldn't want to sample randomly among all nodes in the bootstrap list; I would want to sample within these groups to maintain better performance of my application.

I don't think this improvement should hold up the initial code, but it would be a more general approach that would address this issue and more broadly help IPFS in other use cases too.

Additionally, having nodes grouped could provide useful feedback in the future (e.g. none of the nodes in the default bootstrap list are connecting, so maybe there's an issue there, or they're blocked by e.g. the GFW or a corporate firewall), and this could trigger further actions.
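
A rough sketch of the grouping idea (hypothetical config shape, not an existing go-ipfs option): each group carries a priority and a fraction of its members to dial, and sampling happens within each group rather than across the whole bootstrap list.

```go
package bootstrapgroups

import (
	"math/rand"
	"sort"

	"github.com/libp2p/go-libp2p/core/peer"
)

// group is a hypothetical bootstrap group: e.g. application-run nodes,
// the default IPFS bootstrappers, or previously seen peers.
type group struct {
	Name     string
	Priority int     // lower value = dialed earlier
	Fraction float64 // portion of the group's members to dial (0..1]
	Peers    []peer.AddrInfo
}

// pickBootstrapPeers samples within each group (never across groups),
// ordered by priority, and returns the peers to dial.
func pickBootstrapPeers(groups []group) []peer.AddrInfo {
	sort.SliceStable(groups, func(i, j int) bool {
		return groups[i].Priority < groups[j].Priority
	})
	var out []peer.AddrInfo
	for _, g := range groups {
		n := int(float64(len(g.Peers))*g.Fraction + 0.5)
		if n > len(g.Peers) {
			n = len(g.Peers)
		}
		for _, i := range rand.Perm(len(g.Peers))[:n] {
			out = append(out, g.Peers[i])
		}
	}
	return out
}
```

With the example above, the groups would be application shipper nodes (Priority 1, Fraction 1.0), default bootstrappers (Priority 1, Fraction 0.25), and previously connected peers (Priority 2, Fraction 0.33).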
