resolving a node's own /ipns entry: trade-offs between consistency and partition tolerance #2278
**Proposals**

- If we care more about consistency, let's inform the user immediately that, because they are offline, they cannot mount.
- If we care more about partition tolerance, let's happily resolve from the local DHT cache when the network is unavailable or does not provide a response.

It seems important to treat IPNS the same way throughout (both options are sketched below).
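For concreteness, here is a minimal Go sketch of the two options. The names (`resolveFromNetwork`, `localCache`, `preferConsistency`) are hypothetical and do not correspond to the actual go-ipfs resolver code:

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrOffline captures the consistency-first behaviour: refuse to answer
// rather than return a possibly stale record.
var ErrOffline = errors.New("offline: cannot resolve /ipns without the network")

// localCache stands in for the node's local datastore of IPNS records.
var localCache = map[string]string{
	"/ipns/QmExamplePeerID": "/ipfs/QmExampleHash",
}

// resolveFromNetwork stands in for the normal DHT-backed resolution path.
func resolveFromNetwork(ctx context.Context, name string) (string, error) {
	return "", errors.New("network unreachable in this sketch")
}

// resolveIPNS sketches the two proposals: fail fast when offline
// (consistency-first) or fall back to the local cache (partition-tolerant).
func resolveIPNS(ctx context.Context, name string, online, preferConsistency bool) (string, error) {
	if online {
		return resolveFromNetwork(ctx, name)
	}
	if preferConsistency {
		return "", ErrOffline
	}
	if val, ok := localCache[name]; ok {
		return val, nil
	}
	return "", fmt.Errorf("no cached record for %s", name)
}

func main() {
	val, err := resolveIPNS(context.Background(), "/ipns/QmExamplePeerID", false, false)
	fmt.Println(val, err)
}
```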
Couldn't we split the difference (for me the most important thing is that records are resolved fast and seamlessly for end users) and fix some issues with IPNS until the Record System gets implemented (at some undefined point in the future)? Currently it is really unusable for some applications (hosting websites via IPNS, even …).

The problem arises because the best possible consistency is not important for every application, and due to CAP, insisting on it means performance suffers.

**Proposal**

Introduce a "time to old" and use it in conjunction with the existing expiration time (24h) to allow regulated caching of resolved IPNS records. The time to old would specify how long a node can use an IPNS entry without confirming that it is current. This cache would be separate from the DHT, as passive updates (without full network resolution) shouldn't reset the timer. If a cached IPNS record is used and half of its time to old has elapsed, a full network resolution of that record should be performed.

This gives us a known, bounded caching time and still allows the best possible consistency (time to old = 0), in which case no caching is performed. It also creates a two-stage cache (if the time to old is at least twice the resolution time) where entries that are being used would never become unavailable to the user, which does happen now: resolving a page from the gateway via IPNS takes 3-10s at first, then for roughly a minute to a minute and a half it is almost instant, and then it takes 3-10s again, depending on many factors. In conjunction with #1921 this would allow IPNS to be used in applications where response latency matters. It also implies that publishing an IPNS entry with a time to old greater than 0 and resolving it locally before that time elapses would be instant.
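A rough sketch of the proposed cache, assuming a hypothetical `resolve` callback for network resolution (illustrative only, not the actual go-ipfs namesys code): entries are served from the cache until their time to old elapses, and a background refresh starts once half of it has passed.

```go
package ipnscache

import (
	"context"
	"sync"
	"time"
)

// cachedRecord pairs a resolved IPNS value with the moment it was last
// confirmed against the network and its "time to old".
type cachedRecord struct {
	value      string
	resolvedAt time.Time
	timeToOld  time.Duration // 0 disables caching entirely
}

type ipnsCache struct {
	mu         sync.Mutex
	entries    map[string]cachedRecord
	resolve    func(ctx context.Context, name string) (string, error) // full network resolution
	defaultTTO time.Duration
}

// In the full proposal the time to old would come from the published record;
// a fixed per-cache default keeps this sketch short.
func newIPNSCache(defaultTTO time.Duration, resolve func(ctx context.Context, name string) (string, error)) *ipnsCache {
	return &ipnsCache{
		entries:    make(map[string]cachedRecord),
		resolve:    resolve,
		defaultTTO: defaultTTO,
	}
}

// Resolve returns a cached value while it is younger than its time to old,
// refreshing it in the background once half of that window has elapsed.
// Otherwise it falls back to a full, blocking network resolution.
func (c *ipnsCache) Resolve(ctx context.Context, name string) (string, error) {
	c.mu.Lock()
	rec, ok := c.entries[name]
	c.mu.Unlock()

	if ok && rec.timeToOld > 0 {
		age := time.Since(rec.resolvedAt)
		if age < rec.timeToOld {
			// Second stage: still fresh enough to serve, but past the halfway
			// mark, so confirm against the network without blocking the caller.
			if age > rec.timeToOld/2 {
				go c.refresh(context.Background(), name)
			}
			return rec.value, nil
		}
	}
	// Stale, uncached, or time to old of 0: block on a full network resolution.
	return c.refresh(ctx, name)
}

// refresh performs a full network resolution and resets the entry's timer.
func (c *ipnsCache) refresh(ctx context.Context, name string) (string, error) {
	val, err := c.resolve(ctx, name)
	if err != nil {
		return "", err
	}
	c.mu.Lock()
	c.entries[name] = cachedRecord{value: val, resolvedAt: time.Now(), timeToOld: c.defaultTTO}
	c.mu.Unlock()
	return val, nil
}
```

Setting the time to old to 0 recovers today's behaviour, so applications that need the strongest possible consistency lose nothing.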
Ref: a similar, recent discussion: #2178

@Kubuxu: I'm not sure I see the difference between the existing TTL (EOL) and the proposed time to old. TTL already means "trust this value until it expires". They both seem to answer the same question: "how long can I trust this record to be valid?"

I spoke with @jbenet a bit about this. I'll try to summarize what he said and my own understanding. For the sake of retrieving a record that we have cached locally, there are two relevant checks that happen:
My concern was that the DHT was placing such a harsh constraint on record retrieval: getting the latest record from 16 sources. This is an unreasonable requirement to place on the retrieval of all records -- not all applications require this level of rigour.

@jbenet's explanation was that we should eventually have a record system layer above (or maybe below) the DHT layer. This would allow a separation of concerns between record retrieval and DHT value retrieval. The new algorithm for retrieving a record would instead become:
This sounds reasonable: records should push their validity requirements onto the record system, not onto the DHT. It's a subtle difference, but I think this abstraction resolves the concerns mentioned here and in #2178.
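To make the separation of concerns concrete, here is a rough sketch; the interfaces are assumptions for illustration and are not the actual go-ipfs or libp2p record APIs. The DHT only fetches candidate values, while each record type decides what counts as valid, which candidate is best, and how much corroboration it needs:

```go
package recordsketch

import (
	"context"
	"errors"
)

// ValueStore is the DHT's job in this sketch: return candidate values for a
// key, with no opinion about how many sources are enough or what is valid.
type ValueStore interface {
	GetValues(ctx context.Context, key string, nvals int) ([][]byte, error)
}

// Validator is the record system's job: each record type decides what
// "valid" and "better" mean, and how much corroboration it needs.
type Validator interface {
	Valid(key string, value []byte) error
	Select(key string, values [][]byte) (int, error) // index of the best record
	Quorum() int                                      // how many sources this record type wants
}

// GetRecord lets the record decide its own requirements instead of the DHT
// hard-coding "latest record from 16 sources" for everything.
func GetRecord(ctx context.Context, dht ValueStore, v Validator, key string) ([]byte, error) {
	vals, err := dht.GetValues(ctx, key, v.Quorum())
	if err != nil {
		return nil, err
	}
	var valid [][]byte
	for _, val := range vals {
		if v.Valid(key, val) == nil {
			valid = append(valid, val)
		}
	}
	if len(valid) == 0 {
		return nil, errors.New("no valid records found")
	}
	best, err := v.Select(key, valid)
	if err != nil {
		return nil, err
	}
	return valid[best], nil
}
```

Under this split, an IPNS validator could demand a high quorum and strict EOL checks while other record types accept a single cached copy, without either policy leaking into the DHT code.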
For those landing here via a search: since 0.4.19, offline resolution is supported via the `--offline` flag.
On a recent issue, @jbenet wrote:
Presently, we write our own pubkey to our local DHT on `ipfs init`, so that it gets replicated to other nodes on the network. However, when our node tries to resolve its own key, it queries the network, choosing not to trust its local cache in case the value has changed.
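For illustration only, a minimal sketch of the behaviour described above, using hypothetical helpers (`putToDHT`, `queryDHT`) rather than the real namesys/DHT code: the record is written locally and pushed to the DHT at init time, but resolving our own key still insists on asking the network.

```go
package selfresolve

import (
	"context"
	"errors"
)

// localRecords stands in for the node's local datastore, which already holds
// the record written at `ipfs init` time.
var localRecords = map[string][]byte{}

// publishSelfRecord writes our own /ipns record locally and pushes it to the
// DHT so other peers can replicate it (roughly what happens on init/publish).
func publishSelfRecord(ctx context.Context, self string, record []byte) error {
	localRecords["/ipns/"+self] = record
	return putToDHT(ctx, "/ipns/"+self, record)
}

// resolveOwnKey mirrors today's behaviour: even though localRecords holds a
// copy, we insist on asking the network, and fail if it cannot answer.
func resolveOwnKey(ctx context.Context, self string) ([]byte, error) {
	rec, err := queryDHT(ctx, "/ipns/"+self)
	if err != nil {
		// The local copy is never consulted, so an offline node fails here.
		return nil, err
	}
	return rec, nil
}

// Stubs standing in for the real DHT operations.
func putToDHT(ctx context.Context, key string, val []byte) error { return nil }
func queryDHT(ctx context.Context, key string) ([]byte, error) {
	return nil, errors.New("network unreachable")
}
```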
This behaviour results in some painful UX:
At the root of this is the CAP theorem: a distributed system cannot simultaneously guarantee consistency, availability, and partition tolerance.
The current approach puts consistency first: we make every possible effort to resolve a key to its newest value, and fail if we don't feel we can make that guarantee.
As a result, we fail to be partition tolerant: nodes that are off the network, or connected only to peers who don't [yet] have your key in their DHTs, cannot resolve it at all.
@whyrusleeping and I discussed this a bit in IRC, but hoping to get thoughts from @jbenet (and anyone else interested!) on this.