IPNS-over-PubSub as an Independent Transport #218

Merged 6 commits on Apr 30, 2020
95 changes: 95 additions & 0 deletions naming/pubsub.md
@@ -0,0 +1,95 @@
# ![](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square) IPNS PubSub Router

Authors:

- Adin Schmahmann ([@aschmahmann](https://github.com/aschmahmann))

Reviewers:

-----

# Abstract

[Inter-Planetary Naming System (IPNS)](/README.md) is a naming system responsible for creating, reading, and updating mutable pointers to data.
IPNS consists of a public/private asymmetric cryptographic key pair, a record type, and a protocol.
Part of the protocol involves a routing layer that is used for the distribution and discovery of new or updated IPNS records.

The IPNS PubSub router is built on [libp2p PubSub](https://github.com/libp2p/specs/tree/master/pubsub) and layers persistence on top of it, so that IPNS updates are always available to a connected network.
An inherent property of the IPNS PubSub Router is that IPNS records are republishable by peers other than the peer that originated the record.
This implies that as long as a peer on the network has an IPNS record it can be made available to other peers (although the records may be ignored if they are received after the IPNS record's End-of-Life/EOL).

# Organization of this document

- [Introduction](#introduction)
- [Protocol](#protocol)
- [Overview](#overview)
- [API Spec](#api-spec)
- [Integration with IPFS](#integration-with-ipfs)

# Introduction

Each time a node publishes an updated IPNS record for a particular key, the router propagates it into the network, where nodes can choose to accept or reject the new record.
When a node attempts to retrieve an IPNS record from the network it uses the router to query for the IPNS record(s) associated with the IPNS key; the node then validates the received records.

In this spec we address building a router based on a PubSub system, particularly focusing on libp2p PubSub.

# PubSub Protocol Overview

The protocol has four components:
- [IPNS Records and Validation](/README.md)
- [libp2p PubSub](https://github.com/libp2p/specs/tree/master/pubsub)
- Translating an IPNS record name to/from a PubSub topic
- Layering persistence onto libp2p PubSub

# Translating an IPNS record name to/from a PubSub topic

For a given IPNS local record key described in the IPNS Specification the PubSub topic is:

**Topic format:** `/record/base64url-unpadded(key)`

Member

I see ipfs/specs/naming#local-record uses /ipns/base32(<HASH>).
We could use the same naming convention in all places, if possible.

What would be less work, change this to /ipns/base32(key) or the other way around?

Contributor Author
@aschmahmann, Sep 5, 2019

Good point about the inconsistency. There are two issues here:

  1. I think that area of the IPNS spec could use some clarification. Here, we're actually referring to the ipfs/specs/naming#routing-record /ipns/BINARY_ID.
  2. This spec is currently using /record instead of /ipns. Should we make it use /ipns? If starting with /ipns is not necessary for routing records across multiple routers then we should probably just remove the section on routing records.

What do you think?

Edit: Just an update/clarification that IPNS over PubSub currently uses /record as the topic. It's possible this was an oversight, is this something we should change going forward?

Contributor Author
@aschmahmann, Sep 5, 2019

I asked @Stebalien what he thought about this and his understanding of the linked IPNS is that:

local-record = what we put in a local datastore = /ipns/base32(key) where key = multihash of IPNS public key
routing-record = what the form of the routing record is (BINARY_ID = binary representation of key where key = multihash of IPNS public key)

This means our topic format is /record/base64url-unpadded(/ipns/BINARY_ID). The topic format being /record is therefore not actually incompatible with the already described routing-record for IPNS.

However, we should give BINARY_ID a new name in the other spec that makes more sense. One set of words we tend to use is IPNS private key = signing key, IPNS public key = verification key, IPNS key = multihash(IPNS public key). However, we could probably use better words, any suggestions?

Member
@lidel, Sep 12, 2019

@aschmahmann
I've been thinking about naming and got some ideas on how to simplify things a bit:

👍 unify popular terms

As you noted, public/private provides good mental model and we should keep it in specs.

What is ambiguous is IPNS key which is just a shortcut/fingerprint that acts as "IPNS identifier".

I suspect something like this would have the least cognitive overhead:

  • IPNS Private Key = signing key
  • IPNS Public Key = verification key
  • IPNS ID / Identifier = multihash(IPNS Public Key)

Renaming BINARY_ID to BINARY_IPNS_ID should do the trick, as it matches "IPNS ID" nicely.

✍️ Rename pubsub topic, make it a valid IPNS path?

My concerns with /record/:

  • ambiguity during debugging
    • can't guess what is the purpose of a /record/ topic without deserializing base64
  • adding unnecessary noise when doing ipfs pubsub ls
    • can't just open IPNS path without doing additional conversion
      (compare: /record/{base64} vs /ipns/{cid})

What if we simplify this and rename pubsub topic to follow text representation proposed in libp2p/specs#209? Basically /ipns/cid(IPNS ID)

For CIDv1 default text representation would be in Base32:

base32(cid(1, libp2p-key-codec, multihash(IPNS Public Key)))

Key idea here is that when you do ipfs pubsub ls, you get topics that are valid IPNS content paths. User can just copy topic name and IPNS resource will load, without doing any additional conversion (huge UX win).

Thoughts?

Contributor Author

> unify popular terms

👍 to that whole section. Those names sound fine to me and are at least not ambiguous anymore.

> Rename pubsub topic, make it a valid IPNS path

This bothers me too, especially the ipfs pubsub ls being pretty unhelpful bit.

I think the reason for /record was to indicate that PubSub has this generalized strategy for dealing with records (e.g. the "keep latest record" strategy could be replaced with basically any function that looks like state = merge(state, new record) as described in libp2p/go-libp2p-pubsub-router#36).

I'm a little concerned about using /ipns/base32(cid(1, libp2p-key-codec, multihash(IPNS Public Key))) as a PubSub topic. While libp2p/specs#209 still allows PeerIDs to be searched for or referenced on the network by multihash, this change would make IPNS IDs effectively CIDs since the network level representation in PubSub is the topic name.

A couple options to do something like this might be:

  1. Just make IPNS IDs CIDs
    • this will likely run into the same problems as PeerIDs as CIDs (e.g. bump CID version number and now the topic is different)
  2. Use /ipns/base64(multihash(IPNS Public Key)) (could also be base32 if we wanted)
    • indicates it's an IPNS record, but not super helpful for figuring out which one (since IPNS IDs are not base64(multihash(IPNS Public Key)))
    • could combine this with an aliasing function for debugging tools (e.g. ipfs pubsub ls or Wireshark) that processes it into a format we might prefer like /ipns/base32(cid(1, libp2p-key-codec, multihash(IPNS Public Key)))

@Stebalien any thoughts on /record vs /ipns for the PubSub topic?

Member

> local-record = what we put in a local datastore = /ipns/base32(key) where key = multihash of IPNS public key

Turns out this isn't how it works in go. Instead, we just base32 the entire key: base32("/ipns/binary_peer_id").

> This spec is currently using /record instead of /ipns. Should we make it use /ipns? If starting with /ipns is not necessary for routing records across multiple routers then we should probably just remove the section on routing records.

We use /record because the go-libp2p-routing-helpers is record type agnostic. It's not specific to IPNS records, it works with all records. We wanted a namespace this special router could own inside the pubsub topic namespace so it wouldn't conflict with other pubsub topics.

Really, IPNS-over-PubSub should be renamed to Records-over-PubSub.

> can't guess what is the purpose of a /record/ topic without deserializing base64

The issue here was that record keys are arbitrary binary but pubsub requires utf-8 keys.

> adding unnecessary noise when doing ipfs pubsub ls

I agree this kind of sucks but this is one of the reasons we have ipfs name pubsub subs.


My thoughts: leave it as it is, maybe rename this spec to "records over pubsub".

Contributor Author

We could rename the spec as this is really just the spec for go-libp2p-routing-helpers.

However, given that the record agnostic go-libp2p-routing-helpers requires a record validator and implicitly assumes that other subscribers to the same topic have the same validators it might be reasonable for it to take both a protocol prefix and a validator.

The question is, do we get anything out of having this /record namespace since it looks like /ipns might be more helpful?

Member

Yes, we get to reuse the same validator logic, the same record keys, the same everything. We can add support for a new record type to the entire system just by adding the record type to the validator.


where `base64url-unpadded` is the unpadded base64url encoding specified in [IETF RFC 4648](https://tools.ietf.org/html/rfc4648).
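
As a rough illustration of the topic format, the translation in both directions can be sketched in Go with the standard library's `base64.RawURLEncoding`, which implements the unpadded base64url alphabet from RFC 4648. The function names `KeyToTopic` and `TopicToKey` are hypothetical and are not taken from any existing implementation; the key is assumed to be the full routing key (e.g. `/ipns/BINARY_ID`) as discussed in the review thread.

```go
package main

import (
	"encoding/base64"
	"errors"
	"strings"
)

// topicPrefix is the namespace used for record topics in this spec.
const topicPrefix = "/record/"

// KeyToTopic (hypothetical name) converts a record key, which may contain
// arbitrary binary data, into a UTF-8 safe PubSub topic.
func KeyToTopic(key string) string {
	return topicPrefix + base64.RawURLEncoding.EncodeToString([]byte(key))
}

// TopicToKey (hypothetical name) reverses KeyToTopic. It fails if the topic
// is outside the /record namespace or is not valid unpadded base64url.
func TopicToKey(topic string) (string, error) {
	if !strings.HasPrefix(topic, topicPrefix) {
		return "", errors.New("topic is not in the /record namespace")
	}
	raw, err := base64.RawURLEncoding.DecodeString(strings.TrimPrefix(topic, topicPrefix))
	if err != nil {
		return "", err
	}
	return string(raw), nil
}
```

For example, under these assumptions the six-byte key `/ipns/` maps to the topic `/record/L2lwbnMv`, so topics for IPNS routing keys share that leading segment.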

# Layering persistence onto libp2p PubSub

libp2p PubSub does not have any notion of persistent data built into it. However, we can layer persistence on top of PubSub by utilizing [libp2p Fetch](https://github.com/libp2p/specs/).

The protocol has the following steps:
1. Start State: Node `A` subscribes to the PubSub topic `t` corresponding to the local IPNS record key `k`
2. `A` notices that a node `B` has connected to it and subscribed to `t`
3. Some time passes (might be 0 seconds, or could use a more complex system to determine the duration)
4. `A` sends `B` a Fetch request for `k`
5. If Fetch returns a record that supersedes `A`'s current record then `A` updates its record and Publishes it to the network
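
The steps above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions: `Record`, `Fetcher`, `Publisher`, `Node`, and `OnPeerSubscribed` are hypothetical names, the `Fetcher` and `Publisher` function types stand in for libp2p Fetch and PubSub Publish rather than reproducing their real APIs, and "supersedes" is reduced to comparing a sequence number, whereas real IPNS validation also checks signatures and validity.

```go
package main

import (
	"sync"
	"time"
)

// Record is a simplified stand-in for an IPNS record.
// Assumption: a higher Seq means the record supersedes an older one.
type Record struct {
	Seq   uint64
	Value string
}

// Fetcher is a hypothetical stand-in for a libp2p Fetch request to a peer.
// It returns nil if the peer has no record for the key.
type Fetcher func(peer, key string) (*Record, error)

// Publisher is a hypothetical stand-in for a PubSub Publish call.
type Publisher func(topic string, rec Record) error

// Node holds the best record known for each key.
type Node struct {
	mu      sync.Mutex
	best    map[string]Record
	fetch   Fetcher
	publish Publisher
}

func NewNode(fetch Fetcher, publish Publisher) *Node {
	return &Node{best: make(map[string]Record), fetch: fetch, publish: publish}
}

// OnPeerSubscribed runs steps 3 to 5 after a peer has been seen
// subscribing to the topic (step 2).
func (n *Node) OnPeerSubscribed(peer, topic, key string, delay time.Duration) error {
	time.Sleep(delay) // step 3: wait (possibly zero)

	rec, err := n.fetch(peer, key) // step 4: Fetch request for k
	if err != nil {
		return err
	}
	if rec == nil {
		return nil // the peer had nothing for this key
	}

	// Step 5: keep the fetched record only if it supersedes the current one.
	n.mu.Lock()
	cur, ok := n.best[key]
	supersedes := !ok || rec.Seq > cur.Seq
	if supersedes {
		n.best[key] = *rec
	}
	n.mu.Unlock()

	if !supersedes {
		return nil
	}
	return n.publish(topic, *rec) // publish the newer record to the network
}
```

Publishing happens outside the lock in this sketch so that a slow publish does not block other handlers; whether that matters depends on the actual PubSub implementation.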

Note: PubSub does not guarantee that a message sent by a peer `A` will be received by a peer `B`, and it is possible
(e.g. in systems like [gossipsub](https://github.com/libp2p/specs/tree/master/pubsub/gossipsub))
for a message to be missed even if `A` and `B` are already connected. Therefore, whenever `A` notices **any** node that has
connected to it and subscribed to `t`, it should run the Fetch protocol as described above. However, developers may have routers
with properties that allow the delay in step 3 to be made arbitrarily large (even infinite).

# Protocol

A node `A` that puts and gets updates to an IPNS key `k`, with computed PubSub topic `t`, does the following:

1. PubSub subscribe to `t`
2. Run the persistence protocol, both to fetch data and return data to those that request it
3. When updating a record do a PubSub Publish and keep the record locally
4. When receiving a record, if it's better than the current record, keep it and republish the message
5. (Optional) Periodically republish the best record available
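
A minimal sketch of steps 3 to 5, written in Go, might look like the following. It is not the go-ipfs implementation: `Router`, `Update`, `Receive`, and `Republish` are hypothetical names, the `publish` callback stands in for a PubSub Publish call, and "better" is assumed to mean "has a higher sequence number". Subscribing (step 1) and the persistence protocol (step 2) are left out.

```go
package main

// Record is a simplified stand-in for an IPNS record.
// Assumption: records are ordered by sequence number only.
type Record struct {
	Seq   uint64
	Value []byte
}

// Router keeps the best known record per topic and publishes through
// a caller-supplied callback (a stand-in for PubSub Publish).
type Router struct {
	best    map[string]Record
	publish func(topic string, rec Record)
}

func NewRouter(publish func(topic string, rec Record)) *Router {
	return &Router{best: make(map[string]Record), publish: publish}
}

// Update is step 3: keep the locally created record and publish it.
func (r *Router) Update(topic string, rec Record) {
	r.best[topic] = rec
	r.publish(topic, rec)
}

// Receive is step 4: keep and republish a received record only if it is
// better than the current one. Duplicates and stale records are dropped
// with a single comparison. It reports whether the record was kept.
func (r *Router) Receive(topic string, rec Record) bool {
	if cur, ok := r.best[topic]; ok && rec.Seq <= cur.Seq {
		return false
	}
	r.best[topic] = rec
	r.publish(topic, rec)
	return true
}

// Republish is the optional step 5: publish the best record again.
// It reports false if no record is known for the topic.
func (r *Router) Republish(topic string) bool {
	cur, ok := r.best[topic]
	if !ok {
		return false
	}
	r.publish(topic, cur)
	return true
}
```

A caller could invoke `Republish` from a timer; because `Receive` ignores records that are not better, the extra messages cost receivers little.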

Note: Step 5 is optional because the protocol does not strictly require it. However, duplicate records are already handled efficiently
by the above logic, and running the persistence protocol properly can be difficult (as in the example below). Periodic
republishing can therefore act as a fallback in the event of errors in the persistence protocol.

Persistence Error Example:
1. `B` connects to `A`
2. `A` gets the latest record (`R1`) from `B`
3. `B` then disconnects from `A`
4. `B` publishes `R2`
5. `B` reconnects to `A`

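If `A` does not reliably detect when `B` reconnects, it could miss `R2`. For example, if `A` only polled its subscribed peers
every 10 seconds and `B` disconnected and reconnected between two polls, `A` would never see a new connection and so would not run Fetch again.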
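
The failure mode can be made concrete with a small simulation in Go. Everything here is hypothetical and exists only for illustration: `Event` models a connection-state change for `B` at a time in seconds, `FetchesByPolling` models an `A` that samples `B`'s state at a fixed interval, and `FetchesByEvents` models an `A` that reacts to every connect notification.

```go
package main

// Event is a connection-state change for a single peer at time At (seconds).
type Event struct {
	At        int
	Connected bool
}

// connectedAt replays events (assumed sorted by At) and reports the
// peer's connection state at time t.
func connectedAt(events []Event, t int) bool {
	state := false
	for _, e := range events {
		if e.At > t {
			break
		}
		state = e.Connected
	}
	return state
}

// FetchesByPolling returns the times at which a poller that samples the
// peer every interval seconds, from 0 to end, would run Fetch: only when
// it sees the peer connected after last seeing it disconnected.
func FetchesByPolling(events []Event, interval, end int) []int {
	var fetches []int
	prev := false
	for t := 0; t <= end; t += interval {
		cur := connectedAt(events, t)
		if cur && !prev {
			fetches = append(fetches, t)
		}
		prev = cur
	}
	return fetches
}

// FetchesByEvents returns the times at which an event-driven node would
// run Fetch: once for every connect event.
func FetchesByEvents(events []Event) []int {
	var fetches []int
	for _, e := range events {
		if e.Connected {
			fetches = append(fetches, e.At)
		}
	}
	return fetches
}
```

With `B` connecting at 0, disconnecting at 3, and reconnecting at 7, a 10-second poller over 20 seconds fetches only at time 0, so a record `B` published between 3 and 7 is never fetched. The event-driven variant fetches at 0 and again at 7.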

# Implementations

- <https://github.com/ipfs/go-ipfs/tree/master/namesys>