Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EIP1577 - Multiaddr support for ENS #79

Closed
daviddias opened this issue Nov 28, 2018 · 14 comments
Closed

EIP1577 - Multiaddr support for ENS #79

daviddias opened this issue Nov 28, 2018 · 14 comments

Comments

@daviddias
Copy link
Member

Conversation happening at https://ethereum-magicians.org/t/eip1577-multiaddr-support-for-ens/1969

@raulk
Copy link
Member

raulk commented Nov 28, 2018

Was reviewing EIP-1577. My comments:

  1. I sense that multicodec and multiaddr are being mixed in a way I don't quite follow.
  2. I was confused by the separate ensdomains/multicodec repo.
  3. The EIP says:

The encoding of the value depends on the content type specified by the protoCode; for instance, types in the range 0x00-0xf0 are encoded using multihash, meaning their value is formatted as follows [...]

I am assuming they refer to the ensdomains/multicodec codelist, but I don't find anything that points to specific ranges being reserved for specific purposes.

  1. In the fallback section, it's not clear what "the multiaddr interface" refers to. The spec talks about multicodec and that's the first time that multiaddr is mentioned.

During my research process I noticed our prose concerning the interrelationship of multicodec, multiaddr and multihash could be a bit off. Or at least it feels imprecise (to this reader).

For example, the multicodec README says:

A chunk of data identified by multicodec will look like this:

<multicodec><encoded-data> (1)

It is worth noting that multicodec works very well in conjunction with multihash and multiaddr, as you can prefix those values with a multicodec to tell what they are.

Then, in the multihash README, the values are described as:

<varint hash function code><varint digest size in bytes><hash function output> (2)

However, based on my understanding:

<multicodec> (1) == <varint hash function code> (2)
<encoded-data> (1) == <varint digest size in bytes><hash function output> (2)

If my interpretation is correct, wouldn't it be more precise to state that multiaddr and multihash embed multicodec codepoints? The term "prefix" is misleading.

@Stebalien
Copy link
Member

I sense that multicodec and multiaddr are being mixed in a way I don't quite follow.

I believe the confusion stems from the fact that multiaddrs are paths and /ipfs/Qm..., /ipns/Qm... are paths (/ipfs/Qm... is even a valid multiaddr). Worse, multiaddr has the word "address" in it but we don't use them to address content.

If my interpretation is correct, wouldn't it be more precise to state that multiaddr and multihash embed multicodec codepoints? The term "prefix" is misleading.

I'm not sure I follow. We say "prefix" because a mulithash is <varint codec><length><data> and a multiaddr is <varint multicodec><stuff>. In both cases, we prefix some data with a multicodec to create either a multihash or a multiaddr.


Note: this also came up here: #73

@raulk
Copy link
Member

raulk commented Nov 28, 2018

In both cases, we prefix some data with a multicodec to create either a multihash or a multiaddr.

@Stebalien I'm reading the README as a spec -- maybe I shouldn't. But with those lenses, if multihash is defined <varint codec><length><data> and multiaddr as <varint multicodec><stuff>, multicodec does not prefix multihash and multiaddr, multicodec is the prefix. Personally I tripped over that, thinking that multicodec somehow encapsulates the other two.

In the case of multiaddr, given that the <varint multicodec><stuff> groups are repetitive, saying that codec is the prefix falls short. Yes, all composed multiaddrs will start with a multicodec, but there will be more instances in a single composed multiaddr, right?

@Stebalien
Copy link
Member

Ah, I see. You're right, we don't prefix the multihash, we prefix the length-delimited hash digest.

Yes, all composed multiaddrs will start with a multicodec, but there will be more instances in a single composed multiaddr, right?

So... this is one of the issues with multiaddrs. Without understanding the what each protocol code means, I can't break a multiaddr into a sequence of (codec, value) pairs. Really, it's better to think about multiaddrs as being recursively defined. That is, to parse a binary multiaddr, you:

  1. Read off the multicodec.
  2. Lookup the protocol definition for that codec.
  3. Pass everything else to that codec.
  4. The codec should consume as much as it wants.
  5. The codec should then recursively parse everything it doesn't want as a multiaddr.

Really, this also applies to strings as a single multiaddr "component" could either be:

  • /ip4/1.2.3.4
  • /quic (no argument)
  • /unix/a/b/c/d.... (everything after /unix is an argument).

@raulk
Copy link
Member

raulk commented Nov 29, 2018

@Stebalien have we considered introducing boundary markers for values? This has the ability to solve the nesting and the value identification problem at once. Using square brackets as the textual representation:

  • /unix[/a/b/c/d]/grpc
  • /ip4[1.2.3.4]/udp[8888]/quic
  • /ip4[1.2.3.4]/tcp[8888]/p2p-circuit[/ip4[2.3.4.5]/udp[9999]] (nesting example)

@Stebalien
Copy link
Member

We've discussed it for embedding path values (/unix/[/...]/...) but not in other cases.

The real issue with brackets is that they kind of break the path abstraction (coming from the plan9 "everything lives in the filesystem namespace" camp).

If you're interested in some ramblings on this subject... https://gist.github.com/4764975c3b5ea33202324d8e9ec0985d (not posting a PR/issue as I don't want to add too much confusion to the discussion unless we decide to go with it).

(note: defining these recursively is really just my opinion; by spec, multiaddrs are still defined as a list).

@ghost
Copy link

ghost commented Mar 15, 2019

They're just linking IPFS content in EIP-1577, and the same earlier in EIP-1062. So the answer is easy, they should either:

  • go with CIDv1-in-base32,
  • or ideally, go with dweb paths (/ipfs/$cid, /ipns/$peerid, /dat/$key) for much more interop

They're doing basically the ENS version of dnslink.

Both of EIP-1577 and EIP-1062 have been merged as a draft on the standards track, so I'm not sure whether they can be corrected or whether there needs to be a new EIP draft. @raulk you spent a ton of time in Ethereum, what do you think?


As for multiaddr, there's nothing to do about these EIPs. I'll pull your thoughts above into a new design discussion.

@Stebalien
Copy link
Member

We've gone with a multicodec. That is, we've defined swarm-ns, ipld-ns, and ipfs-nsmulticodecs to be used in "compressed" paths of the form...`, etc. Unfortunately, they really didn't want to use text paths (we tried).

@ghost
Copy link

ghost commented Mar 15, 2019

@Stebalien was that in an EIP too?

@ghost
Copy link

ghost commented Mar 15, 2019

Ooh - I see. EIP-1577 defined the multicodec usage, and the *-ns codecs are in multiformats/multicodec.

I suppose byte length matters regarding gas cost.

@ghost
Copy link

ghost commented Mar 15, 2019

It's still unfortunate that EIP-1577 has multiaddr in its title...

@ghost
Copy link

ghost commented Mar 15, 2019

Nevermind - I was reading the first version of the draft only. The current version at https://eips.ethereum.org/EIPS/eip-1577 doesn't mention multiaddr at all.

All good!

@pldespaigne
Copy link

For everyone wanting to use EIP1577 there is now a js implementation here : content-hash

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants