-
Notifications
You must be signed in to change notification settings - Fork 998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial libp2p standardization #935
Conversation
Should we standardise on peer ids? |
At prysmatic labs, we are using multiple pubsub topics where each topic name is a 1:1 mapping to a message type. An example would be that attestation announcements come across the wire as Has there been any analysis of the trade-offs for this? Subscribing to only the topics you care about may reduce your network traffic while adding some potential complexity |
Having 1:1 topic mappings also reduces the need for a type nipple. The message is either the expected type or not. |
@JustinDrake - PeerId's seem to be specified here: https://github.com/ethereum/eth2.0-specs/blob/dev/specs/networking/node-identification.md#peer-id-generation |
@prestonvanloon - Agree that potentially we want a topic for each message type (which removes the need for a nibble). In gossipsub for each topic, we have to maintain a set of mesh peers. So there may be some overhead in managing two sets of overlay networks for the blocks/attestations. I guess this is the point of this PR to agree on how we should proceed. I believe the WhiteBlock guys will be doing some testing for gossipsub in the near future and we will be able to have a more informed opinion on which way to proceed. In regards to having a separate protocol id for requests/responses in the RPC protocol seems like a useful separation as we will not need to encode the different types. However it seems to go against current libp2p protocol designs which typically have a singular protocol id per protocol. I'd opt to go whichever is the most performant. |
Would it make sense to push for BLS12-381 keys in the longer term, to unify the consensus and networking layers? |
@AgeManning Agreed on the RPC part, it seems sub-optimal to open multiple streams between two peers. cc: @zscole is this on your radar? |
@JustinDrake - Makes sense for consistency. The caveats might be that BLS will likely take longer to verify (compared to secp256k1) and there is talk of generating new key pairs for every message sent across the pubsub network (to anonymize validators) (I believe BLS generation is comparable to secp256k1 keys so this shouldn't be an issue). |
I don't see why multiple streams would be necessary for anything other than FTP. I don't think it would present any inherent optimization, but this is something we can test for to validate if really necessary. |
|
||
## Identify | ||
|
||
#### Protocol Id: `/ipfs/id/1.0.0` (to be updated to `/p2p/id/1.0.0`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is /p2p/
not yet supported? and why not use /eth/
here? Is this something global beyond the eth protocol?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep so, the protocols that come with libp2p have their own protocol IDs. Currently (to my knowledge) these are hard coded and not customisable. I'd have to make a pr to libp2p or rewrite the protocol to change the naming. (For rust-libp2p).
I assume the same for go in order for the protocols to be interoperable.
I'm curious if we want to make the protocol IDs customisable within the libp2p protocols.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to customise all of these protocol IDs. Doing so will segregate your network from the IPFS or libp2p public networks at the protocol level, such that even if peers happen to cross-connect, they will share no protocols in common, and therefore, won't be able to interact via streams.
Ideally you'll override the protocol ID as low as in multistream-select
(to be renamed to multiselect
). This is used during connection bootstrapping, and if these IDs don't match, peers will disconnect immediately. This is the logically strictest segregation.
Looks like I missed this - I'm going to close my PR (#975) in favor of this one. @AgeManning would you mind if I committed changes to this one in order to merge our efforts? |
@mslipper - Of course. I made this mainly to start the discussion. All edits/commits welcome. :) |
#### Protocol Id: `/ipfs/id/1.0.0` (to be updated to `/p2p/id/1.0.0`) | ||
|
||
The Identify protocol (defined in go - [identify-go](https://github.com/ipfs/go-ipfs/blob/master/core/commands/id.go) and rust [rust-identify](https://github.com/libp2p/rust-libp2p/blob/master/protocols/identify/src/lib.rs)) | ||
allows a node A to query another node B which information B knows about A. This also includes the addresses B is listening on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the relationship to discv5
? Is this supposed to be a temporary substitute, a permanent complement, or a replacement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, temporary placeholder. This is currently what lighthouse is using in place of a proper discovery protocol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add an explicit note that this is a placeholder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm working on discv5 at the moment. Currently, I think this protocol may still be useful for NAT traversal and can be fed into the discv5 protocol. Potentially we don't need it at all. I'll add a placeholder note for now.
|
||
## Discovery | ||
|
||
#### Protocol Id: `/eth/serenity/disc/1.0.0` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the current plan for discv5 is to use an independent transport.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agree. This entire document will be updated with discv5 soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought there was a plan to implement discv5 as a libp2p protocol?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm currently working on this. My current plan is to mimic mDNS in the integration into libp2p. In this case, there will be no protocol id, but will allow implementations to use the discv5 discovery gadget
which listens separately on a UDP port. Messages from this gadget will be available to the user to pass into other libp2p protocols. i.e we discover a node, this discovery can be used to dial a tcp connection or add to a kademlia DHT in libp2p.
Once, I've built an implementation, I will update this document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice to have at least the ability to run discv5 over other transports - what would that look like, in this setup?
### Topics | ||
|
||
*The Go and Js implementations use string topics - This is likely to be | ||
updated to topic hashes in later versions - https://github.com/libp2p/rust-libp2p/issues/473* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked about the topic hash in the issue. It seems not finalized yet, but shouldn't be a big issue.
Co-Authored-By: AgeManning <Age@AgeManning.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for undertaking this, @AgeManning! I believe this document creates a lot of clarity for the community.
From my end, I'll publish a short primer on connection bootstrapping and protocol negotiation in https://discuss.libp2p.io, as I believe this is the area that has created the most confusion so far.
|
||
Libp2p raw gossipsub messages are sent across the wire as fixed-size length-prefixed byte arrays. | ||
|
||
The byte array is prefixed with an unsigned 64 bit length number encoded as an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
64-bit varint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. This is based off the current floodsub implementation in rust-libp2p (figure all implementations should be standardized). Data is sent via this function:
https://github.com/libp2p/rust-libp2p/blob/master/core/src/upgrade/transfer.rs#L32-L41
Notice the the len of the data is using the unsigned_varint
package with a u64_buffer():
https://github.com/libp2p/rust-libp2p/blob/master/core/src/upgrade/transfer.rs#L45 and
https://github.com/libp2p/rust-libp2p/blob/master/core/src/upgrade/transfer.rs#L46
|
||
#### Eth2.0 Specifics | ||
|
||
Each message has a maximum size of 512KB (estimated from expected largest uncompressed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth specifying that this is Eth2.0's choice, as gossipsub does not impose a max message length, to my knowledge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Speaking of the maximum size of a message, even though there isn't a specified constant, I played with the max size of DelimitedReader
, and now suspect it(1 MB?) is the upper bound of the message size in go implementation, IIUC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Floodsub in rust-libp2p has a maximum size of 2048 bytes:
https://github.com/libp2p/rust-libp2p/blob/master/protocols/floodsub/src/protocol.rs#L60
I've made this configurable in the gossipsub implementation.
The max size exists so that if we see a large amount of data coming through a stream we only read a maximum amount. I understand this to prevent potential DOS vectors from malicious user's sending arbitrary large streams of data.
|
||
## Identify | ||
|
||
#### Protocol Id: `/ipfs/id/1.0.0` (to be updated to `/p2p/id/1.0.0`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to customise all of these protocol IDs. Doing so will segregate your network from the IPFS or libp2p public networks at the protocol level, such that even if peers happen to cross-connect, they will share no protocols in common, and therefore, won't be able to interact via streams.
Ideally you'll override the protocol ID as low as in multistream-select
(to be renamed to multiselect
). This is used during connection bootstrapping, and if these IDs don't match, peers will disconnect immediately. This is the logically strictest segregation.
The protocol has two configurable parameters, which can be used to identify the | ||
type of connecting node. Suggested format: | ||
``` | ||
version: `/eth/serenity/1.0.0` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, to my knowledge, this is currently not the case in rust-libp2p. I'll make a PR to make these configurable.
|
||
## Discovery | ||
|
||
#### Protocol Id: `/eth/serenity/disc/1.0.0` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought there was a plan to implement discv5 as a libp2p protocol?
On the topic (pun intended) of dedicated topics per message type and shard vs. wildcard topics per shard, I'd suggest the former. A few reasons off the top of my head:
|
True. We should make sure there are only two type nipples. ✌️😂🤠 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay @AgeManning, looks good to me. Ready to merge?
@djrtwo - I consider this still a discussion PR and it will be updated in a week or so with discv5. However it might be useful as a guideline to start libp2p co-ordination so happy to merge this WIP PR if you like. I can make future PR's to update it :). |
I've added a small paragraph for Discv5. |
This is really cool. Why should clients optionally support yamux? |
Merging. |
@fjl - I originally made this as there was no discussion on what clients should use. By default, in rust, we can support both simultaneously. I've not done any testing or benchmarking between either protocol, but my feel is that mplex is minimal and yamux is more complex and potentially used more in prod (I have no real knowledge here however). I opted for the minimal by default and left the other libp2p-default multiplexer as optional. This was in part to spark discussion with people who know more about this than me (which didn't happen). |
Ahead of multi-client testnets, I think it will be useful to start a discussion on standardizing the libp2p protocols and their respective message formats.
This is a minimal initial draft which outlines some of the protocols, their id's and suggested formats which could be used for eth2.0 clients.
The protocols and formats listed here are currently in use by lighthouse but are entirely malleable and currently exist as suggestions for client implementations.