Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Libp2p Standardization Update #1281

Closed
wants to merge 8 commits into from
Closed

Conversation

AgeManning
Copy link
Contributor

Overview

Currently there are a few suggestions about how Ethereum 2 clients should implement the RPC in the context of libp2p as well as various encoding/decoding and formats. Specifically, #1158 proposes segregating protocol Ids for each RPC request.

Furthermore, the current RPC specification has implementation issues (some of which are discussed in #1158) and does not primarily focus on libp2p implementations.

This PR consolidates #1158, #1100 and the RPC specification into a libp2p-specific context into libp2p_standardization.md. This document is aimed for client implementations to discuss and come to consensus on the final libp2p framework used in all implementations.

I'm also happy to help maintain this iterative specification.

@zah
Copy link

zah commented Jul 10, 2019

Thank you for continuing the work on this, Age.

One small nitpick: I see the use of varint length prefixes as something specific to the negotiated encoding. The SSZ encoding requires it, but other serialization schemes may have a built-in record length information. I would create a dedicated section within the document detailing the SSZ encoding with examples of exchanged messages.

Copy link

@mhchia mhchia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eth-2 RPC looks really solid and complete. Thanks for the work. I left some questions. 🙏

specs/networking/libp2p-standardization.md Show resolved Hide resolved
specs/networking/libp2p-standardization.md Outdated Show resolved Hide resolved
specs/networking/libp2p-standardization.md Outdated Show resolved Hide resolved
@AgeManning
Copy link
Contributor Author

AgeManning commented Jul 10, 2019

@zah - Thanks!
I agree. I'll add a dedicated example for SSZ encoding.

I've extracted the varint from the encoding as I envisaged it as part of the libp2p framework. By this, I mean that a receiver always reads the varint, regardless of encoding, then proceeds to read the exact amount of bytes specified by the length. Once the bytes are read, the stream is closed and the arbitrary encoding is used to decode the received content. In retrospect, I think this train of thought is primarily based on how my implementation works, and if an encoding already defines a length-prefix, it may be duplicated in this scenario.

In the case of SSZ, SSZ-encoded objects do not have an overall length-prefix. Given the initial part of an SSZ-encoded object, a receiver does not have a way of telling how many more bytes will be sent by the sender. If we know the schema, and the object only has fixed-length fields, we could know the length of the payload, but this is in general not true. By length-prefixing the payload, a receiver can always know how many bytes need to read, and can reject the stream if the length is too large.

To be clear, is your suggestion, that we ensure all encodings apply some length prefix (potentially in various forms), and for each encoding implement in the libp2p logic to read the encoding length and to determine how many bytes to read?

@zah
Copy link

zah commented Jul 10, 2019

The document doesn't need to use too abstract terms at the moment, but the encoding is generally responsible for implementing a framing scheme that divides the stream into individual messages.

With SSZ in particular, we can say that the records are wrapped in a SSZ envelope which consist of adding a varint length prefix to the serialized bytes. I guess we need some feedback from the community in order to decide whether we want anything fancier such as including the length prefix only for variable-sized records. In our own implementation, it would be relatively easy to implement this.

1.3](https://github.com/libp2p/specs/blob/master/tls/tls.md). It is our
intention to transition to use TLS 1.3 for encryption between nodes, rather
than Secio.*
1.3](https://github.com/libp2p/specs/blob/master/tls/tls.md) or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is fine for now, but I think we should pick TLS1.3 or Noise asap.
My understanding is that TLS1.3 language support will likely block us from adopting in the near future and should investigate Noise

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep agreed. Perhaps we raise this in the call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This didn't come up in the call. But indeed, TLS 1.3 language support is not universal for now. For example, nodejs lacks such support. Protocol Labs is funding @jasnell's work on adding QUIC support to nodejs (https://github.com/nodejs/quic) + TLS 1.3, because it's a prerequisite.

The libp2p team is very favourable to Noise. I believe rust-libp2p introduced early experimental support (but hasn't submitted a spec to libp2p/specs yet), and we've toyed with the idea of adding support for go-libp2p. Here's a spec: libp2p/go-libp2p#631

I think we should prioritise these efforts. I want to support the IK, IX and XX handshakes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An issue with Noise is the compatibility with QUIC. There's nQUIC, and I've asked @marten-seemann (our QUIC expert) to look into it this quarter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering QUIC is also not yet widely supported, I'm okay with Noise without QUIC for now with the intention to move to TLS 1.3 + QUIC as support becomes available.

I'd say getting a Noise spec written in libp2p/specs is a top priority at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this and will update the wording to reflect this above, unless there are any objections?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Note that clients supporting QUIC will advertise so in their multiaddrs, so other clients/languages supporting that combo should immediately benefit from that enhanced transport. Clients/languages not supporting QUIC will simply not dial those addresses, and will instead fallback to TCP automatically.

specs/networking/libp2p-standardization.md Outdated Show resolved Hide resolved
specs/networking/libp2p-standardization.md Outdated Show resolved Hide resolved
specs/networking/libp2p-standardization.md Outdated Show resolved Hide resolved
AgeManning and others added 3 commits July 11, 2019 09:29
Co-Authored-By: Danny Ryan <dannyjryan@gmail.com>
Co-Authored-By: Danny Ryan <dannyjryan@gmail.com>
Co-Authored-By: Danny Ryan <dannyjryan@gmail.com>
@AgeManning
Copy link
Contributor Author

@zah - Yep agree with everything you're saying.

My suggestion was to keep the framing static across all encodings. The reason for this, is that in the RPC implementation we are opening a stream sending a request and receiving a response then closing the stream. I figured a standard single frame which allows for all encodings to provide a max-size to be sent per stream would be applicable.

I'm happy to change this also and leave it up to the encoding and ensure all encodings allow us to size a single-framed response.

@AgeManning
Copy link
Contributor Author

@zah - Agree with you and have updated to shift the length-prefix in the encoding.
I've added a small example with SSZ, let me know if you think it needs further details.

Things still to consider:

  • RPC_MAX_SIZE - what should be the maximum size of RPC payloads a client should accept
  • Gossipsub encoding - Gossipsub sends data via protobuf - but the data within this is up to us. Should we add an <Encoding> string to the gossipsub protocol id or specify a specific compression algorithm. I expect we want to compress the bytes rather than just send ssz-encoded objects over gossip

* **ProtocolPrefix** -- the RPC messages are grouped into families identified
by a shared LibP2P protocol name prefix. A conforming implementation is
expected to support either all messages appearing within a family or none of
them. In this case, we use `/eth/serenity/rpc`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no strong objections to using this prefix, but I'll share my reasoning for why I used /ETH/BeaconChain/ in my own proposal.

The current RPC procs describe everything necessary to sync and propagate changes to the beacon chain. As soon as we add shards to the mix, we'll need new RPC procs for syncing the state of shards. Custom execution environments may also need their own specialized protocols for communicating with the relayers in the network.

"Serenity" is a term that encompasses all of these future developments and the wording in the spec suggests that a node should support either all of the protocols or none of them, but I see some potential for more specialized software that syncs only with the beacon chain for example.

Thus, I tried to come up with the most unambiguous prefix name that is unlikely to clash with anything else in the future. "/ETH/BeaconChain/" seemed to fit that bill. The choice of PascalCase was another silly concern trying to win 1 byte in the multiselect handshakes :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also not strongly opinionated on this, but would opt for whatever we go for, to be somewhat consistent with all protocols.
Hopefully there will be more input by someone and I'm happy to go with whatever is the consensus.

Copy link
Contributor

@djrtwo djrtwo Jul 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are to go with /eth/beacon_chain, that would imply what other protocols...

  • /eth/shard -- specific requests to shard chains, likely most calls would have 1st param as shard_id
  • /eth/network -- akin to net_ methods in eth1 rpc

Any others?

I think this makes sense, imo.

Another question is should the initial component of the prefix be /eth2 -- potentially allows for better segregation if components of eth1 end up using the libp2p for any part of the stack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any others?

I guess everything phase 2 related -- propagating transactions, syncing state, requesting light client proofs, etc., and everything replicated for the different execution environments.

Another question is should the initial component of the prefix be /eth2 -- potentially allows for better segregation if components of eth1 end up using the libp2p for any part of the stack

Originally, I preferred eth2, mostly because everything else is called eth2 as well. But one could think of cases in which eth1 and eth2 clients would want to speak with each other and then having a different prefix might be awkward (thinking about this idea of storing raw block data in Swarm or another sort of persistence layer). But even then I guess we could have an entirely different prefix for those cases?

Copy link
Contributor Author

@AgeManning AgeManning Jul 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make this discussion more concrete, there is only really two protocol Id's we currently need to specify. Gossipsub and the RPC. The topics within gossipsub can be set independently and without a prefix. I hadn't imagined we would segregate peers participating in the RPC but only on shard chains via protocol id's, rather via the HELLO message. However it is possible to also segregate via the protocol id.
Is the consensus:
gossipsub: /eth2/beacon_chain/gossipsub/1.0.0
rpc: /eth2/beacon_chain/rpc/<version>/<encoding>
?

Copy link
Contributor

@raulk raulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few initial comments.

specs/networking/libp2p-standardization.md Outdated Show resolved Hide resolved
protocol.
* **Protocol Id** - A byte string used in the libp2p framework to negotiate
substreams for specific protocols.
* ** Close a (sub)stream** - Close the local end of a stream. I.e `stream.close()`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are currently revising the behaviour of stream-wise Close.

See libp2p/go-libp2p-core#10, and this Google Sheet for our thinking/research: https://docs.google.com/spreadsheets/d/1O7IgyiMiZo1kWUsNVzTpj-551clCc_mrAjUDacE6a88.

Gist: libp2p streams are full-duplex and we'll be introducing methods for full-duplex close and half-close on read and write sides.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising this. I'm not familiar with the intricacies of these. To my knowledge, rust-libp2p (which I'm most familiar with) only allows for closing the local end of streams. I'm not sure if this is general across all implementations.
Perhaps in this specification, we don't distinguish between the two? What are your thoughts?

@djrtwo
Copy link
Contributor

djrtwo commented Jul 16, 2019

Gossipsub encoding - Gossipsub sends data via protobuf - but the data within this is up to us. Should we add an string to the gossipsub protocol id or specify a specific compression algorithm. I expect we want to compress the bytes rather than just send ssz-encoded objects over gossip

We should certainly compress in gossipsub.

Options:

  • specify encoding via the gossip sub protocol id. gives us flexibility to add/change
  • specify it across the board singularly for gossipsub without using the protocol id

I'm not certain we should support a non-compressed encoding here, but specifying the compressed encoding via the protocol ID seems to be in the spirit of libp2p and gives us flexiblity in upgrading down the line if needed

specs/networking/libp2p-standardization.md Outdated Show resolved Hide resolved
@@ -135,24 +161,369 @@ number of shard subnets is defined via `SHARD_SUBNET_COUNT` and the shard
Each Gossipsub
[Message](https://github.com/libp2p/go-libp2p-pubsub/blob/master/pb/rpc.proto#L17-L24)
has a maximum size of 512KB (estimated from expected largest uncompressed block
size).
size). Clients SHOULD reject messages that are over this size limit.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add and MUST NOT send

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep I'll add this. I originally didn't because I was worried that with and MUST NOT send implementers may then assume no messages over the size limit will get sent and then not implement protective measures in the case that some do.

need to establish streams through `multistream-select`, rather, act
as a standalone implementation that feeds discovered peers/topics (ENR-records) as
`multiaddrs` into the libp2p service.
`multiaddrs` into the libp2p service. The libp2p service subsequently forms
connections and streams with discovered peers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not now, but at some point we have to specify what discovery topics we have and what special fields we require in the ENR (if any).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good point! I'll add in the bare necessity for discovery, and leave it open for added fields.

English letters, digits and underscores (_).
* **SchemaVersion** -- a semantic version consisting of one or more numbers
separated by dots (.). Each schema is versioned to facilitate backward and
forward-compatibility when possible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the point of putting the version behind the message name is to have different versions for individual RPC request/response pairs, instead of just one for the whole protocol. I'm not sure if that's a good idea for multiple reasons:

  • It's unnecessary granularity: If we want, we can just increment the whole protocol's version, even if we just update a single message.
  • It doesn't go well with "A conforming implementation is expected to support either all messages appearing within a family or none of them.", because it's possible that two implementations both fully support the family, but incompatible versions for some/all messages.
  • I suspect that often the versions will not be independent. For instance, if we add a new way to sync, multiple message types will be updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree entirely. This is an important discussion. We get some benefits when pushing things into multi-stream select, such as not requiring a method id in the RPC anymore. There is some discussion on the versioning here: #1158
I'm interested to know your thoughts on the above discussion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unnecessary granularity: If we want, we can just increment the whole protocol's version, even if we just update a single message.

The granularity helps manage upgrades such that when specialized clients use only a part of the protocol, they need not be upgraded in some scenarios. I think this is a win, specially with the broadcast parts.

I suspect that often the versions will not be independent.

any such dependencies can be specified in the message that requires them.

I'm concerned about the use of semver here - it is nice for humans selecting a collection of versions to test together before using in application, but automating message selection in-protocol based on it seems like a can of worms in terms of subtle issues. This ties in with using SSZ for encoding - SSZ has no concept of optional fields really - no way of introducing purely additive changes - meaning that every time a request or response is changed, we effectively need to create a new incompatible protocol version.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no way of introducing purely additive changes

There is one type of upgrade that is supported by SSZ - it's adding a field at the end of a struct.

Otherwise, I agree that using a single number as a version is reasonable enough for both humans and machines.

* **ProtocolPrefix** -- the RPC messages are grouped into families identified
by a shared LibP2P protocol name prefix. A conforming implementation is
expected to support either all messages appearing within a family or none of
them. In this case, we use `/eth/serenity/rpc`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any others?

I guess everything phase 2 related -- propagating transactions, syncing state, requesting light client proofs, etc., and everything replicated for the different execution environments.

Another question is should the initial component of the prefix be /eth2 -- potentially allows for better segregation if components of eth1 end up using the libp2p for any part of the stack

Originally, I preferred eth2, mostly because everything else is called eth2 as well. But one could think of cases in which eth1 and eth2 clients would want to speak with each other and then having a different prefix might be awkward (thinking about this idea of storing raw block data in Swarm or another sort of persistence layer). But even then I guess we could have an entirely different prefix for those cases?

specs/networking/libp2p-standardization.md Outdated Show resolved Hide resolved
@AgeManning
Copy link
Contributor Author

A gossipsub encoding tag on the end of the protocol-id does seem in the spirit of libp2p.
The implementation could be tricky when supporting multiple encodings, depending on logic.
If Node1 supports encoding A, Node 2 supports A, B and node 3 supports B. I think there will be two subnets for A, and B and then should Node 2 route packets from the B subnet to A?
Whereas if we specify the encoding as some prefix in the data, all nodes would be on the same subnet but the message propagation properties of gossipsub would be modified depending on the percentage of nodes supporting the same protocol (as nodes that can't decode a block won't propagate it).
@raulk - Do you have thoughts on the best way to handle multiple encodings/compression for gossipsub?
I guess we just keep all nodes segregated on their own subnets based on the encodings they support and therefore a protocol id including encoding/compression makes sense?

@zah
Copy link

zah commented Jul 17, 2019

Initially, I though that it would be a problem to specify the encoding in the Gossipsub topic name, but now I think it's reasonable. Each client will broadcast on a single topic using the preferred (e.g. latest) encoding. The client will also subscribe to all other topics/encodings that it can support.

Thus, when we upgrade the network, newer clients will see the traffic from older clients, but the older clients will quickly become unusable. This is only slightly different from having a single pre-determined topic where the encoding is specified in the message content. The pros and cons go like this:

  • With multiple topics, the clients will always get only messages relevant to them.
  • With a single topic, the client will know that it's out of date and it can notify the client operator somehow. This seems like a minor benefit, because there are other possible heuristics to detect this and the operator is not that likely to miss such major news such as an upcoming network upgrade.

@raulk
Copy link
Contributor

raulk commented Jul 17, 2019

@AgeManning

The reason for this, is that in the RPC implementation we are opening a stream sending a request and receiving a response then closing the stream.

Note that this is currently a bit costly in libp2p due to multistream-select 1.0 negotiation being chatty and stateless. multiselect 2.0 will introduce 0-RTT negotiation by way of shared protocol tables and session resumption. So you'll be able to open streams with zero negotiation cost, as long as that protocol has been negotiated in that session/in the recent past with that peer.

I acknowledge that the Multiselect 2.0 conversation in libp2p/specs#95 (comment) is arguably hard to follow, so @marten-seemann and I are putting together a document summarising the current thinking and technical design.

@raulk
Copy link
Contributor

raulk commented Jul 17, 2019

@AgeManning

@raulk - Do you have thoughts on the best way to handle multiple encodings/compression for gossipsub?

Let me think about this and discuss this with a bunch of people. Will get back to you soon.

@raulk
Copy link
Contributor

raulk commented Jul 17, 2019

@AgeManning

@raulk - Do you have thoughts on the best way to handle multiple encodings/compression for gossipsub?

A priori, the suggested approach of embedding the encoding and compression in the topic name appears reasonable. It enables selective consumption based on one's own capabilities, and predisposes to efficient control message piggybacking. A few questions:

  • Are you intending for nodes in the network to "translate" between serialisation formats and compression algorithms?
  • How do you guarantee that a piece of data emitted by peer P in a specific encoding/compression combination can be effectively disseminated and understood by the entire network, despite the heterogeneity?
  • Or is the setup such that nodes can safely ignore messages they don't understand, yet the network can come to consensus?

Suggestion: may be worthwhile to develop a small-scale PoC (e.g. 100 nodes) and capture metrics on write amplification, memory usage, network bandwidth, and control message piggybacking efficiency. The libp2p core team can help with that.

substreams.


#### Requests and Responses
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I get it right, it is suggested that each request-response would need to open a new dedicated stream.
Wouldn't this cause a significant overhead in terms of latency, traffic and client computational resources?
Opening a stream requires additional negotiation roundtrip when negotiating a protocol and may required more with multiple protocol versions.
I'm also not sure how 'lightweight' is stream creation in different libp2p implementations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in terms of latency; little in terms of traffic and computational resources. I pointed this out here: #1281 (comment).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was also briefly discussed in #1158. As @raulk mentions there will be overheads in multistream select 1 and potential versioning issues/features where each RPC has it's own version/protocol-id.
We opted for a forward-looking design, where we assume that multistream-select2 is adopted and the overhead is minimized. This doesn't mitigate a potential versioning issue, but we gain some benefits, such as not sending the method_id and encodings in each RPC request. This is of course still up for debate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raulk Will multistream-2 be able to handle nested protocols in one round like /secio/1.0.0/mplex/2.6.0 ?

@AgeManning
Copy link
Contributor Author

AgeManning commented Jul 17, 2019

@zah - I was originally considering an encoding in the protocol-id, but I agree it seems to make more sense having multiple topics segregating the encodings and a single protocol id.

@raulk

Note that this is currently a bit costly in libp2p due to multistream-select 1.0 negotiation being chatty and stateless. multiselect 2.0 will introduce 0-RTT negotiation by way of shared protocol tables and session resumption.

Yep, this is understood. Do you have a rough ETA for multiselect 2.0. A month? End of the year? Two years?

Are you intending for nodes in the network to "translate" between serialisation formats and compression algorithms?

This is of course a possibility, but in my opinion I don't think we should add this complexity. As @zah mentions, if the network upgrades to a new format, perhaps we have a mechanism that newer nodes can notify other nodes that their format is out of date. We may introduce nodes that do translation on older topics (so older clients can still receive blocks/attestations, albeit slower), but I don't think we should include it in this document.

How do you guarantee that a piece of data emitted by peer P in a specific encoding/compression combination can be effectively disseminated and understood by the entire network, despite the heterogeneity?

I guess we don't. In such an upgrade perhaps we stress that clients should all be updated. In reality, it is probably reasonable to have a few clients do the translation from topic/encoding to another. As long as one client does this translation, the data should propagate on both topics. We would have to test how quickly this works for consensus however. Perhaps others have more thoughts on this, @djrtwo?

Or is the setup such that nodes can safely ignore messages they don't understand, yet the network can come to consensus?

We are using gossipsub to propagate new blocks/attestations. It would not be good to lose these, consensus should still work depending on the % of validators split across topics. As above, technically this is solveable using translating nodes, whether we do this or require all clients to update is a separate question imo. But I don't see the design choice of using topics to separate encodings to be a prohibitive design choice when updating encodings.

Suggestion: may be worthwhile to develop a small-scale PoC (e.g. 100 nodes) and capture metrics on write amplification, memory usage, network bandwidth, and control message piggybacking efficiency. The libp2p core team can help with that.

Absolutely. This is something I'm interested in, and we (lighthouse) will be doing version's of this with our client, but more granular tests for specific libp2p protocols should also be done (we would like to help when/where we can, just may not have the time/manpower). Perhaps other parties are interested in this also?

@arnetheduck
Copy link
Contributor

We should certainly compress in gossipsub.

has anyone looked into the gains here, and algorithms? hashes compress poorly, not sure about the rest.

multiple topics segregating the encodings

+1 for multiple topics I think :) during transition periods it's also easy for clients to publish to multiple topics and keeping it separate at this level should allow the topic listening mechanisms to adapt to new-vs-old client ratios so there's a natural ramp-up/down as people upgrade. One thing necessary for this to work well: that topic subscription is used as a propagation filter meaning that not everyone receives all data on all topics - is this the case with libp2p? Also in this scenario, how is peer connectivity handled? If I'm not listening to a topic, can I publish to it?

ping @raulk

@arnetheduck
Copy link
Contributor

We opted for a forward-looking design, where we assume that multistream-select2 is adopted and the overhead is minimized.

as a thought experiment, let's say we opt for an optimal solution given the capabilities of today and use a single stream to not burden multiselect 1. What and how difficult would the steps be to change the protocol to use multiple streams later?

@AgeManning
Copy link
Contributor Author

AgeManning commented Jul 18, 2019

that topic subscription is used as a propagation filter meaning that not everyone receives all data on all topics - is this the case with libp2p?

Yes. Gossipsub/floodsub will only publish/propagate to nodes subscribed to the topic.

Also in this scenario, how is peer connectivity handled? If I'm not listening to a topic, can I publish to it?

Yes. This is the fanout peers in gossipsub. Essentially a node still keeps a list of other peer's topics even if it is not subscribed them. This way, the node has a set of peers to publish messages to on topics it's not subscribed to.

as a thought experiment, let's say we opt for an optimal solution given the capabilities of today and use a single stream to not burden multiselect 1. What and how difficult would the steps be to change the protocol to use multiple streams later?

I think this may be implementation specific. I understand this is not too difficult for nimbus to change to. It's not too difficult for us (Lighthouse) to support either. In a single stream scenario we add back complications such as tracking of method_ids and request/response id's (unless we assume ordering).
I'm not opposed to this btw.

@zah
Copy link

zah commented Jul 18, 2019

as a thought experiment, let's say we opt for an optimal solution given the capabilities of today and use a single stream to not burden multiselect 1. What and how difficult would the steps be to change the protocol to use multiple streams later?

I think this may be implementation specific. I understand this is not too difficult for nimbus to change to. It's not too difficult for us (Lighthouse) to support either. In a single stream scenario we add back complications such as tracking of method_ids and request/response id's (unless we assume ordering).
I'm not opposed to this btw.

From my experience, using a single stream requires quite a lot of additional code for tracking the requests that are in flight. In Nimbus, our codebase is structured in a way such that the low-level details of the protocol are abstracted away and there is shared high-level code that holds only the application logic needed for responding to a particular request. Here is an example:

https://github.com/status-im/nim-beacon-chain/blob/0a2d09e7fe3ff556201b569f19c11dd1bcf337df/beacon_chain/sync_protocol.nim#L270-L282

I implemented a single-stream back-end for this code only because the older spec required it. The multi-stream back-end is quite a bit simpler.

@arnetheduck
Copy link
Contributor

Leaving aside for a moment the code implications, how would the upgrade look from a network perspective of running clients? How.. difficult.. would a negotiated upgrade be?

From a code perspective, I read that as going from ms1 to ms2 is a simplification, thus trivial.

@zah
Copy link

zah commented Jul 18, 2019

Well, it will be trivial only if you have planned for it. I shared our implementation approach as a tip.
Otherwise, migrating to a new protocol is not hard. The clients will just negotiate and select different stream types in libp2p.

@djrtwo
Copy link
Contributor

djrtwo commented Jul 18, 2019

We would have to test how quickly this works for consensus however. Perhaps others have more thoughts on this, @djrtwo?

I don't expect to have much mismatch on gossiping the high throughput items on validating nodes. Coordinating network upgrade (even if outside of hardfork) is certainly likely. To force everyone's hand, can just couple network upgrades at same time as hard fork so there isn't much option to not upgrade the software.

@djrtwo
Copy link
Contributor

djrtwo commented Jul 18, 2019

Should be noted that the additional latency in negotiating streams in ms1 for RPC calls will likely not hinder the protocol too much, especially in early deployments. The thing we would be most concerned about is latency wrt pubsub gossiping algs.

string is:
`beacon_block/ssz` and the `data` field of a Gossipsub `Message` is an
ssz-encoded `BeaconBlock`.
- `ssz_snappy` - All objects are ssz-encoded and then compressed with `snappy`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more efficient to compress the stream before multiplexer (i.e. compress the whole transport stream)? May be libp2p has a suitable compressor out-of-the-box ?
@raulk ?

@@ -71,14 +98,14 @@ optionally [yamux](https://github.com/hashicorp/yamux/blob/master/spec.md)

## Gossipsub

#### Protocol id: `/eth/serenity/gossipsub/1.0.0`
#### Protocol id: `/eth2/beacon_chain/gossipsub/1.0.0/`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This protocol uses libp2p gossipsub as a pubsub provider. How would a handler for this protocol look like? Would it use gossipsub handler as a delegate?

We might want to explicitly mention gossipsub in the spec, e.g. /meshsub/1.0.0. For the sake of conformance with rpc I'd suggest to specify beacon chain pubsub in a following way:

  • /eth2/beacon_chain/pubsub/<SchemaVersion>/<Encoding>
  • /meshsub/1.0.0 is used as underlying protocol

Schema version would handle changes in message format if e.g. an update in aggregation algorithm leads to message patching. If encoding becomes a part of protocol id then topic id could get rid of it which makes forward compatibility less complicated.

There is a question here. What if we want to upgrade to /meshsub/2.0.0? What do we do then? But I guess the same question could be addressed to /mplex and the others.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current train of thought for putting the encoding inside the topics is so that peers that support different encodings / schemas still connect on the protocol. Peers then in principle could subscribe to two different topics/encodings and act as a relay without having to support two protocols.

In the event of an upgrade, some nodes could identify other nodes by subscribing to older topics and informing them an upgrade is required.

In regards to the protocol id. The consensus has been to change the protocol ids to /eth/.. to avoid accidental connection with ipfs and other non eth-specific nodes.
If we want to upgrade to a new version of meshsub, we would have to support the new protocol-id. I imagine we would rename the protocol-id to /eth2/beacon_chain/gossipsub/2.0.0

In regards to forwards compatibility, I believe the idea would be to coordinate a hard fork such that everyone moves to a new network, either by incrementing the protocol-id version, or in the change of an encoding, a topic prefix. Currently, there are plans to support two encodings. An empty postfix, which has no compression (for interop) then a snappy encoding, likely for mainnet which will be realised via two topics, i.e beacon_block_snappy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Peers then in principle could subscribe to two different topics/encodings and act as a relay without having to support two protocols.

IMO, acting as a relay makes things more complicated than supporting two protocols. A relay will have to match different topic ids to a single logical topic and support two encodings. While supporting two protocols only means supporting two encodings. Please, let me know if I am missing something.

A number of topic ids matching to one topic turns into a number of net graphs linked with each other via relays. It requires to have reasonably high number of relays in order to keep not yet upgraded peers subscribed to that topic.

@mkalinin
Copy link
Contributor

Forward compatibility principle described in EIP-8 is a very nice feature. It preserves network liveness during various updates by keeping ability of old peers to speak with upgraded ones.

In order to be forward compatible devp2p has a negotiation process that includes protocol version exchange. Peers are sending highest supported version to each other and starts communication with minimal version supported by both ends.

We could leverage Discovery v5 ENRs to get a list of capabilities supported by remote peer before establishing a connection with it. To get to that in our case nodes will need to advertise a couple of lists:

  • supported schema versions
  • supported encodings

Using this data node could build protocol id before it starts to communicate with remote peer. Node picks schema version and encoding according to its own preferences and remote peer capabilities.

For example, if encoding would be updated to e.g. xml we would have following scenario:

  • upgraded nodes start sending new ENRs with eth2_codecs: ssz, xml and xml takes preference over ssz
  • if remote peer supports xml then to establish connection upgraded node builds protocol id with new encoding /eth2/beacon_chain/rpc/1.0.0/hello/xml
  • otherwise, it uses old one /eth2/beacon_chain/rpc/1.0.0/hello/ssz

Schema version could be updated in the same fashion.

This approach has a drawback. There will be time gap between restarting a node and spreading updated ENRs across the network. In particular it means that two nodes will have to be restarted twice in order to start communicate with updated capabilities.
Possible workarounds to that:

  • exchange capabilities during negotiation process (Eth1 approach)
  • send updated ENR before establishing connection with remote peer

What others think about forward compatibility and this particular approach?


Shards are grouped into their own subnets (defined by a shard topic). The
number of shard subnets is defined via `SHARD_SUBNET_COUNT` and the shard
`shard_number % SHARD_SUBNET_COUNT` is assigned to the topic:
`shard{shard_number % SHARD_SUBNET_COUNT}_attestation`.
`shard{shard_number % SHARD_SUBNET_COUNT}_beacon_attestation`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will node advertise a shard topic that it is assigned to? A straightforward solution would be to use ENR records for this purpose. If this is what we want then maybe it worth listing under ENR records of Discovery section?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be missing something. If a validator is assigned to attest to a shard, it should subscribe and publish on that shard topic.

If in phase 1 a node wants to communicate blocks and messages dedicated to that shard, I imagine we will have a specific shard topic for that. This scenario is not included in this phase 0 spec however.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From gossipsub spec:

Also note that as part of the pubsub api, the peer emits SUBSCRIBE and UNSUBSCRIBE control messages to all its peers whenever it joins or leaves a topic. This is provided by the the ambient peer discovery mechanism and nominally not part of the router. A standalone implementation would have to implement those control messages.

We need a mechanism that spreads node's subscription updates all over the network. From my understanding Discovery v5 was designed bearing capability updates in mind.

This scenario is not included in this phase 0 spec however.

Depending on aggregation strategy and number of peers in the network subnets can become a unit of aggregation process in phase 0.

@AgeManning
Copy link
Contributor Author

Using this data node could build protocol id before it starts to communicate with remote peer. Node picks schema version and encoding according to its own preferences and remote peer capabilities.

This is what multistream-select does for us. Given a set of encodings/versions of an RPC message, we can locally order these in preference when multistream negotiates which protocol to use.
For upgrading, we would preference newer protocols over older protocols.

@AgeManning
Copy link
Contributor Author

I'm currently working with Raul and others to incorporate feedback given here and will be updating this in the coming days.

@djrtwo
Copy link
Contributor

djrtwo commented Aug 1, 2019

Closing in favor of #1328

@djrtwo djrtwo closed this Aug 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.