This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Optimize backing for latency and liveness #4386

Closed
eskimor opened this issue Nov 26, 2021 · 7 comments
Assignees
Labels
T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes.

Comments

@eskimor
Member

eskimor commented Nov 26, 2021

Without contextual execution, getting a big parachain block backed is a tight squeeze as we only have two seconds before the block producer must have seen all required statements. This can be witnessed by issues like this one.

Those two seconds are spent in the backing process:

  1. Receiving PoV from collator/other validator
  2. Validating it
  3. Propagating statements

Delivering a collation to a validator currently times out after 1 second, which is kind of sensible, considering that we only have 2 seconds in total, although it might be worthwhile to increase that limit a bit, as delivery seems to be a very significant part of the whole process.

Optimizing for Latency

The collator protocol is currently optimized for bandwidth, when in reality the real problem will most likely be latency in a globally distributed network. Ping times can reach hundreds of milliseconds. Once TCP handshakes and TCP slow start are factored in, transferring megabytes of data can easily exceed the timeout, despite nodes having loads of bandwidth. Things we should do to mitigate this:

  1. Have collators pre-connect properly, so we get the TCP handshake out of the hot path.
  2. We should reduce the timeout for starting parallel uploads here. Reasoning: usually latency is the problem, not bandwidth, so a limited number of parallel uploads will improve matters, not make them worse.
  3. We could even make MAX_UNSHARED_UPLOAD_TIME 0, to fully optimize for latency, if we limit the number of parallel uploads.
  4. Reduce backing group sizes, which reduces the total amount of work that needs to be done in the backing phase.

With 3), PoV distribution should hardly be necessary anymore, which I think is good: PoV distribution only starts after a candidate has been validated, which again adds latency, and will hardly make up for the reduced bandwidth demands on the collator. Also, parachains control their collators, so if they are not happy with performance, they can beef them up.

Optimizing for Liveness

In addition to, or instead of, point 4 of the previous section, we should also think about reducing the number of required backing votes. As already discussed a few times, the security of Polkadot comes from approval checking; in backing we really should be more concerned about liveness than security.

Consider a backing group of size 5; right now we would require 3 votes for the backing to succeed. If this backing group is widely distributed around the world, then some of those validators will have a very short round trip time to the current collator, while others might have a very long one. By reducing the number of required votes, we can turn this to our advantage. For example, if we demanded only one vote, backing could succeed as long as a single validator in the backing group is in the same region as the collator, even when pushing limits.

Or in other words, we could make it so that a single good enough validator suffices for the parachain to make progress, whereas right now we require three out of five to be good and nearby.

Considerations of a reduced number of required backing votes

If we optimize for liveness and require only a single backing vote, the stake risked by an attacker is only a third of what would be at risk if we required three votes. I would not consider this a big problem though, as we can easily make up for it by increasing the number of required approval checkers, thereby increasing the risk of getting caught.

On a network with disputes, but without slashing, a single malicious validator can reduce parachain performance way more than a single slow backing group would.

Conclusion

Contextual execution is the absolute top priority right now, but will need some time to be implemented properly; in the meantime we should make sure parachains can work as smoothly as possible. Parachain teams expect load on Polkadot to be higher than on Kusama, so any problems we are seeing on Kusama might become worse on Polkadot.

Further improvements

With high latency, TCP slow start is a big problem. Requiring several round trips to transfer data ruins the effective bandwidth, even if nominal bandwidth is plenty. Until we finally have QUIC support, we might integrate some OS-setting detection into the Polkadot binary, which would inform the operator about non-optimal TCP settings and how to fix them.

Forks make matters worse, so tackling this might also help, depending on the outcome of this.

Implementation

Apart from collator pre-connect, which is an issue of its own, I would suggest the following to move this issue forward:

  1. Reduce MAX_UNSHARED_UPLOAD_TIME from 400ms to max 200ms, maybe even just 150ms.
  2. Reduce number of needed backing votes - from a liveness perspective, 1 would be ideal, but 2 will also improve things over the current 3.
  3. Increase the network timeout a bit, to, let's say, 1.2 seconds.
  4. Leave reputation changes alone: slow validators will still ban the collator, but this should do no harm to the network, as the collator will still be able to send its collation to fast validators.

I would expect these measures to already have a big effect. Requests are answered on a first come, first served basis, so low-latency connections have the advantage of being answered first, and those connections will likely also be able to upload the candidate in a very short amount of time; after 200ms the upload might already be finished. If only one vote is required, this first fast upload would already suffice for the parachain to make progress; if not, the reduced MAX_UNSHARED_UPLOAD_TIME should definitely help other backing validators to succeed at least. Given that it seems we should worry about latency the most, I would vouch for a MAX_UNSHARED_UPLOAD_TIME of 150ms.

@burdges
Contributor

burdges commented Nov 26, 2021

We know some of this, like 1, helps even once we have contextual execution.

I donno either way about 2 honestly..

I think 3 sounds harmless, except that parachain liveness requires they share the block within the parachain.

We do risk temporary liveness hiccups if we shrink the backing group size, but..

It's fairly harmless to back with less than 1/2 of the backers, I think, though it somewhat increases block producers' power for MEV and censorship; we envision other solutions there.

@eskimor
Member Author

eskimor commented Nov 27, 2021

Other things we could do, suggested by @crystalin:

  • Provide documentation on how to configure your OS TCP stack to reduce the time to send a block. Currently, with a ping of 100ms (US <-> EU) for example, it takes as much time on a 10Mbps link as on a 1Gbps link to transfer 3MB (the usual PoV for big blocks), because all the time is spent in the network congestion system, which performs a "slow start" requiring multiple round trips before it sends data at full speed.
  • Add support in the binary to detect the OS settings for TCP and display information/advice on how to improve them (for validators)
  • Review the timeout limit of 1000ms to receive the collation (block + PoV). 1000ms with a ping of 100ms is not possible without a well-configured OS
  • Provide an optimized binary for Polkadot (this will mostly improve CPU usage, but will give more time to receive/send data over the network)

For longer terms we have:

  • Adopt a low-latency transport layer based on UDP (QUIC?) instead of TCP
  • Support contextual execution (which is getting started now) which will allow longer networking timeouts
  • Provide better incentivization for validators to include parachain blocks

@burdges
Contributor

burdges commented Nov 28, 2021

We always wanted QUIC but initial efforts lacked tuning knowledge I think. We should either hold connections open or else use the 0-RTT option in TLS 1.3.

I doubt validators run modified code now, so incentivization changes nothing now. We've done initial design work for incentivization that helps maintain this longer term.

@crystalin

crystalin commented Nov 28, 2021

I disagree with you about incentivization. It needs to be improved.

It is true that probably no one is running customized binaries, but the point of incentivization is to make sure validators try their best to get blocks included. I suspect a lot of them are running very poor hardware & connections, and currently there is no reason for them to improve that situation, as it doesn't impact their rewards on Kusama.

However, it impacts parachains a lot. A slow validator will never back a parachain block (we observe this on Moonriver when blocks are not included during the full validator rotation length of 2 minutes).

Having strong incentivization (for the validators backing the parachain, but also for the validator producing the block that includes the parachain block) would motivate operators to improve their setup (hardware and configuration).

We see this on Moonriver, where the chances of getting a block in (and getting the rewards with it) are highly correlated with setup and hardware. We only have ~50 collators, but ALL of them are running high-end hardware/setups, optimized for block production.

@AlistairStewart

It doesn't cost much to get the additional security required to reduce the backing votes to 1. I don't know how much the next step, distribution for availability, depends on how many nodes already have the block. This might end up timing out more often if the one guy who has the PoV block has bad networking.

@eskimor
Member Author

eskimor commented Nov 29, 2021

Thanks @AlistairStewart! Good point with regards to availability-distribution, but I think that should be fine:

  1. availability-distribution has relaxed timing requirements: if it takes two blocks, so be it; it hurts performance, but liveness is preserved.
  2. That we only require one vote does not mean there will be only one backer who voted, and if so, then things have been tight anyway, and slow availability is still better than not getting the candidate in at all.

@burdges
Contributor

burdges commented Nov 30, 2021

We do not, afaik, count backing checkers towards approval, @AlistairStewart, so no security changes are required there. It's interesting whether having only 1 backer helps or hurts overall, so it's maybe worth testing. A priori, it sounds like contextual execution makes the 1-backer trick no longer helpful, but maybe with future pipelining ideas? I donno..

As I said above @crystalin, we're only discussing optimizations here, because right now incentivization cannot impact observable performance, assuming nobody runs modified code yet. I'll do a guide PR for our incentivization design eventually, but not this week.

@eskimor eskimor assigned eskimor and unassigned Lldenaurois and slumber Dec 1, 2021
eskimor added a commit that referenced this issue Dec 1, 2021
This PR:

- Reduces MAX_UNSHARED_UPLOAD_TIME to 150ms
- Increases timeout on collation fetching to 1200ms
- Reduces limit on needed backing votes in the runtime

This PR does not yet reduce the number of needed backing votes on the
node as this can only be meaningfully enacted once the changed limit in
the runtime is live.
paritytech-processbot bot pushed a commit that referenced this issue Dec 23, 2021
* First step in implementing #4386

This PR:

- Reduces MAX_UNSHARED_UPLOAD_TIME to 150ms
- Increases timeout on collation fetching to 1200ms
- Reduces limit on needed backing votes in the runtime

This PR does not yet reduce the number of needed backing votes on the
node as this can only be meaningfully enacted once the changed limit in
the runtime is live.

* Fix tests.

* Guide updates.

* Review remarks.

* Bump minimum required backing votes to 2 in runtime.

* Make sure node side code won't make runtime vomit.

* cargo +nightly fmt
drahnr pushed a commit that referenced this issue Jan 4, 2022
Wizdave97 pushed a commit to ComposableFi/polkadot that referenced this issue Feb 3, 2022
ordian added a commit that referenced this issue Feb 14, 2022
@ordian ordian added the T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes. label Aug 16, 2022
@eskimor eskimor closed this as completed Aug 18, 2022
ggwpez pushed a commit to ggwpez/runtimes that referenced this issue Mar 10, 2023
Status: Done

7 participants