From fd5ae5c063a430e0dfb0e3b2192fec3ddcb48366 Mon Sep 17 00:00:00 2001 From: Robert Klotzner Date: Wed, 2 Jun 2021 18:17:16 +0200 Subject: [PATCH 01/10] Dispute distribution initial design. --- node/network/protocol/src/lib.rs | 2 +- .../src/node/disputes/dispute-distribution.md | 89 ++++++++++++++++++- .../src/types/overseer-protocol.md | 17 ++++ 3 files changed, 106 insertions(+), 2 deletions(-) diff --git a/node/network/protocol/src/lib.rs b/node/network/protocol/src/lib.rs index 75e9a8805245..5017e087aa99 100644 --- a/node/network/protocol/src/lib.rs +++ b/node/network/protocol/src/lib.rs @@ -301,7 +301,7 @@ pub mod v1 { UncheckedSignedFullStatement, }; - + /// Network messages used by the bitfield distribution subsystem. #[derive(Debug, Clone, Encode, Decode, PartialEq, Eq)] pub enum BitfieldDistributionMessage { diff --git a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md index b186fe5895e3..6560df498d8c 100644 --- a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md +++ b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md @@ -1,3 +1,90 @@ # Dispute Distribution -TODO https://github.com/paritytech/polkadot/issues/2581 +## Protocol + +### Input + +[`DisputeDistributionMessage`][DisputeDistributionMessage] + +### Output + +- [`DisputeCoordinatorMessage::ActiveDisputes`][DisputeParticipationMessage] +- [`DisputeCoordinatorMessage::ImportStatements`][DisputeParticipationMessage] +- [`RuntimeApiMessage`][RuntimeApiMessage] + +## Functionality + +### Distribution + +Distributing disputes needs to be a reliable protocol. We would like to make as +sure as possible that our vote got properly delivered to all concerned +validators. For this to work, this subsystem won't be gossip based, but instead +will use a request/response protocol for application level confirmations. 
The
+request will be the payload (the `ExplicitDisputeStatement`), the response will
+be the confirmation. On reception of `DistributeStatement` a node will send and
+keep retrying delivering that statement in case of failures as long as the
+dispute is active to all concerned validators. The list of concerned validators
+will be updated on every block and will change at session boundaries.
+
+As can be determined from the protocol section, this subsystem is only concerned
+with delivering `ExplicitDisputeStatement`s; for all other votes
+(backing/approval) the dispute coordinator is responsible for keeping track of
+those statements.
+
+To cater for spam issues, we will in a first implementation only consider
+disputes of already included data, i.e. only for candidates that are
+already available. These are the only disputes representing an actual threat to
+the system and are also the easiest to implement with regard to spam.
+
+### Reception
+
+Apart from making sure that local statements are sent out to all relevant
+validators, this subsystem is also responsible for receiving votes from other
+nodes. Because we are not forwarding foreign statements, spam is not so much of
+an issue. We should just make sure to punish a node if it issues a statement for
+a candidate that was not found available.
+
+For each received vote, the subsystem will send a
+`DisputeCoordinatorMessage::ImportStatements` message to the dispute
+coordinator. We rely on the coordinator to trigger validation and availability
+recovery of the candidate if there was no local vote for it yet, and to report
+back to us via `DisputeDistributionMessage::ReportCandidateUnavailable` if a
+candidate was not found available.
+
+### Considerations
+
+Dispute distribution is critical. We should keep track of available validator
+connections and issue warnings if we are not connected to a majority of
+validators. We should also keep track of failed sending attempts and log
+warnings accordingly.
As disputes are rare and TCP is a reliable protocol,
+probably each failed attempt should trigger a warning in the logs and also be
+recorded in some Prometheus metric.
+
+## Disputes for non-included candidates
+
+If deemed necessary we can later on also support disputes for non-included
+candidates, but disputes for those cases have totally different requirements.
+
+First of all, such disputes are not time critical. We just want to have
+some offender slashed at some point, but we have no risk of finalizing any bad
+data.
+
+Second, we won't have availability for such data, but it also really does not
+matter as we have relaxed timing requirements as just mentioned. Instead a node
+disputing non-included candidates will be responsible for providing the
+disputed data initially. Then nodes which did the check already are also
+providers of the data, hence distributing load and making prevention of the
+dispute from concluding harder and harder over time. Assuming an attacker can
+not DoS a node forever, the dispute will succeed eventually, which is all that
+matters. And again, even if an attacker managed to prevent such a dispute from
+happening somehow, there is no real harm done, there was no serious attack to
+begin with.
+
+Third: As candidates can be made up at will, we are susceptible to spam. Two
+validators can continuously contradict each other. This will result in a slash
+of one of them. Still it would be good to consider that possibility and make
+sure the network will work properly in such an event.
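The "keep retrying until confirmed or the dispute ends" bookkeeping described above can be sketched roughly as follows. This is an illustrative sketch only: `DeliveryTracker` and its methods are hypothetical names, not the actual subsystem API.

```rust
use std::collections::HashSet;

/// Illustrative bookkeeping for reliable statement delivery: validators are
/// retried until they send an application level confirmation or the dispute
/// is no longer active.
struct DeliveryTracker {
    /// Validator indices that have not yet confirmed reception.
    pending: HashSet<u32>,
}

impl DeliveryTracker {
    /// Start tracking delivery to all concerned validators.
    fn new(validators: impl IntoIterator<Item = u32>) -> Self {
        DeliveryTracker { pending: validators.into_iter().collect() }
    }

    /// Record a confirmation response from a validator.
    fn confirm(&mut self, validator: u32) {
        self.pending.remove(&validator);
    }

    /// Validators that still need a (re)send attempt; each entry here would
    /// also be a candidate for a log warning / Prometheus counter bump.
    fn to_retry(&self) -> Vec<u32> {
        let mut missing: Vec<u32> = self.pending.iter().copied().collect();
        missing.sort_unstable();
        missing
    }
}

fn main() {
    let mut tracker = DeliveryTracker::new(0..4);
    tracker.confirm(1);
    tracker.confirm(3);
    // Validators 0 and 2 have not confirmed yet and will be retried.
    println!("{:?}", tracker.to_retry());
}
```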
+
+[DisputeDistributionMessage]: ../../types/overseer-protocol.md#dispute-distribution-message
+[RuntimeApiMessage]: ../../types/overseer-protocol.md#runtime-api-message
+[DisputeParticipationMessage]: ../../types/overseer-protocol.md#dispute-participation-message
diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md
index 891be333ef39..694f382cfe22 100644
--- a/roadmap/implementers-guide/src/types/overseer-protocol.md
+++ b/roadmap/implementers-guide/src/types/overseer-protocol.md
@@ -424,6 +424,23 @@ enum DisputeParticipationMessage {
}
```
+## Dispute Distribution Message
+
+Messages received by the [Dispute Distribution
+subsystem](../node/disputes/dispute-distribution.md). This subsystem is
+responsible for distributing explicit dispute statements.
+
+```rust
+enum DisputeDistributionMessage {
+ /// Tell dispute distribution to distribute an explicit dispute statement to
+ validators.
+ DistributeStatement(ExplicitDisputeStatement),
+ /// Tell the subsystem that a candidate is not availble. Dispute distribution
+ can punish peers distributing votes on unavailable hashes for example.
+ ReportCandidateUnavailable(CandidateHash),
+}
+```
+
## Network Bridge Message

Messages received by the network bridge. This subsystem is invoked by others to manipulate access
From 5eb6c16a6efa0b72c8325c95c69e4732b09686f2 Mon Sep 17 00:00:00 2001
From: Robert Klotzner
Date: Mon, 7 Jun 2021 21:12:31 +0200
Subject: [PATCH 02/10] WIP.
---
 .../src/node/disputes/dispute-distribution.md | 54 ++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
index 6560df498d8c..494a9349131e 100644
--- a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
+++ b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
@@ -14,6 +14,12 @@

## Functionality

+TODO:
+- Wire message (Valid is not a valid wire message, it must contain one
+invalid vote).
+- Distribution of backing and approval votes, in case of nodes not having them.
+
+
### Distribution

Distributing disputes needs to be a reliable protocol. We would like to make as
@@ -51,7 +57,53 @@ recovery of the candidate if there was no local vote for it yet, and to report
back to us via `DisputeDistributionMessage::ReportCandidateUnavailable` if a
candidate was not found available.

-### Considerations
+## Backing and Approval Votes
+
+Backing and approval votes get imported when they arrive/are created via the
+dispute coordinator by the corresponding subsystems.
+
+We assume that under normal operation each node will be aware of backing and
+approval votes and optimize for that case. Nevertheless we want disputes to
+conclude fast and reliably, therefore if a node is not aware of backing/approval
+votes it can request the missing votes from the node that informed it about the
+dispute. It will send the node two bitfields with
+
+
+## Bitfields
+
+Each validator is responsible for sending its vote to each other validator. For
+backing and approval votes we assume each validator is aware of those.
To cater
+for imperfections, e.g.:
+
+- A validator might have missed gossip and is not aware of
+backing/approval votes
+- A validator crashed/was under attack and was only able to
+send out its vote to some validators
+
+We also have a third wire message type: `IHaveVotes` which contains a bitfield
+with a bit for each validator. 1 meaning we have some votes from that validator,
+0 meaning we don't.
+
+A validator might send those out to some validator whenever it feels like
+missing out on votes and after some timeout just to make sure it knows
+everything. The receiver of a `IHaveVotes` message will do two things:
+
+1. See if the sender is missing votes we are aware of - if so, respond with
+   those votes. Also send votes of equivocating validators, no matter the
+   bitfield.
+2. Check whether the sender knows about any votes we don't know about, and if so
+   send an `IHaveVotes` request back with our knowledge.
+
+When to send `IHaveVotes` messages:
+
+1. Whenever we receive an `Invalid`/`Valid` vote and we are not aware of any
+   disagreeing votes. In this case we will just drop the message and send an
+   `IHaveVotes` message to the validator we received the `Invalid`/`Valid`
+   message from.
+2. Once per block to some random validator as long as the dispute is active.
+3. Whenever we learn something new via `IHaveVotes`, share that knowledge with
+   two more `IHaveVotes` messages with random other validators.
+## Considerations

Dispute distribution is critical. We should keep track of available validator
connections and issue warnings if we are not connected to a majority of
validators. We should also keep track of failed sending attempts and log
warnings accordingly.

From cb748e038f924226ca103a849397f7041ae60919 Mon Sep 17 00:00:00 2001
From: Robert Klotzner
Date: Fri, 11 Jun 2021 18:28:28 +0200
Subject: [PATCH 03/10] Dispute distribution guide update.
--- .../src/node/disputes/dispute-distribution.md | 230 +++++++++++++----- .../src/types/overseer-protocol.md | 16 +- 2 files changed, 187 insertions(+), 59 deletions(-) diff --git a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md index 494a9349131e..947700306b40 100644 --- a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md +++ b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md @@ -1,5 +1,15 @@ # Dispute Distribution +## Design Goals + +This design should result in a protocol that is: + +- resilient to nodes being temporarily unavailable +- make sure nodes are aware of a dispute quickly +- relatively efficient, should not cause too much stress on the network +- be resilient when it comes to spam +- be simple and boring: We want disputes to work when they happen + ## Protocol ### Input @@ -10,52 +20,156 @@ - [`DisputeCoordinatorMessage::ActiveDisputes`][DisputeParticipationMessage] - [`DisputeCoordinatorMessage::ImportStatements`][DisputeParticipationMessage] +- [`DisputeCoordinatorMessage::QueryCandidateVotes`][DisputeParticipationMessage] - [`RuntimeApiMessage`][RuntimeApiMessage] -## Functionality +### Wire format + +#### Disputes + +Protocol: "/polkadot/dispute/1" + +Request: + +```rust +struct DisputeRequest { + // Either initiating invalid vote or our own (if we voted invalid). + invalid_vote: SignedV2, + // Some invalid vote (can be from backing/approval) or our own if we voted + // valid. + valid_vote: SignedV2, +} + +struct InvalidVote { + /// The candidate being disputed. + candidate_hash: CandidateHash, + /// The voting validator. + validator_index: ValidatorIndex, + /// The session the candidate appears in. 
+ candidate_session: SessionIndex, +} + +struct ValidVote { + candidate_hash: CandidateHash, + validator_index: ValidatorIndex, + candidate_session: SessionIndex, + kind: ValidDisputeStatementKind, +} +``` + +Response: + +```rust +enum DisputeResponse { + Confirmed +} +``` + +#### Vote Recovery + +Protocol: "/polkadot/vote-recovery/1" + +```rust +struct IHaveVotesRequest { + candidate_hash: CandidateHash, + session: SessionIndex, + votes: VotesBitfield, +} + +struct VotesBitfield(pub BitVec); +``` + +Response: + +```rust +struct VotesResponse { + /// All votes we have, but the requester was missing. + missing: Vec<(DisputeStatement, ValidatorIndex, ValidatorSignature)>, + /// Any additional equivocating votes, we transmit those even if the sender + /// claims to have votes for that validator (as it might only have one). + equivocating: Vec<(DisputeStatement, ValidatorIndex, ValidatorSignature)>, +} +``` -TODO: -- Wire message (Valid is not a valid wire message, it must contain one -invalid vote). -- Distribution of backing and approval votes, in case of nodes not having them. - - -### Distribution +## Functionality Distributing disputes needs to be a reliable protocol. We would like to make as sure as possible that our vote got properly delivered to all concerned validators. For this to work, this subsystem won't be gossip based, but instead will use a request/response protocol for application level confirmations. The -request will be the payload (the `ExplicitDisputeStatement`), the response will -be the confirmation. On reception of `DistributeStatement` a node will send and -keep retrying delivering that statement in case of failures as long as the -dispute is active to all concerned validators. The list of concerned validators -will be updated on every block and will change at session boundaries. 
-
-As can be determined from the protocol section, this subsystem is only concerned
-with delivering `ExplicitDisputeStatement`s; for all other votes
-(backing/approval) the dispute coordinator is responsible for keeping track of
-those statements.
+request will be the payload (the actual votes/statements), the response will
+be the confirmation. See [above](#wire-format).
+
+### Starting a Dispute
+
+A dispute is initiated once a node sends the first `Dispute` wire message,
+which must contain an "invalid" vote and some "valid" vote.
+
+The dispute distribution subsystem can instructed to send that message out to
+all concerned validators by means of a `DisputeDistributionMessage::SendDispute`
+message. That message must contain an invalid vote from the local node and some
+valid one, e.g. a backing statement.
+
+We include a valid vote as well, so that any node, regardless of whether it is
+synced with the chain or has seen any backing/approval votes, can see that there
+are conflicting votes available, hence we have a valid dispute. Nodes will still
+need to check whether the disputing votes are somewhat current and not some
+stale ones.
+
+### Participating in a Dispute
+
+Upon receiving a `Dispute` message, the dispute distribution subsystem will
+trigger the import of the received votes via the dispute coordinator
+(`DisputeCoordinatorMessage::ImportStatements`). The dispute coordinator will
+take care of participating in that dispute if necessary. Once it is done, the
+coordinator will send a `DisputeDistributionMessage::SendDispute` message to
+dispute distribution. From here, everything is the same as for starting a
+dispute, except that if the local node deemed the candidate valid, the
+`SendDispute` message will contain a valid vote signed by our node and will
+contain the initially received `Invalid` vote.
+
+### Sending of messages
+
+Starting and participating in a dispute are pretty similar from the perspective
+of dispute distribution.
Once we receive a `SendDispute` message we try to make
+sure to get the data out. We keep track of all the parachain validators that
+should see the message, which are all the parachain validators of the session
+where the dispute happened as they will want to participate in the dispute. In
+addition we also need to get the votes out to all authorities of the current
+session (which might be the same or not). Those authorities will not
+participate in the dispute, but need to see the statements so they can include
+them in blocks.
+
+We keep track of connected parachain validators and authorities and will issue
+warnings in the logs if connected nodes are less than two thirds of the
+corresponding sets. We also only consider a message transmitted once we
+received a confirmation message. If not, we will keep retrying to get that
+message out as long as the dispute is deemed alive. To determine whether a
+dispute is still alive we will issue a
+`DisputeCoordinatorMessage::ActiveDisputes` message before each retry run. Once
+a dispute is no longer live, we will clean up the state accordingly.

To cater for spam issues, we will in a first implementation only consider
disputes of already included data, i.e. only for candidates that are
already available. These are the only disputes representing an actual threat to
the system and are also the easiest to implement with regard to spam.

+Votes can still be old or no longer relevant. In this case we will drop those
+messages and we might want to decrease the reputation of peers sending old
+data.
+
### Reception

-Apart from making sure that local statements are sent out to all relevant
-validators, this subsystem is also responsible for receiving votes from other
-nodes. Because we are not forwarding foreign statements, spam is not so much of
-an issue. We should just make sure to punish a node if it issues a statement for
-a candidate that was not found available.
+Because we are not forwarding foreign statements, spam is not so much of
+an issue. Rate limiting should be implemented at the substrate level, see
+[#7750](https://github.com/paritytech/substrate/issues/7750).
+
+### Node Startup

-For each received vote, the subsystem will send a
-`DisputeCoordinatorMessage::ImportStatements` message to the dispute
-coordinator. We rely on the coordinator to trigger validation and availability
-recovery of the candidate if there was no local vote for it yet, and to report
-back to us via `DisputeDistributionMessage::ReportCandidateUnavailable` if a
-candidate was not found available.
+On startup we need to check with the dispute coordinator for any ongoing
+disputes and assume we have not yet sent our statement for those. In case we
+find an explicit statement from ourselves via
+`DisputeCoordinatorMessage::QueryCandidateVotes` we will pretend to just have
+received a `SendDispute` message for that candidate.

## Backing and Approval Votes

Backing and approval votes get imported when they arrive/are created via the
dispute coordinator by the corresponding subsystems.

@@ -66,43 +180,50 @@ We assume that under normal operation each node will be aware of backing and
approval votes and optimize for that case. Nevertheless we want disputes to
conclude fast and reliably, therefore if a node is not aware of backing/approval
votes it can request the missing votes from the node that informed it about the
-dispute. It will send the node two bitfields with
+dispute.

+## Resiliency

-## Bitfields
+The above protocol should be sufficient for most cases, but there are certain
+cases we also want to have covered:

-Each validator is responsible for sending its vote to each other validator. For
-backing and approval votes we assume each validator is aware of those. To cater
-for imperfections, e.g.:
+- Non-validator nodes might be interested in ongoing voting, even before it is
+  recorded on chain.
+- Nodes might have missed votes, especially backing or approval votes.
+  Recovering them from chain is difficult and expensive, due to runtime upgrades
+  and untyped extrinsics.
-
-- A validator might have missed gossip and is not aware of
-backing/approval votes
-- A validator crashed/was under attack and was only able to
-send out its vote to some validators
+To cover those cases, we introduce a second request/response protocol, which can
+be handled at a lower priority than the one above. It consists of the
+request/response messages as described in the [protocol
+section](#vote-recovery).

-We also have a third wire message type: `IHaveVotes` which contains a bitfield
-with a bit for each validator. 1 meaning we have some votes from that validator,
-0 meaning we don't.
+Nodes may send those requests to validators if they feel they are missing
+votes, e.g. after some timeout, if no majority was reached yet from their point
+of view or if they are not aware of any backing/approval votes for a received
+disputed candidate.

-A validator might send those out to some validator whenever it feels like
-missing out on votes and after some timeout just to make sure it knows
-everything. The receiver of a `IHaveVotes` message will do two things:
+The receiver of an `IHaveVotesRequest` message will do the following:

 1. See if the sender is missing votes we are aware of - if so, respond with
    those votes. Also send votes of equivocating validators, no matter the
    bitfield.
 2. Check whether the sender knows about any votes we don't know about, and if so
    send an `IHaveVotes` request back with our knowledge.
+3. Record the peer's knowledge.

When to send `IHaveVotes` messages:

-1.
Whenever we are asked to do so via + `DisputeDistributionMessage::FetchMissingVotes`. +2. Approximately once per block to some random validator as long as the dispute + is active. + +Spam considerations: Nodes want to accept those messages once per validator and +per slot. They are free to drop more frequent requests or requests for stale +data. Requests coming from non validator nodes, can be handled on a best effort +basis. + ## Considerations Dispute distribution is critical. We should keep track of available validator @@ -129,14 +250,9 @@ providers of the data, hence distributing load and making prevention of the dispute from concluding harder and harder over time. Assuming an attacker can not DoS a node forever, the dispute will succeed eventually, which is all that matters. And again, even if an attacker managed to prevent such a dispute from -happening somehow, there is no real harm done, there was no serious attack to +happening somehow, there is no real harm done: There was no serious attack to begin with. -Third: As candidates can be made up at will, we are susceptible to spam. Two -validators can continuously contradict each other. This will result in a slash -of one of them. Still it would be good to consider that possibility and make -sure the network will work properly in such an event. - [DistputeDistributionMessage]: ../../types/overseer-protocol.md#dispute-distribution-message [RuntimeApiMessage]: ../../types/overseer-protocol.md#runtime-api-message [DisputeParticipationMessage]: ../../types/overseer-protocol.md#dispute-participation-message diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md index 694f382cfe22..71e6f1cdf2d4 100644 --- a/roadmap/implementers-guide/src/types/overseer-protocol.md +++ b/roadmap/implementers-guide/src/types/overseer-protocol.md @@ -432,10 +432,22 @@ responsible of distributing explicit dispute statements. 
```rust enum DisputeDistributionMessage { + /// Tell dispute distribution to distribute an explicit dispute statement to validators. - DistributeStatement(ExplicitDisputeStatement), - /// Tell the subsystem that a candidate is not availble. Dispute distribution + SendDispute((ValidVote, InvalidVote)), + + /// Ask DisputeDistribution to get votes we don't know about. + /// Fetched votes will be reported via `DisputeCoordinatorMessage::ImportStatements` + FetchMissingVotes { + candiate_hash: CandidateHash, + session: SessionIndex, + knownVotes: Bitfield, + /// Optional validator to query from. `ValidatorIndex` as in the above + referenced session. + from_validator: Option, + } + /// Tell the subsystem that a candidate is not available. Dispute distribution can punish peers distributing votes on unavailable hashes for example. ReportCandidateUnavailable(CandidateHash), } From fb1b465d8837b5a0a2ec436791f7860ceaaa5d88 Mon Sep 17 00:00:00 2001 From: Robert Klotzner Date: Sat, 12 Jun 2021 13:11:31 +0200 Subject: [PATCH 04/10] Make invalid statement include `InvalidStatementKind`. --- .../src/node/disputes/dispute-distribution.md | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md index 947700306b40..c9aec82e8393 100644 --- a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md +++ b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md @@ -41,19 +41,22 @@ struct DisputeRequest { } struct InvalidVote { - /// The candidate being disputed. - candidate_hash: CandidateHash, - /// The voting validator. - validator_index: ValidatorIndex, - /// The session the candidate appears in. 
- candidate_session: SessionIndex,
+ subject: VoteSubject,
+ kind: InvalidDisputeStatementKind,
}

struct ValidVote {
+ subject: VoteSubject,
+ kind: ValidDisputeStatementKind,
+}
+
+struct VoteSubject {
+ /// The candidate being disputed.
candidate_hash: CandidateHash,
+ /// The voting validator.
validator_index: ValidatorIndex,
+ /// The session the candidate appears in.
candidate_session: SessionIndex,
- kind: ValidDisputeStatementKind,
}
```

From cbaf46a843aa0a3b82cd74e11f9eca9399cd24b4 Mon Sep 17 00:00:00 2001
From: Robert Klotzner
Date: Sun, 13 Jun 2021 15:06:21 +0200
Subject: [PATCH 05/10] Clarify the scope of disputes.

---
 roadmap/implementers-guide/src/protocol-disputes.md | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/roadmap/implementers-guide/src/protocol-disputes.md b/roadmap/implementers-guide/src/protocol-disputes.md
index c7d1bb21ee9b..d925de117633 100644
--- a/roadmap/implementers-guide/src/protocol-disputes.md
+++ b/roadmap/implementers-guide/src/protocol-disputes.md
@@ -9,8 +9,7 @@ All parachain blocks that end up in the finalized relay chain should be valid. T
We have two primary components for ensuring that nothing invalid ends up in the finalized relay chain:
* Approval Checking, as described [here](./protocol-approval.md) and implemented according to the [Approval Voting](node/approval/approval-voting.md) subsystem. This protocol can be shown to prevent invalid parachain blocks from making their way into the finalized relay chain as long as the number of attempts is limited.
* Disputes, this protocol, which ensures that each attempt to include something bad is caught, and the offending validators are punished.
-Disputes and their resolution are the formal process to resolve these situations.
+Disputes differ from the backing and approval processes (and cannot be part of those) in that a dispute is independent of a particular fork, while both backing and approval operate on particular forks.
This distinction is important! Approval voting stops if an alternative fork which might not contain the currently approved candidate gets finalized. This is totally fine from the perspective of approval voting, as its sole purpose is to make sure invalid blocks won't get finalized. For disputes, on the other hand, we have different requirements: Even though the "danger" is past and the adversaries were not able to get their invalid block approved, we still want them to get slashed for the attempt. Otherwise they would just have gotten a free try, which is something we need to avoid in our security model, as it is based on the assumption that the probability of getting an invalid block finalized is very low and an attacker would go bankrupt before it could have tried often enough.

Every dispute stems from a disagreement among two or more validators. If a bad actor creates a bad block, but the bad actor never distributes it to honest validators, then nobody will dispute it. Of course, such a situation is not even an attack on the network, so we don't need to worry about defending against it.

@@ -40,14 +39,14 @@ Lastly, it is possible that for backing disputes, i.e. where the data is not alr
Once becoming aware of a dispute, it is the responsibility of all validators to participate in the dispute. Concretely, this means:
* Circulate all statements about the candidate that we are aware of - backing statements, approval checking statements, and dispute statements.
* If we have already issued any type of statement about the candidate, go no further.
- * Download the [`AvailableData`](types/availability.md).
If possible, this should first be attempted from other dispute participants or backing validators, and then [(via erasure-coding)](node/availability/availability-recovery.md) from all validators. * Extract the Validation Code from any recent relay chain block. Code is guaranteed to be kept available on-chain, so we don't need to download any particular fork of the chain. * Execute the block under the validation code, using the `AvailableData`, and check that all outputs are correct, including the `erasure-root` of the [`CandidateReceipt`](types/candidate.md). * Issue a dispute participation statement to the effect of the validity of the candidate block. -Disputes _conclude_ after ⅔ supermajority is reached in either direction. +Disputes _conclude_ after ⅔ supermajority is reached in either direction. -The on-chain component of disputes can be initiated by providing any two conflicting votes and it also waits for a ⅔ supermajority on either side. The on-chain component also tracks which parablocks have already been disputed so the same parablock may only be disputed once on any particular branch of the relay chain. Lastly, it also tracks which blocks have been included on the current branch of the relay chain. When a dispute is initiated for a para, inclusion is halted for the para until the dispute concludes. +The on-chain component of disputes can be initiated by providing any two conflicting votes and it also waits for a ⅔ supermajority on either side. The on-chain component also tracks which parablocks have already been disputed so the same parablock may only be disputed once on any particular branch of the relay chain. Lastly, it also tracks which blocks have been included on the current branch of the relay chain. When a dispute is initiated for a para, inclusion is halted for the para until the dispute concludes. 
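The ⅔ supermajority bound used both off-chain and on-chain can be made concrete with the usual Byzantine fault tolerance arithmetic. The helper names below are illustrative, not the runtime's actual API:

```rust
/// With `n` validators, Byzantine fault tolerance assumes at most
/// `f = (n - 1) / 3` of them are faulty (integer division).
fn byzantine_threshold(n: usize) -> usize {
    n.saturating_sub(1) / 3
}

/// Votes needed for a dispute to conclude: `n - f` is always strictly
/// more than two thirds of all validators.
fn supermajority_threshold(n: usize) -> usize {
    n - byzantine_threshold(n)
}

fn main() {
    for n in [3usize, 10, 100, 1000] {
        println!(
            "validators: {n:4} -> dispute concludes at {} votes",
            supermajority_threshold(n)
        );
    }
}
```

For example, with 100 validators at most 33 may be faulty, so a dispute concludes once 67 votes agree in either direction.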
The author of a relay chain block should initiate the on-chain component of disputes for all disputes which the chain is not aware of, and provide all statements to the on-chain component as well. This should all be done via _inherents_.

From ba11d9aa4838be0a7b65762d57b8835391cbd6be Mon Sep 17 00:00:00 2001
From: Robert Klotzner
Date: Sun, 13 Jun 2021 19:12:07 +0200
Subject: [PATCH 06/10] A few fixes + introduced back pressure oneshot.

---
 .../src/types/overseer-protocol.md | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md
index a3194b6f41e7..27ab248e4699 100644
--- a/roadmap/implementers-guide/src/types/overseer-protocol.md
+++ b/roadmap/implementers-guide/src/types/overseer-protocol.md
@@ -380,6 +380,13 @@ enum DisputeCoordinatorMessage {
/// - The validator index (within the session of the candidate) of the validator casting the vote.
/// - The signature of the validator casting the vote.
statements: Vec<(DisputeStatement, ValidatorIndex, ValidatorSignature)>,
+
+ /// Inform the requester once we finished importing.
+ ///
+ /// That is, we either discarded the votes, just recorded them because we
+ /// cast our vote already, or recovered availability for the candidate
+ /// successfully.
+ pending_confirmation: oneshot::Sender<()>,
},
/// Fetch a list of all active disputes that the co-ordinator is aware of.
ActiveDisputes(ResponseChannel>),
@@ -434,21 +441,22 @@ responsible for distributing explicit dispute statements.
enum DisputeDistributionMessage {
/// Tell dispute distribution to distribute an explicit dispute statement to
- validators.
+ /// validators.
SendDispute((ValidVote, InvalidVote)),
/// Ask DisputeDistribution to get votes we don't know about.
    /// Fetched votes will be reported via `DisputeCoordinatorMessage::ImportStatements`
    FetchMissingVotes {
-        candiate_hash: CandidateHash,
+        candidate_hash: CandidateHash,
         session: SessionIndex,
-        knownVotes: Bitfield,
+        known_valid_votes: Bitfield,
+        known_invalid_votes: Bitfield,
         /// Optional validator to query from. `ValidatorIndex` as in the above referenced session.
         from_validator: Option<ValidatorIndex>,
     }

     /// Tell the subsystem that a candidate is not available. Dispute distribution
-    can punish peers distributing votes on unavailable hashes for example.
+    /// can punish peers distributing votes on unavailable hashes.
     ReportCandidateUnavailable(CandidateHash),
}
```

From 72add34c8a044cf8437cdd3294508999b073ddb9 Mon Sep 17 00:00:00 2001
From: Robert Klotzner
Date: Sun, 13 Jun 2021 19:12:39 +0200
Subject: [PATCH 07/10] Fixes and spam protection WIP.

---
 .../src/node/disputes/dispute-distribution.md | 162 +++++++++++++-----
 1 file changed, 116 insertions(+), 46 deletions(-)

diff --git a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
index c9aec82e8393..8d9f940d8280 100644
--- a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
+++ b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
@@ -1,5 +1,7 @@
 # Dispute Distribution

+Dispute distribution is responsible for ensuring all concerned validators will be aware of a dispute and have the relevant votes.
+
 ## Design Goals

 This design should result in a protocol that is:
@@ -34,10 +36,10 @@ Request:

 ```rust
 struct DisputeRequest {
     // Either initiating invalid vote or our own (if we voted invalid).
-    invalid_vote: SignedV2,
+    invalid_vote: InvalidVote,
     // Some invalid vote (can be from backing/approval) or our own if we voted
     // valid.
- valid_vote: SignedV2, + valid_vote: ValidVote, } struct InvalidVote { @@ -57,6 +59,9 @@ struct VoteSubject { validator_index: ValidatorIndex, /// The session the candidate appears in. candidate_session: SessionIndex, + /// The validator signature, that can be verified when constructing a + `SignedDisputeStatement`. + validator_signature: ValidatorSignature, } ``` @@ -76,10 +81,10 @@ Protocol: "/polkadot/vote-recovery/1" struct IHaveVotesRequest { candidate_hash: CandidateHash, session: SessionIndex, - votes: VotesBitfield, + valid_votes: Bitfield, + invalid_votes: Bitfield, } -struct VotesBitfield(pub BitVec); ``` Response: @@ -88,9 +93,6 @@ Response: struct VotesResponse { /// All votes we have, but the requester was missing. missing: Vec<(DisputeStatement, ValidatorIndex, ValidatorSignature)>, - /// Any additional equivocating votes, we transmit those even if the sender - /// claims to have votes for that validator (as it might only have one). - equivocating: Vec<(DisputeStatement, ValidatorIndex, ValidatorSignature)>, } ``` @@ -105,15 +107,15 @@ be the confirmation. See [above][#wire-format]. ### Starting a Dispute -A dispute is initiated once a node sends the first `Dispute` wire message, -which must contain an "invalid" vote and some "valid" vote. +A dispute is initiated once a node sends the first `DisputeRequest` wire message, +which must contain an "invalid" vote and a "valid" vote. -The dispute distribution subsystem can instructed to send that message out to +The dispute distribution subsystem can get instructed to send that message out to all concerned validators by means of a `DisputeDistributionMessage::SendDispute` message. That message must contain an invalid vote from the local node and some valid one, e.g. a backing statement. 
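To illustrate the invariant such a message carries — two conflicting votes for the same candidate in the same session — here is a hedged sketch with simplified stand-ins for the wire-format types (field layout assumed; signatures and statement kinds omitted):

```rust
/// Simplified stand-in for the `VoteSubject` above; not the actual wire type.
#[derive(Clone, PartialEq)]
struct VoteSubject {
    candidate_hash: [u8; 32],
    candidate_session: u32,
    validator_index: u32,
}

/// Simplified stand-in for `DisputeRequest`: one invalid and one valid vote.
struct DisputeRequest {
    invalid_vote: VoteSubject,
    valid_vote: VoteSubject,
}

/// A `DisputeRequest` only evidences a dispute if both opposing votes refer
/// to the same candidate in the same session (signatures checked elsewhere).
fn is_well_formed(req: &DisputeRequest) -> bool {
    let (i, v) = (&req.invalid_vote, &req.valid_vote);
    i.candidate_hash == v.candidate_hash && i.candidate_session == v.candidate_session
}

fn main() {
    let subject = |session| VoteSubject {
        candidate_hash: [1; 32],
        candidate_session: session,
        validator_index: 0,
    };
    let good = DisputeRequest { invalid_vote: subject(3), valid_vote: subject(3) };
    let bad = DisputeRequest { invalid_vote: subject(3), valid_vote: subject(4) };
    assert!(is_well_formed(&good));
    assert!(!is_well_formed(&bad));
}
```

Note that the two votes may even come from the same validator: voting both ways on one candidate is an equivocation and a valid reason for a dispute as well.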
-We include a valid vote as well so any node regardless of whether it is synced
+We include a valid vote as well, so any node regardless of whether it is synced
 with the chain or not or has seen backing/approval vote can see that there are
 conflicting votes available, hence we have a valid dispute. Nodes will still
 need to check whether the disputing votes are somewhat current and not some
@@ -121,7 +123,7 @@ stale ones.

 ### Participating in a Dispute

-Upon receiving a `Dispute` message, a dispute distribution will trigger the
+Upon receiving a `DisputeRequest` message, the dispute distribution subsystem will trigger the
 import of the received votes via the dispute coordinator
 (`DisputeCoordinatorMessage::ImportStatements`). The dispute coordinator will
 take care of participating in that dispute if necessary. Once it is done, the
@@ -133,38 +135,109 @@ initially received `Invalid` vote.

 ### Sending of messages

-Starting and participting in a dispute are pretty similar from the perspective
+Starting and participating in a dispute are pretty similar from the perspective
 of dispute distribution. Once we receive a `SendDispute` message we try to
 make sure to get the data out. We keep track of all the parachain validators that
 should see the message, which are all the parachain validators of the session
 where the dispute happened as they will want to participate in the dispute. In
 addition we also need to get the votes out to all authorities of the current
-session (which might be the same or not). Those authorities will not
-participtate in the dispute, but need to see the statements so they can include
-them in blocks.
+session (which might be the same or not and may change during the dispute).
+Those authorities will not participate in the dispute, but need to see the
+statements so they can include them in blocks.
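The set of recipients described above — all parachain validators of the dispute's session plus all current-session authorities — could be computed roughly like this (`AuthorityId` is a stand-in for the real authority identifier, not the actual subsystem code):

```rust
use std::collections::HashSet;

/// Stand-in for the real authority identifier type.
type AuthorityId = u64;

/// The set of nodes a dispute statement must reach: the parachain validators
/// of the dispute's session, plus all current-session authorities. The latter
/// set may change while the dispute is live, so this is recomputed as needed.
fn delivery_set(
    dispute_session_validators: &[AuthorityId],
    current_authorities: &[AuthorityId],
) -> HashSet<AuthorityId> {
    dispute_session_validators
        .iter()
        .chain(current_authorities.iter())
        .copied()
        .collect()
}

fn main() {
    // Authority 3 appears in both sets but is only contacted once.
    let set = delivery_set(&[1, 2, 3], &[3, 4]);
    assert_eq!(set.len(), 4);
    assert!(set.contains(&1) && set.contains(&4));
}
```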
We keep track of connected parachain validators and authorities and will issue warnings in the logs if connected nodes are less than two thirds of the corresponding sets. We also only consider a message transmitted, once we -received a confirmation message. If not we will keep retrying getting that +received a confirmation message. If not, we will keep retrying getting that message out as long as the dispute is deemed alive. To determine whether a dispute is still alive we will issue a `DisputeCoordinatorMessage::ActiveDisputes` message before each retry run. Once -a dispute is no longer live, we will clean up the state coordingly. - -To cather with spam issues, we will in a first implementation only consider -disputes of already included data. Therefore only for candidates that are -already available. These are the only disputes representing an actual threat to -the system and are also the easiest to implement with regards to spam. - -Votes can still be old/ not relevant. In this case we will drop those messages -and we might want to decrease reputation of peers sending old data. +a dispute is no longer live, we will clean up the state accordingly. -### Reception +### Reception & Spam Considerations Because we are not forwarding foreign statements, spam is not so much of an issue. Rate limiting should be implemented at the substrate level, see -[#7750](https://github.com/paritytech/substrate/issues/7750). +[#7750](https://github.com/paritytech/substrate/issues/7750). Still we should +make sure that it is not possible via spamming to prevent a dispute concluding +or worse from getting noticed. + +Considered attack vectors: + +1. A flood of invalid statements could make us run out of resources. E.g. if we + recorded every statement, we could run out of disk space eventually. +2. 
An attacker can just flood us with notifications on any notification + protocol, assuming flood protection is not effective enough, our unbounded + buffers can fill up and we will run out of memory eventually. +3. Attackers could spam us with invalid disputes (dispute statements that don't + exist). Our incoming queue of requests could get monopolized by those + malicious requests and we won't be able to import any valid disputes and we + could run out of resources. + +For tackling 1, we make sure to not occupy resources before we don't know a +candidate is available. So we will not record those statements to disk until we +recovered availability for the candidate or know by some other means that it is +valid. + +For 2, we will pick up on any dispute on restart, so assuming that any realistic +memory filling attack will take some time, we should be able to participate in a +dispute under such attacks. + +For 3, full monopolization of the incoming queue should not be possible assuming +substrate handles incoming requests in a somewhat fair way. Still we want some defense mechanisms, at the very least we need to make sure to not exhaust resources. + +The dispute coordinator will notify us +via `DisputeDistributionMessage::ReportCandidateUnavailable` about unavailable +candidates and we can disconnect from such peers/decrease their reputation +dramatically. This alone should get us quite far with regards to queue +monopolization, as availability recovery is expected to fail relatively quickly +for unavailable data. + +Still if those spam messages come at a very high rate, we might still run out of +resources if we immediately call `DisputeCoordinatorMessage::ImportStatements` +on each one of them. Secondly with our assumption of 1/3 dishonest validators, +getting rid of all of them will take some time, depending on reputation timeouts +some of them might even be able to reconnect eventually. 
+ +To mitigate those issues we will process dispute messages with a maximum +parallelism `N`. We initiate import processes for up to `N` candidates in +parallel. Once we reached `N` parallel requests we will start back pressuring on +the incoming requests. This saves us from resource exhaustion. + +To reduce impact of malicious nodes further, we can keep track from which nodes the +currently importing statements came from and will drop requests from nodes that +already have imports in flight. + +Honest nodes are not expected to send dispute statements at a high rate, but +even if they did - we will import at least the first one and if it is valid it +will trigger a dispute, preventing finality. For the dropped request any honest +node will retry sending. Also there will be other nodes notifying us about that +dispute and chances are good that the first sent candidate is indeed the oldest +one (if they differ in age at all). So this general rate limit, that we drop +requests if they come faster than we can import the statements should not cause +any problems. At the same time we maximize chances for other validators to be +able to notify us about disputes quickly. Whether this optimization is worth +the trouble is up for discussion. + +Size of `N`: For performance it makes sense to make `N` greater then one for +some pipelining. If we implement the above optimization, a greater `N` allows us +to take care of colluding spammers a bit better. It should defintely not be too +large as multiple availability recovery processes just slow each other down and +we want one to finish. If we don't go with the above optimization, I would +suggest + +The larger `N` the better we can handle distributed flood attacks, +but we also get potentially more availability recovery processes happening at +the same time, which slows down the individual processes. And we rather want to +have one finish quickly than lots slowly at the same time. 
On the other hand +valid disputes are expected to be rare, so if we ever exhaust `N` it is very +likely that this is caused by spam and spam recoveries don't cost too much +bandwidth due to empty responses. + +Considering that an attacker would need to attack many nodes in parallel to have +any effect, an `N` of 10 seems to be a good compromise. For honest requests, most +of those imports will likely concern the same candidate, and for dishonest ones +we get to disconnect from up to ten colluding adversaries at a time. ### Node Startup @@ -183,7 +256,7 @@ We assume that under normal operation each node will be aware of backing and approval votes and optimize for that case. Nevertheless we want disputes to conclude fast and reliable, therefore if a node is not aware of backing/approval votes it can request the missing votes from the node that informed it about the -dispute. +dispute (see [Resiliency](#Resiliency]) ## Resiliency @@ -206,16 +279,15 @@ votes. E.g. after some timeout, if no majority was reached yet in their point of view or if they are not aware of any backing/approval votes for a received disputed candidate. -The receiver of a `IHaveVotesRequests` message will do the following: +The receiver of a `IHaveVotesRequest` message will do the following: 1. See if the sender is missing votes we are aware of - if so, respond with - those votes. Also send votes of equivocating validators, no matter the - bitfield. + those votes. 2. Check whether the sender knows about any votes, we don't know about and if so - send a `IHaveVotes` request back, with our knowledge. + send a `IHaveVotesRequest` request back, with our knowledge. 3. Record the peer's knowledge. -When to send `IHaveVotes` messages: +When to send `IHaveVotesRequest` messages: 1. Whenever we are asked to do so via `DisputeDistributionMessage::FetchMissingVotes`. @@ -236,25 +308,23 @@ warnings accordingly. 
As disputes are rare and TCP is a reliable protocol, probably each failed
 attempt should trigger a warning in logs and also be logged to some Prometheus
 metric.

-## Disputes for non included candidates
+## Disputes for non-available candidates

-If deemed necessary we can later on also support disputes for non included
+If deemed necessary we can later on also support disputes for non-available
 candidates, but disputes for those cases have totally different requirements.

 First of all such disputes are not time critical. We just want to have some
 offender slashed at some point, but we have no risk of finalizing any bad data.

-Second, we won't have availability for such data, but it also really does not
-matter as we have relaxed timing requirements as just mentioned. Instead a node
-disputing non included candidates, will be responsible for providing the
-disputed data initially. Then nodes which did the check already are also
-providers of the data, hence distributing load and making prevention of the
-dispute from concluding harder and harder over time. Assuming an attacker can
-not DoS a node forever, the dispute will succeed eventually, which is all that
-matters. And again, even if an attacker managed to prevent such a dispute from
-happening somehow, there is no real harm done: There was no serious attack to
-begin with.
+Second, as we won't have availability for such data, the node that initiated the
+dispute will be responsible for providing the disputed data initially. Then
+nodes which did the check already are also providers of the data, hence
+distributing load and making prevention of the dispute from concluding harder
+and harder over time. Assuming an attacker can not DoS a node forever, the
+dispute will succeed eventually, which is all that matters. And again, even if
+an attacker managed to prevent such a dispute from happening somehow, there is
+no real harm done: There was no serious attack to begin with.
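The first step of the vote-recovery exchange sketched in the Resiliency section — answering an `IHaveVotesRequest` with the votes the requester is missing — amounts to a bitfield difference. A minimal sketch, with plain boolean slices standing in for the real `Bitfield` type:

```rust
/// Indices of validators whose votes we hold but the requester is missing;
/// these are the votes to put into `VotesResponse::missing`. `ours` and
/// `theirs` are per-validator "do we have this vote" bitfields.
fn missing_for_requester(ours: &[bool], theirs: &[bool]) -> Vec<usize> {
    (0..ours.len())
        // Keep indices where we have the vote but the requester does not.
        .filter(|&i| ours[i] && !theirs.get(i).copied().unwrap_or(false))
        .collect()
}

fn main() {
    let ours = [true, true, false, true];
    let theirs = [true, false, false, false];
    // We can supply the votes of validators 1 and 3.
    assert_eq!(missing_for_requester(&ours, &theirs), vec![1, 3]);
}
```

In the actual protocol this is done once for the valid-vote bitfield and once for the invalid-vote bitfield, and the same comparison in the other direction tells us whether to send an `IHaveVotesRequest` back.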
[DisputeDistributionMessage]: ../../types/overseer-protocol.md#dispute-distribution-message
[RuntimeApiMessage]: ../../types/overseer-protocol.md#runtime-api-message

From 70dec4e3af88b343470782c6d28f8ece608d6d07 Mon Sep 17 00:00:00 2001
From: Robert Klotzner
Date: Sun, 13 Jun 2021 19:42:03 +0200
Subject: [PATCH 08/10] More spam considerations.

---
 .../src/node/disputes/dispute-distribution.md | 85 +++++++++++--------
 1 file changed, 48 insertions(+), 37 deletions(-)

diff --git a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
index 8d9f940d8280..d63d41b8cf2d 100644
--- a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
+++ b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
@@ -156,40 +156,44 @@ a dispute is no longer live, we will clean up the state accordingly.

 ### Reception & Spam Considerations

-Because we are not forwarding foreign statements, spam is not so much of
-an issue. Rate limiting should be implemented at the substrate level, see
+Because we are not forwarding foreign statements, spam is not so much of an
+issue as in other subsystems. Rate limiting should be implemented at the
+substrate level, see
 [#7750](https://github.com/paritytech/substrate/issues/7750). Still we should
 make sure that it is not possible via spamming to prevent a dispute concluding
 or worse from getting noticed.

 Considered attack vectors:

-1. A flood of invalid statements could make us run out of resources. E.g. if we
-   recorded every statement, we could run out of disk space eventually.
+1. Invalid disputes (candidate does not exist) could make us
+   run out of resources. E.g. if we recorded every statement, we could run out
+   of disk space eventually.
 2.
An attacker can just flood us with notifications on any notification
    protocol, assuming flood protection is not effective enough, our unbounded
    buffers can fill up and we will run out of memory eventually.
-3. Attackers could spam us with invalid disputes (dispute statements that don't
-   exist). Our incoming queue of requests could get monopolized by those
-   malicious requests and we won't be able to import any valid disputes and we
-   could run out of resources.
+3. Attackers could spam us at a high rate with invalid disputes. Our incoming
+   queue of requests could get monopolized by those malicious requests and we
+   won't be able to import any valid disputes and we could run out of resources,
+   if we tried to process them all in parallel.

-For tackling 1, we make sure to not occupy resources before we don't know a
-candidate is available. So we will not record those statements to disk until we
-recovered availability for the candidate or know by some other means that it is
-valid.
+For tackling 1, we make sure to not occupy resources before we know a
+candidate is available. So we will not record statements to disk until we
+recovered availability for the candidate or know by some other means that the
+dispute is legit.

 For 2, we will pick up on any dispute on restart, so assuming that any realistic
 memory filling attack will take some time, we should be able to participate in a
 dispute under such attacks.

 For 3, full monopolization of the incoming queue should not be possible assuming
-substrate handles incoming requests in a somewhat fair way. Still we want some defense mechanisms, at the very least we need to make sure to not exhaust resources.
+substrate handles incoming requests in a somewhat fair way. Still we want some
+defense mechanisms, at the very least we need to make sure to not exhaust
+resources.

 The dispute coordinator will notify us
 via `DisputeDistributionMessage::ReportCandidateUnavailable` about unavailable
 candidates and we can disconnect from such peers/decrease their reputation
-dramatically.
This alone should get us quite far with regards to queue +drastically. This alone should get us quite far with regards to queue monopolization, as availability recovery is expected to fail relatively quickly for unavailable data. @@ -209,36 +213,43 @@ currently importing statements came from and will drop requests from nodes that already have imports in flight. Honest nodes are not expected to send dispute statements at a high rate, but -even if they did - we will import at least the first one and if it is valid it -will trigger a dispute, preventing finality. For the dropped request any honest -node will retry sending. Also there will be other nodes notifying us about that -dispute and chances are good that the first sent candidate is indeed the oldest -one (if they differ in age at all). So this general rate limit, that we drop -requests if they come faster than we can import the statements should not cause -any problems. At the same time we maximize chances for other validators to be -able to notify us about disputes quickly. Whether this optimization is worth -the trouble is up for discussion. - -Size of `N`: For performance it makes sense to make `N` greater then one for -some pipelining. If we implement the above optimization, a greater `N` allows us -to take care of colluding spammers a bit better. It should defintely not be too -large as multiple availability recovery processes just slow each other down and -we want one to finish. If we don't go with the above optimization, I would -suggest - -The larger `N` the better we can handle distributed flood attacks, -but we also get potentially more availability recovery processes happening at -the same time, which slows down the individual processes. And we rather want to -have one finish quickly than lots slowly at the same time. 
On the other hand
-valid disputes are expected to be rare, so if we ever exhaust `N` it is very
-likely that this is caused by spam and spam recoveries don't cost too much
-bandwidth due to empty responses.
+even if they did:
+
+- We will import at least the first one and if it is valid it will trigger a
+  dispute, preventing finality.
+- Chances are good that the first sent candidate from a peer is indeed the
+  oldest one (if they differ in age at all).
+- For the dropped request any honest node will retry sending.
+- There will be other nodes notifying us about that dispute as well.
+- Honest votes have a speed advantage on average. Apart from the very first
+  dispute statement for a candidate, which might cause the availability recovery
+  process, imports of honest votes will be super fast, while for spam imports
+  they will always take some time as we have to wait for availability to fail.
+
+So this general rate limit (dropping requests from the same peer when they
+arrive faster than we can import the statements) should not cause any problems
+for honest nodes and is in their favour.
+
+Size of `N`: The larger `N`, the better we can handle distributed flood attacks
+(see previous paragraph), but we also get potentially more availability recovery
+processes happening at the same time, which slows down the individual processes.
+And we rather want to have one finish quickly than lots slowly at the same time.
+On the other hand, valid disputes are expected to be rare, so if we ever exhaust
+`N` it is very likely that this is caused by spam and spam recoveries don't cost
+too much bandwidth due to empty responses.

 Considering that an attacker would need to attack many nodes in parallel to have
 any effect, an `N` of 10 seems to be a good compromise. For honest requests, most
 of those imports will likely concern the same candidate, and for dishonest ones
 we get to disconnect from up to ten colluding adversaries at a time.
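The combination of a parallelism bound `N` and dropping repeated requests from peers that already have an import in flight could look roughly like the following sketch (types and names are illustrative, not the actual subsystem code):

```rust
use std::collections::HashSet;

/// Stand-in for the real peer/authority identifier.
type PeerId = u64;

/// What to do with an incoming dispute request.
enum Admission {
    /// Start an import task for this request.
    Accept,
    /// Peer already has an import in flight - drop, the peer will retry.
    DropDuplicate,
    /// `N` imports already running - back-pressure the incoming queue.
    BackPressure,
}

/// Tracks dispute-import tasks currently in flight, enforcing a maximum
/// parallelism `N` and at most one import per peer at a time.
struct ImportLimiter {
    max_parallel: usize,
    in_flight: HashSet<PeerId>,
}

impl ImportLimiter {
    fn new(max_parallel: usize) -> Self {
        Self { max_parallel, in_flight: HashSet::new() }
    }

    fn admit(&mut self, peer: PeerId) -> Admission {
        if self.in_flight.contains(&peer) {
            Admission::DropDuplicate
        } else if self.in_flight.len() >= self.max_parallel {
            Admission::BackPressure
        } else {
            self.in_flight.insert(peer);
            Admission::Accept
        }
    }

    /// Called once the import finished (successfully or not).
    fn complete(&mut self, peer: PeerId) {
        self.in_flight.remove(&peer);
    }
}

fn main() {
    let mut limiter = ImportLimiter::new(2);
    assert!(matches!(limiter.admit(1), Admission::Accept));
    // Same peer again while its import is still running: dropped.
    assert!(matches!(limiter.admit(1), Admission::DropDuplicate));
    assert!(matches!(limiter.admit(2), Admission::Accept));
    // A third distinct peer exceeds N = 2: back-pressure.
    assert!(matches!(limiter.admit(3), Admission::BackPressure));
    limiter.complete(1);
    assert!(matches!(limiter.admit(3), Admission::Accept));
}
```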
+For the size of the channel for incoming requests: Due to dropping of repeated +requests from same nodes we can make the channel relatively large without fear +of lots of spam requests sitting there wasting our time, even after we already +blocked a peer. For valid disputes, incoming requests can become bursty. On the +other hand we will also be very quick in processing them. A channel size of 50 +requests seems plenty and should be able to handle bursts adequately. + ### Node Startup On startup we need to check with the dispute coordinator for any ongoing From eaa87198b20abc79feee1aa873d42c28aa729bb0 Mon Sep 17 00:00:00 2001 From: Robert Klotzner Date: Sun, 13 Jun 2021 20:16:10 +0200 Subject: [PATCH 09/10] More fixes. --- .../src/node/disputes/dispute-coordinator.md | 9 ++++++--- .../implementers-guide/src/types/overseer-protocol.md | 2 +- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/roadmap/implementers-guide/src/node/disputes/dispute-coordinator.md b/roadmap/implementers-guide/src/node/disputes/dispute-coordinator.md index 90cc75b2e275..794fc5a08069 100644 --- a/roadmap/implementers-guide/src/node/disputes/dispute-coordinator.md +++ b/roadmap/implementers-guide/src/node/disputes/dispute-coordinator.md @@ -2,7 +2,7 @@ This is the central subsystem of the node-side components which participate in disputes. This subsystem wraps a database which tracks all statements observed by all validators over some window of sessions. Votes older than this session window are pruned. -This subsystem will be the point which produce dispute votes, eiuther positive or negative, based on locally-observed validation results as well as a sink for votes received by other subsystems. When importing a dispute vote from another node, this will trigger the [dispute participation](dispute-participation.md) subsystem to recover and validate the block and call back to this subsystem. 
+This subsystem will be the point which produces dispute votes, either positive or negative, based on locally-observed validation results as well as a sink for votes received by other subsystems. When importing a dispute vote from another node, this will trigger the [dispute participation](dispute-participation.md) subsystem to recover and validate the block and call back to this subsystem.

 ## Database Schema

@@ -85,7 +85,7 @@ Do nothing.
 * Load from underlying DB by querying `("candidate-votes", session, candidate_hash)`. If that does not exist, create fresh with the given candidate receipt.
 * If candidate votes is empty and the statements only contain dispute-specific votes, return.
 * Otherwise, if there is already an entry from the validator in the respective `valid` or `invalid` field of the `CandidateVotes`, return.
-* Add an entry to the respective `valid` or `invalid` list of the `CandidateVotes` for each statement in `statements`.
+* Add an entry to the respective `valid` or `invalid` list of the `CandidateVotes` for each statement in `statements`.
 * Write the `CandidateVotes` to the underlying DB.
 * If both the `valid` and `invalid` lists now have non-zero length where previously one or both had zero length, the candidate is now freshly disputed.
 * If freshly disputed, load `"active-disputes"` and add the candidate hash and session index. Also issue a [`DisputeParticipationMessage::Participate`][DisputeParticipationMessage].
@@ -104,8 +104,11 @@ Do nothing.

 ### On `DisputeCoordinatorMessage::IssueLocalStatement`

 * Deconstruct into parts `{ session_index, candidate_hash, candidate_receipt, is_valid }`.
-* Construct a [`DisputeStatement`][DisputeStatement] based on `Valid` or `Invalid`, depending on the parameterization of this routine.
+* Construct a [`DisputeStatement`][DisputeStatement] based on `Valid` or `Invalid`, depending on the parameterization of this routine.
* Sign the statement with each key in the `SessionInfo`'s list of parachain validation keys which is present in the keystore, except those whose indices appear in `voted_indices`. This will typically just be one key, but this does provide some future-proofing for situations where the same node may run on behalf of multiple validators. At the time of writing, this is not a use-case we support as other subsystems do not invariably provide this guarantee.
+* Write statement to DB.
+* Send a `DisputeDistributionMessage::SendDispute` message to get the vote
+  distributed to other validators.

 ### On `DisputeCoordinatorMessage::DetermineUndisputedChain`

diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md
index 27ab248e4699..a8d6da09a6b3 100644
--- a/roadmap/implementers-guide/src/types/overseer-protocol.md
+++ b/roadmap/implementers-guide/src/types/overseer-protocol.md
@@ -452,7 +452,7 @@ enum DisputeDistributionMessage {
         known_valid_votes: Bitfield,
         known_invalid_votes: Bitfield,
         /// Optional validator to query from. `ValidatorIndex` as in the above
-        referenced session.
+        /// referenced session.
         from_validator: Option<ValidatorIndex>,
     }

     /// Tell the subsystem that a candidate is not available. Dispute distribution

From b9f26fb1c31685705264b74d465eaa02b79e6207 Mon Sep 17 00:00:00 2001
From: Robert Klotzner
Date: Mon, 14 Jun 2021 15:41:51 +0200
Subject: [PATCH 10/10] Fixes + add note about not dispute participating nodes.
---
 .../src/node/disputes/dispute-distribution.md | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
index d63d41b8cf2d..c8e47cd0bdd3 100644
--- a/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
+++ b/roadmap/implementers-guide/src/node/disputes/dispute-distribution.md
@@ -29,7 +29,7 @@ This design should result in a protocol that is:

 #### Disputes

-Protocol: "/polkadot/dispute/1"
+Protocol: "/polkadot/send\_dispute/1"

 Request:
@@ -60,7 +60,7 @@ struct VoteSubject {
     /// The session the candidate appears in.
     candidate_session: SessionIndex,
     /// The validator signature, that can be verified when constructing a
-    `SignedDisputeStatement`.
+    /// `SignedDisputeStatement`.
     validator_signature: ValidatorSignature,
 }
 ```
@@ -75,7 +75,7 @@ enum DisputeResponse {

 #### Vote Recovery

-Protocol: "/polkadot/vote-recovery/1"
+Protocol: "/polkadot/req\_votes/1"

 ```rust
 struct IHaveVotesRequest {
@@ -133,6 +133,15 @@ except that if the local node deemed the candidate valid, the `SendDispute`
 message will contain a valid vote signed by our node and will contain the
 initially received `Invalid` vote.

+Note that we rely on the coordinator to check availability for spam protection
+(see below).
+In case the current node is only a potential block producer and does not
+actually need to recover availability (as it is not going to participate in the
+dispute), there is a potential optimization available: The coordinator could
+first just check whether we have our piece and only if we don't, try to recover
+availability. Our node having a piece would be proof enough that the data is
+available and thus that the dispute is not spam.
+ ### Sending of messages Starting and participating in a dispute are pretty similar from the perspective @@ -247,7 +256,7 @@ For the size of the channel for incoming requests: Due to dropping of repeated requests from same nodes we can make the channel relatively large without fear of lots of spam requests sitting there wasting our time, even after we already blocked a peer. For valid disputes, incoming requests can become bursty. On the -other hand we will also be very quick in processing them. A channel size of 50 +other hand we will also be very quick in processing them. A channel size of 100 requests seems plenty and should be able to handle bursts adequately. ### Node Startup