Initial guide text for approvals and especially approvals assignments #1518

burdges · 2020-08-01T13:42:22Z

No description provided.

cla-bot-2020 · 2020-08-01T13:42:24Z

@burdges it looks like you have not signed our contributor license agreement yet. Please visit this link to sign our agreement. This pull request cannot be merged until the agreement is signed.

burdges · 2020-08-02T09:22:56Z

roadmap/implementers-guide/src/node/validity/approvals.md

+
+### Future work
+
+We could consider additional gossip messages with which nodes claims "slow availability" and/or "slow candidate" to fine tune the assignments "no show" system, but long enough "no show" delays suffice probably.


https://github.com/paritytech/polkadot/issues/1269#issuecomment-667650052

This should avoid the political problems with validator operators wanting everything to be a remote signer.

roadmap/implementers-guide/src/node/validity/approvals.md

rphmeier · 2020-08-05T12:17:55Z

roadmap/implementers-guide/src/node/validity/approvals.md

+
+- **Assignments** ensures that each candidates receives enough random checkers, while reducing adversaries odds for obtaining enough checkers, and limiting adversaries foreknowledge.  It tracks approval votes to identify "no show" approval check takes suspiciously long, perhaps indicating the node being under attack, and assigns more checks in this case.  It tracks relay chain equivocations to determine when adversaries possibly gained foreknowledge about assignments, and adds additional checks in this case.
+
+- **Approval checks** listens to the assignments subsystem for outgoing assignment notices that we shall check specific candidates.  It then performs these checks by first invoking the reconstruction subsystem to obtain the candidate, second invoking the candidate validity utility subsystem upon the candidate, and finally sending out an approval vote, or perhaps initiating a dispute.


This references an assignments subsystem which hasn't been defined.

What are "outgoing assignment notices"? Are these notifications from some other piece of code that we need to be checking some particular candidate?

Great that this references reconstruction & candidate validity : ) - that's exactly how this will be implemented.
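As a rough illustration of the flow described in the quoted text (reconstruct the candidate, validate it, then vote or dispute), here is a minimal Rust sketch. Every name in it (`Reconstruction`, `CandidateValidation`, `CheckOutcome`, `approval_check`, and the toy stand-ins) is hypothetical and not taken from the PR. The `Unavailable` outcome is an assumption, since the text does not say what happens when reconstruction fails.

```rust
/// Result of a single approval check (hypothetical names).
#[derive(Debug, PartialEq)]
enum CheckOutcome {
    /// The candidate was reconstructed and found valid: send an approval vote.
    ApprovalVote,
    /// The candidate was reconstructed but found invalid: initiate a dispute.
    Dispute,
    /// Assumption: reconstruction failed, so no vote can be cast yet.
    Unavailable,
}

/// Stand-in for the reconstruction subsystem.
trait Reconstruction {
    fn reconstruct(&self, candidate_hash: u64) -> Option<Vec<u8>>;
}

/// Stand-in for the candidate validity utility subsystem.
trait CandidateValidation {
    fn is_valid(&self, candidate: &[u8]) -> bool;
}

/// Perform one approval check for an outgoing assignment notice.
fn approval_check<R: Reconstruction, V: CandidateValidation>(
    reconstruction: &R,
    validation: &V,
    candidate_hash: u64,
) -> CheckOutcome {
    match reconstruction.reconstruct(candidate_hash) {
        None => CheckOutcome::Unavailable,
        Some(data) if validation.is_valid(&data) => CheckOutcome::ApprovalVote,
        Some(_) => CheckOutcome::Dispute,
    }
}

/// Toy reconstruction that always returns the same data (illustration only).
struct FixedStore(Option<Vec<u8>>);

impl Reconstruction for FixedStore {
    fn reconstruct(&self, _candidate_hash: u64) -> Option<Vec<u8>> {
        self.0.clone()
    }
}

/// Toy validity rule: any non-empty candidate is "valid" (illustration only).
struct NonEmptyIsValid;

impl CandidateValidation for NonEmptyIsValid {
    fn is_valid(&self, candidate: &[u8]) -> bool {
        !candidate.is_empty()
    }
}
```

Keeping the two subsystems behind traits mirrors the separation the review comment praises: the approval check only orchestrates, and the real reconstruction and validation logic stays elsewhere.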

rphmeier · 2020-08-05T12:21:30Z

roadmap/implementers-guide/src/node/validity/approvals.md

+
+### Approval keys
+
+We need two separate keys for the approval subsystem


Oh, so this implies yet another session key?

Not sure if we have yet migrated a Substrate chain to add more session keys, but it should be doable. However, there is a bootstrapping concern here. When migrating the chain, nobody will have yet registered the extra key, but we can't just throw out the validator set.

So I think the process would be that we'd have to add the session key, make an announcement that everyone should rotate session keys, and then enable parachains.

Or is there some way that both of these keys can be the same? It would make the practicalities of upgrading the relay-chain much simpler.

We could make the approval vote key be the grandpa key. We could always separate them if some distant future super-sentry node iteration would support consensus running on a separate machine from candidate worker nodes or whatever.

We need the assignments key to be some new key immune to large slashes, or else some validators would ask that assignments be run in a remote signer, which sounds absolutely nightmarish.

I see, that makes a lot of sense. Expounding on the "risk" these keys have as the reason for the separation would make sense in this section. I'm fine with reusing the GRANDPA key for that, although we will be upgrading that to BLS at some point. Maybe it makes most sense to just upgrade and add the extra key type, put out an announcement that every validator should rotate their session keys, and then enable parachains a few weeks later.

We actually cannot wholly replace Schnorr with BLS because BLS verification seems just too slow. We'll likely add a BLS12-381 G1 point with a Schnorr proof-of-possession as either the only or as a second GRANDPA public key, but we've then two choices:

1. We could sign GRANDPA messages first with this BLS public key and then sign that signed message with the Ed25519 public key. We could even replace Ed25519 with Rabin-Williams here, which gets shockingly fast. We'll need a slashing condition for when something messy happens.

2. We sign GRANDPA messages with a Schnorr VRF using this BLS public key, so verification runs much slower than Ed25519, but still vastly faster than BLS signatures. At this point the VRF pre-output actually is a BLS signature, however, so we can transition smoothly to BLS verification whenever we gain enough signatures for aggregation to help, and doing this avoids any slashing conditions.

I'd wager 1 sounds easiest, so maybe Ed25519 would stick around for quite a while.

I now just say to use Ed25519 for approval vote keys, which along with the previous tweaks seemingly finishes this one. Anything else?

roadmap/implementers-guide/src/node/validity/approvals.md

rphmeier · 2020-08-05T12:31:31Z

roadmap/implementers-guide/src/node/validity/approvals.md

+We could consider additional gossip messages with which nodes claims "slow availability" and/or "slow candidate" to fine tune the assignments "no show" system, but long enough "no show" delays suffice probably.
+
+We shall develop more practical experience with UDP once the availability system works using direct UDP connections.  In this, we should discover if reconstruction performs adequately with a complete graphs or  
+benefits from topology restrictions.  At this point, an assignment notices could implicitly request pieces from a random 1/3rd, perhaps topology restricted, which saves one gossip round.  If this preliminary fast reconstruction fails, then nodes' request alternative pieces directly.  There is an interesting design space in how this overlaps with "slow availability" claims.


oh, that's cool. cc @infinity0

roadmap/implementers-guide/src/node/validity/assignments.md

rphmeier · 2020-08-05T12:44:43Z

roadmap/implementers-guide/src/node/validity/assignments.md

+
+We liberate availability cores when their candidate becomes available of course, but one approval assignment criteria continues associating each candidate with the core number it occupied when it became available. 
+
+Assignment operates in loosely timed rounds determined by this `DelayTranche`s, which proceed roughly 12 times faster than six second block production assuming half second gossip times.  If a candidate `C` needs more approval checkers by the time we reach round `t` then any validators with an assignment to `C` in delay tranche `t` gossip their send assignment notice for `C`.  We continue until all candidates have enough approval checkers assigned.  We take entire tranches together if we do not yet have enough, so we expect strictly more than enough checkers.  We also take later tranches if some checkers return their approval votes too slow (see no shows below).  
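
The "take entire tranches together" rule in the quoted paragraph can be sketched in Rust as follows. This is only an illustration under stated assumptions: the function name `tranches_to_take` and the representation of announced assignments as a per-tranche count are hypothetical, not part of the PR.

```rust
/// Decide how many delay tranches to take, whole, for one candidate.
///
/// `per_tranche[t]` is the number of checkers assigned in delay tranche `t`
/// (hypothetical representation). Returns the number of tranches taken and
/// the total number of checkers that yields, or `None` if even all known
/// tranches together do not reach `needed`.
fn tranches_to_take(per_tranche: &[u32], needed: u32) -> Option<(usize, u32)> {
    let mut total = 0u32;
    let mut taken = 0usize;
    // Tranches are never split: each one is added in full, so the total
    // usually overshoots `needed` rather than hitting it exactly.
    while total < needed {
        total += *per_tranche.get(taken)?;
        taken += 1;
    }
    Some((taken, total))
}
```

For example, with 3, 4 and 5 checkers in tranches 0, 1 and 2 and a target of 6, the sketch takes two tranches and ends up with 7 checkers, one more than needed. One way a caller might model no-shows is to raise `needed`, which pulls in later tranches. That is an assumption of this sketch, not something the quoted text specifies.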


Another point on delay-tranches. It seems that there is no consensus on which delay-tranches should be used.

For reconstruction and gossip, this seems important. If I receive a reconstruction request, I want it to be legitimized by an assignment proof.

And as I gossip assignments, I will only want to gossip assignments from tranches that I believe should be active. However, how are my peers supposed to know what I accept and what I drop?

The common thread here is to make sure that there is no way for a single validator to create an unbounded amount of assignment proofs that other nodes are forced to circulate or respond to for some reason.

> Another point on delay-tranches. It seems that there is no consensus on which delay-tranches should be used.

It's one tranche every k seconds after the relay chain block's slot. I've two numbers in the code: delay tranches start from zero at the relay chain block's slot, while AnV slots, computed as `12 * relay_chain_slot + delay_tranche`, give an absolute clock. I'll let someone else figure out which should be more or less exposed in the interface, etc.
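
To make the two numbers concrete, here is a small Rust sketch of the arithmetic in the comment above. Only the formula `12 * relay_chain_slot + delay_tranche` comes from the thread. The type aliases, the constant name, and both function names are hypothetical.

```rust
/// Hypothetical aliases; the actual types in the draft code may differ.
type Slot = u64;
type DelayTranche = u32;

/// Delay tranches per relay chain slot, from the factor of 12 in the comment.
const TRANCHES_PER_SLOT: u64 = 12;

/// Absolute AnV slot for a delay tranche counted from the relay chain
/// block's slot (tranche zero coincides with the block's slot).
fn anv_slot(relay_chain_slot: Slot, delay_tranche: DelayTranche) -> u64 {
    TRANCHES_PER_SLOT * relay_chain_slot + u64::from(delay_tranche)
}

/// Inverse direction: the delay tranche of a relay chain block at a given
/// AnV slot, or `None` if that AnV slot precedes the block's slot.
fn delay_tranche_at(relay_chain_slot: Slot, anv_slot_now: u64) -> Option<u64> {
    anv_slot_now.checked_sub(TRANCHES_PER_SLOT * relay_chain_slot)
}
```

With relay chain slot 10 and delay tranche 3, the AnV slot is 123. Going the other way, AnV slot 123 is tranche 3 for that block, while AnV slot 119 lies before the block's slot and yields `None`.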

> For reconstruction and gossip, this seems important. If I receive a reconstruction request, I want it to be legitimized by an assignment proof.

Yes and no, we could let validators reconstruct anything, but prioritize approval assignments.

> And as I gossip assignments, I will only want to gossip assignments from tranches that I believe should be active. However, how are my peers supposed to know what I accept and what I drop?

You need not drop anything:

- Approval votes are a huge deal, so gossip them always.

- Assignment notices are inherently somewhat limited in number due to being VRFs, so merely save them, and regossip them only when you believe they become viable.

We still need politeness for relay chain block knowledge of course.
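
The policy above (always forward approval votes, hold assignment notices until they look viable) could be sketched in Rust like this. All names are hypothetical. As a simplifying assumption, a notice counts as "viable" once its delay tranche is no later than the local current tranche, whereas the real criterion would also depend on whether the candidate still needs checkers.

```rust
use std::collections::BTreeMap;

/// Incoming gossip, identified by an opaque message id (hypothetical).
enum Message {
    ApprovalVote(u64),
    AssignmentNotice { tranche: u32, id: u64 },
}

/// Local gossip state: nothing is dropped, early notices are only held back.
#[derive(Default)]
struct GossipState {
    current_tranche: u32,
    /// Saved assignment notices, keyed by delay tranche.
    held: BTreeMap<u32, Vec<u64>>,
}

impl GossipState {
    /// Handle one message and return the ids to forward right now.
    fn on_message(&mut self, msg: Message) -> Vec<u64> {
        match msg {
            // Approval votes are always gossiped.
            Message::ApprovalVote(id) => vec![id],
            // Notices from tranches already reached are forwarded immediately.
            Message::AssignmentNotice { tranche, id } if tranche <= self.current_tranche => {
                vec![id]
            }
            // Notices from future tranches are saved rather than dropped.
            Message::AssignmentNotice { tranche, id } => {
                self.held.entry(tranche).or_default().push(id);
                Vec::new()
            }
        }
    }

    /// Advance the local tranche and release every held notice that is now viable.
    fn advance_to(&mut self, tranche: u32) -> Vec<u64> {
        self.current_tranche = self.current_tranche.max(tranche);
        let due: Vec<u32> = self
            .held
            .range(..=self.current_tranche)
            .map(|(t, _)| *t)
            .collect();
        let mut ready = Vec::new();
        for t in due {
            if let Some(ids) = self.held.remove(&t) {
                ready.extend(ids);
            }
        }
        ready
    }
}
```

Because each validator produces only a limited number of VRF-backed notices, the `held` map stays bounded in practice, which is the property the earlier question about unbounded assignment proofs was after.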

roadmap/implementers-guide/src/node/validity/assignments.md

roadmap/implementers-guide/src/node/validity/approvals.md

See w3f/research-internal#515

burdges · 2020-08-07T01:41:30Z

Added notes on parameters in 2b2e4f9

burdges · 2020-08-10T13:34:04Z

Added draft code PR in #1558 :)

burdges · 2020-08-10T13:47:13Z

We've added a discussion in ecfce2b about this scenario that came up in chat with @pepyakin :

A validator with a tranche zero (or other low) assignment never makes their announcement, perhaps because they postponed their work (which is allowed). Yet they then make this announcement later, right around finality. If this announcement gets on-chain (also allowed), then yes, it delays finality. If it does not get on-chain, then yes, we've one announcement that the off-chain consensus system says is valid, but the chain says was too slow.

In this case, the chain wins I'd think. Yet, if the chain wins here then this requires imposing some annoying universal delay upon finality. :( We could prevent nodes from delaying announcing their assignments by too much I think, but not sure about the parameters yet.

cla-bot-2020 · 2020-08-10T21:30:18Z

@burdges, Your signature has been received.

burdges · 2020-08-15T18:46:57Z

@rphmeier We should chat about the equivocation symmetry: if X and Y are equivocations that differ in parachain rho, then the included candidates X[rho] and Y[rho] differ. We risk some subsystem deciding X does not warrant work because Y looks better, but maybe X is an attack, X[rho] is invalid, and Y exists to distract from X. We could say all inclusions get checked, meaning no subsystem could decide X does not warrant work. We might need this for other chain distractions, like maybe X and Y are not even equivocations, but not necessarily. We could alternatively say the candidate equivocations X[rho] and Y[rho] should always be checked, even if we give up on X for other reasons.

* master:
  * Companion for Substrate #6815 (Dynamic Whitelist) (#1612)
  * Candidate backing respects scheduled collator (#1613)
  * implementers-guide: in TOC move collators before backing, to match protocol pipeline (#1611)
  * Initial guide text for approvals and especially approvals assignments (#1518)
  * Implement validation data refactor (#1585)
  * Implementer's Guide: Make HRMP use upward message kinds (#1591)

Initial guide modifications for approvals

c86c667

Split approval assignments keys and approval votes keys

2b93eeb

This should avoid the political problems with validator operators wanting everything to be a remote signer.