Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Clique Support #1032

Merged
merged 50 commits into from
Jan 28, 2021
Merged

Add Clique Support #1032

merged 50 commits into from
Jan 28, 2021

Conversation

holgerd77
Copy link
Member

@holgerd77 holgerd77 commented Jan 6, 2021

This PR aims to add clique support as defined in EIP-225 throughout the library stack, so likely touching Common, Block, Blockchain, eventually also Client.

Some references:

Still WIP.

Note: mention Block bug fix from def4177 in release notes

@codecov
Copy link

codecov bot commented Jan 6, 2021

Codecov Report

Merging #1032 (6721321) into master (01243f0) will decrease coverage by 0.01%.
The diff coverage is 86.01%.

Impacted file tree graph

Flag Coverage Δ
block 81.17% <85.95%> (+2.97%) ⬆️
blockchain 83.01% <88.27%> (+4.49%) ⬆️
client 87.87% <ø> (-0.28%) ⬇️
common 86.79% <100.00%> (-5.16%) ⬇️
devp2p 82.73% <ø> (+0.24%) ⬆️
ethash 82.08% <ø> (ø)
tx 90.00% <ø> (ø)
vm 83.09% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

@holgerd77
Copy link
Member Author

Ok, made some good progress here today and we at least have got some substantial start here, this will likely take some time to mature though, there are also still larger parts missing, in particular the signer voting (respectively the extraction of an up-to-date signer list from the voting info from the blocks, read the EIP if you are interested) and the signature validation, both likely for implementation in Blockchain.

If someone has a good idea where to get or how to easily generate PoA specific blockchain test data - so a file like testdata.json but with Rinkeby or Goerli blocks - that would really helpful, got a bit stuck on this front and at the end deactivated some tests for now.

@jochem-brouwer
Copy link
Member

@holgerd77 We only care about the blocks in this testdata right? The transactions in them don't really matter? If that's the case we can just pull the first ~10 blocks from Goerli/Rinkeby using web3 I'd say?

@holgerd77
Copy link
Member Author

@jochem-brouwer I had some problems with missing transactions as well, e.g. on running (testing) the general Block.validate() function, then the tx trie validation would fail. So optimally this would be including the tx data (if not so be it).

@holgerd77
Copy link
Member Author

This is now functionally ready I would say, namely this PR implements the following at this point:

  • Common now includes a PoA consensus configuration for the PoA clique networks (Rinkeby and Goerli)
  • Block performs various header field checks which are specific to Clique (timestamp diff, format of extraData which has a central role on the protocol)
  • Various helper functions for Block like cliqueIsEpochTransition() to comfortably request PoA related data from a block
  • Block: Working signature validation (the signature is included in the last 65 extraData bytes and called extraSeal)
  • Blockchain: extracting initial signers from the genesis block (also included in extraData and re-included for validation every EPOCH_TRANSITION blocks (so that block is then also called an "epoch transition block" with some special properties), data structure for making updated signer states persistent in DB, the structure also prepares for re-orgs where some last signer states might be deleted (deletion on re-orgs not yet implemented, should not be difficult though with the preparations done)
  • Blockchain: PoA clique block signature validation when validateConsensus is activated
  • Blockchain: added the voting mechanism (voting - to add a new signer or remove an existing one - is done repurposing some block header fields like coinbase and nonce), vote persistence in DB (also re-org ready), most of the test cases from the EIP
  • VM: made PoA clique ready (the COINBASE opcode returns the block signer from the extraData field on clique, since the coinbase header field has some other purpose and is not containing the actual signer)

There is some further work to be done and for example some test cases to be finished and/or reactivated, this also needs some polishing in general, so I would expect this to take another 1-2 weeks at least until we get this merged here.

Since the changes are so broad I will nevertheless put this open this for review already - together with adding a "do-not-merge" label - since it will make sense to not do an all-at-once review here but rather give some continued thoughts and polish together. You are also very invited to open PRs towards this branch if you want to continue to work here or make something better.

I have some dedicated questions and remarks as well, will put these directly along the related code parts.

Cheers 🥳 👾 🌻

Holger

@ryanio
Copy link
Contributor

ryanio commented Jan 20, 2021

I've rebased on master and added some commits, here are the summary of changes:

  • Tests added for coverage:
    • Test invalid extra data length (pow, poa)
    • Test invalid signer list (indivisible by 20)
    • Test timestamp diff between blocks is lower than PERIOD
  • Renamed _checkClique() to _requireClique() for increased clarity
  • Used Address instead of Buffer for clique signer types
  • Block Header:
    • Add helper method hashClique()
    • Renamed for clarity cliqueSignatureToAddress() -> cliqueSigner(): Address
  • Blockchain:
    • Catch NotFoundError for clique key db reads
    • Add method helper: cliqueSaveSigners(genesisBlock)
    • Replaced const CLIQUE_EMPTY_BENEFICIARY = Buffer.alloc(20) with Address.isZero()

Pending TODOs I will be working on next

  • Test zero beneficiary (coinbase) on epoch transitions
  • Test non-zero mixHash
  • Validate signer lists on epoch transition blocks
  • CLIQUE_SIGNER_HISTORY_BLOCK_LIMIT is unused?

One future TODO for myself: I found having to write a lot of address1.toBuffer().equals(address2.toBuffer() so I would like to add Address.equals(address: Address) to ethereumjs-util, which would shorten many lines to address1.equals(address2). (edit: opened a PR for this here: ethereumjs/ethereumjs-util#285)

@holgerd77
Copy link
Member Author

holgerd77 commented Jan 20, 2021

Great, I'm ok with all the changes, yes, CLIQUE_SIGNER_HISTORY_BLOCK_LIMIT is not yet used, this should be integrated into the updateSignerStates method (at least that was my intention, came over it though) to not endlessly bloat the list of signer states on ongoing updates.

Copy link
Member Author

@holgerd77 holgerd77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed up till 893a0b4

Everything ok. 😄

packages/blockchain/src/index.ts Outdated Show resolved Hide resolved
packages/blockchain/src/index.ts Show resolved Hide resolved
packages/blockchain/src/index.ts Outdated Show resolved Hide resolved
packages/blockchain/src/db/operation.ts Show resolved Hide resolved
packages/block/test/header.spec.ts Show resolved Hide resolved
packages/block/src/header.ts Show resolved Hide resolved
packages/block/src/block.ts Show resolved Hide resolved
@ryanio ryanio force-pushed the add-clique-support branch from 893a0b4 to 436067a Compare January 23, 2021 02:18
@ryanio
Copy link
Contributor

ryanio commented Jan 23, 2021

okay I believe I've completed all remaining TODOs! all tests are passing.

The last task is to handle reorgs by removing added signers and votes for deleted blocks, which I will add soon.

I also wanted to ask if we should additionally validate the difficulty field, I don't think I saw any logic for that yet:

difficulty: Contains the standalone score of the block to derive the quality of a chain.
Must be DIFF_NOTURN (1) if BLOCK_NUMBER % SIGNER_COUNT != SIGNER_INDEX
Must be DIFF_INTURN (2) if BLOCK_NUMBER % SIGNER_COUNT == SIGNER_INDEX

Another validation we can add is that ommersHash is equal to UNCLE_HASH (Keccak256(RLP([]))), I don't think it's critical, but a requirement of the spec.

(edit: realized there are still a few commented out test cases at the bottom of blockchain/test/clique.spec.ts which I will resolve)

@holgerd77
Copy link
Member Author

@ryanio I honestly didn't understand this whole turn-no-turn part of the spec. 😋 If you get it just implement, at the moment it is not taken in (as I also perceived this as non-mission-critical, but of course would be good to have).

The uncle validation is automatically done in Block.validateUnclesHash() which is comparing against (Keccak256(RLP([])) if the list of uncles (as checked) is empty.

@holgerd77
Copy link
Member Author

Have added a new factory helper method Blockchain.fromBlocksData() here 72e6fb2 for testing in the client but also for general usage, this might also be helpful for the tests here. Eventually in a subsequent PR since this needs some time to get merged in along with the other changes on client execution (will give some description along the PR later on).

@ryanio
Copy link
Contributor

ryanio commented Jan 28, 2021

ok, I believe I've completed everything and this is now ready for review!

I will give it one last look over tomorrow to see if I can add any helpful additional clarifying comments or typedocs, but meanwhile it is open for review.

Please let me know if anything can be improved or clarified. Thanks!

@@ -44,5 +44,5 @@ export default function blockFromRpc(blockParams: any, uncles: any[] = [], optio

const uncleHeaders = uncles.map((uh) => blockHeaderFromRpc(uh, options))

return Block.fromBlockData({ header, transactions, uncleHeaders })
return Block.fromBlockData({ header, transactions, uncleHeaders }, options)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug fix 🥴 I found common wasn't being passed properly to the returned block in blockFromRpc when using it for the goerli block test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@ryanio
Copy link
Contributor

ryanio commented Jan 28, 2021

One thing we can consider adding in the future is a helper to create a block with a certain clique signature seal, otherwise we currently have to manually put together the extraData which can be a bit tedious. For now this might only be useful for making test data/cases (we do this in test/blockchain/clique.spec.ts:addNextBlock) until we start producing signed blocks in the client.

@ryanio
Copy link
Contributor

ryanio commented Jan 28, 2021

This is also one last implementation detail in EIP-225 for the client when broadcasting clique blocks, noting here for subsequent work:

Authorization strategies

As long as signers conform to the above specs, they can authorize and distribute blocks as they see fit. The following suggested strategy will however reduce network traffic and small forks, so it’s a suggested feature:

  • If a signer is allowed to sign a block (is on the authorized list and didn’t sign recently).
    • Calculate the optimal signing time of the next block (parent + BLOCK_PERIOD).
    • If the signer is in-turn, wait for the exact time to arrive, sign and broadcast immediately.
    • If the signer is out-of-turn, delay signing by rand(SIGNER_COUNT * 500ms).

This small strategy will ensure that the in-turn signer (who’s block weighs more) has a slight advantage to sign and propagate versus the out-of-turn signers. Also the scheme allows a bit of scale with the increase of the number of signers.

@holgerd77
Copy link
Member Author

One thing we can consider adding in the future is a helper to create a block with a certain clique signature seal, otherwise we currently have to manually put together the extraData which can be a bit tedious. For now this might only be useful for making test data/cases (we do this in test/blockchain/clique.spec.ts:addNextBlock) until we start producing signed blocks in the client.

Yes, I thought this as well and already wanted to add, just couldn't come up with an API I was satisfied with so I put this aside. Think we should add this rather sooner than later, I would assume this is a somewhat common use case also useful beside our own signed block creation one.

Copy link
Member Author

@holgerd77 holgerd77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, have reviewed the rest of the commits, thanks Ryan, really great PR with these substantial functionality additions and completions and extensive test additions.

Have some comments (likely) to address, will nevertheless give this a go to have this huge PR out of the way.

I will also merge with admin merge for procedural reasons, since Ryan has already reviewed my initial code and I have now done the review of the remaining code changes, so technically this should make sense, hope that is ok.

if (!activeSigners.find(aSigner => aSigner.toBuffer().equals(cSigner.toBuffer()))) {
throw new Error('checkpoint signer not found in active signers list: ' + cSigner.toString())
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually just checking one way, there can still be the case that we have an active signer still there which should have been removed at some point (so the whole check should be for equality of the lists).

Won't make this a blocker for review but we likely want to add on occasion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right good point, I will change to a strict equality check of the same order, that would be much better.

const limit = this.CLIQUE_SIGNER_HISTORY_BLOCK_LIMIT
if (length > limit) {
this._cliqueLatestVotes = this._cliqueLatestVotes.slice(length - limit, length)
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was my initial thought as well to trim both the signer states AND the votes (sorry, just looked it up, also still had this as a comment in along CLIQUE_SIGNER_HISTORY_BLOCK_LIMIT), but the votes actually can't/shouldn't be discarded, these can always still be relevant for vote count (not even directly after an epoch transition block since there can incidentally be a reorg directly upon/around an epoch transition block).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I guess for the votes we might want to define a safe-number-of-blocks waiting periods (longest anticipated reorg block counts) and then discard the votes from before the epoch transition.

Copy link
Contributor

@ryanio ryanio Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right now all votes are being cleared on epoch transitions:

if (this._common.consensusAlgorithm() === 'clique') {
// Reset all votes on epoch transition blocks
if (header.cliqueIsEpochTransition()) {
this._cliqueLatestVotes = []
await this.cliqueUpdateVotes()

which is correct as per this part of the spec:

Attack vector: Spamming signer

... The solution is to place a moving window of W blocks after which votes are considered stale. We’ll call this an epoch.

Keeping in mind reorgs would be important though as you mention, but I have to think more about how we would manage that.

Perhaps we could keep some history of stale votes, but the vote counting logic in cliqueUpdateVotes should only count votes from the last active epoch in case of this edge scenario? starts getting a bit tricky 🤔

(edit: I added this to the new follow up PR)

Copy link
Member Author

@holgerd77 holgerd77 Jan 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually think it's too tricky: if there is a reorg, then all vote and signer state entries with block numbers higher than the reorg block number will be deleted, the signers from the last singer state stack item will be become the new active signers. Everything else from that moment on then continues (restarts) as before.

If I don't get this wrong, votes need to be kept relatively long in this scenario, so i think "EPOCH_LENGTH + REORG_SAFE_PERIOD" block numbers - so something like 30050, since if a reorg is happening from let's say 30020 -> 29950 and there is voting taking place in these last 50 blocks before the epoch transition, theoretically all existing votes from this 0-29999 epoch might be relevant for the pre-epoch voting. Does this make sense?

For Signer States I would think - all states from the last REORG_SAFE_PERIOD blocks + minimally one state - should be enough if I am not forgetting something.

But I think we can also generally be a bit more generous here - e.g. take a somewhat longer REORG_SAFE_PERIOD of 100 or something or keep more signer states, nothing lost here, this doesn't take that much space and eventually the date might be interesting for analytical purposes.

const length = this._cliqueLatestSignerStates.length
const limit = this.CLIQUE_SIGNER_HISTORY_BLOCK_LIMIT
if (length > limit) {
this._cliqueLatestSignerStates = this._cliqueLatestSignerStates.slice(length - limit, length)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea for CLIQUE_SIGNER_HISTORY_BLOCK_LIMIT was to have a defined period of blocks (therefore the name) and delete the old signer states once these number of blocks have passed. Not sure, this version of just keeping some number of signer states should work as well - I guess - but then I would rename the constant.

Also maybe 100 might be a bit too much then? But not sure, maybe that's ok, shouldn't be too much extra data.

@@ -44,5 +44,5 @@ export default function blockFromRpc(blockParams: any, uncles: any[] = [], optio

const uncleHeaders = uncles.map((uh) => blockHeaderFromRpc(uh, options))

return Block.fromBlockData({ header, transactions, uncleHeaders })
return Block.fromBlockData({ header, transactions, uncleHeaders }, options)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -130,4 +130,5 @@ export interface JsonHeader {

export interface Blockchain {
getBlock(hash: Buffer): Promise<Block>
cliqueActiveSigners(): Address[]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is - as far as I oversee - the only breaking change up till now in the PR. Maybe we don't want to do this for now and keep the option of doing a non-breaking release? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: ah, just seeing that this is in Block. A bit confused, are people likely using this or is this just for us for testing purposes?

Copy link
Contributor

@ryanio ryanio Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I needed this for the validateCliqueDifficulty method which needs to calculate if the signer is in or out of turn. I thought it would be good in the block header next to the pow difficulty check method, but you are right that it adds to the blockchain interface here. I can move it over to blockchain and leave the interface here as before if you think that would be better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just rechecked, so if this is used for typing in Block functions like async validate(blockchain: Blockchain): Promise<void> then this change is definitely breaking on backwards-compatibility, so we should find another solution here.

Maybe alternatively just leave the interface as-is and manually check in validateCliqueDifficulty() for blockchain.cliqueActiveSigners() to be present and if not throw? And additionally add some notes on the blockchain consuming methods in Block that these methods needs a clique supporting Blockchain object if run with a PoA setting? 🤔

Not sure if this is the best/optimal solution though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that sounds like a good solution for now, I like keeping the validation in the block header since it's a logical place to validate the difficulty (although it requires the context of blockchain).

i can be convinced to move it but let's see how the solution looks this way. (added in commit: d58a901)

@holgerd77 holgerd77 merged commit 28ea476 into master Jan 28, 2021
@holgerd77 holgerd77 deleted the add-clique-support branch January 28, 2021 11:02
clique: {
period: number
epoch: number
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was too strict and breaks typing in non-PoA contexts, have fixed this in 6fe4a1d.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok great, I was wondering, it might be helpful if all the options were optional, at least for my use in forCustomChain , I only wanted to set epoch to 3, but I had to provide all the other values:

const common = Common.forCustomChain(
'rinkeby',
{
consensus: {
type: 'poa',
algorithm: 'clique',
clique: {
period: 15,
epoch: 3,
},
},
},
'chainstart'
)

I understand this is a limitation of how the code is set up in forCustomChain though, since it's using the spread operator, it does not allow for deep nesting replacement of single values, it would replace the whole block provided. Not sure if there is a better technique we could use.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be a fan if we loosen this here in general since the type safety of these chain and HF definition JSON files is so underdeveloped already and should rather be improved towards the other side at some PiT. I get the point though. Allow any here for the function signature as the most simple/naive solution? 😜 Not sure, if there is something better, this should of course be preferred.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, agree, in general would love some more type safety in common coming from the json files.

what I meant though is I would love to do this:

 const common = Common.forCustomChain( 
   'rinkeby',
   { consensus: { clique: { epoch: 3 } } }, 
   'chainstart' 
 )

however the current way the code is set up, then { consensus: type, algorithm, clique: { period } } would all be undefined. we would need some more sophisticated logic for deeply replacing nested values, rather than the simple solution of using the spread operator (here).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, thanks for the clarification.

@ryanio
Copy link
Contributor

ryanio commented Jan 28, 2021

One thing we can consider adding in the future is a helper to create a block with a certain clique signature seal, otherwise we currently have to manually put together the extraData which can be a bit tedious. For now this might only be useful for making test data/cases (we do this in test/blockchain/clique.spec.ts:addNextBlock) until we start producing signed blocks in the client.

Yes, I thought this as well and already wanted to add, just couldn't come up with an API I was satisfied with so I put this aside. Think we should add this rather sooner than later, I would assume this is a somewhat common use case also useful beside our own signed block creation one.

just thinking about this, how about a new BlockOption:

/**
 * Provide a [Address, privateKey] to seal clique block with this signer.
 * Will throw if provided on non-clique chains.
 */
cliqueSigner?: [Address, Buffer]

then I can create a helper method that would easily concat it with the passed-in (or blank) extraData.

(edit: actually looks like we only need the privateKey, not the address)

@holgerd77
Copy link
Member Author

One thing we can consider adding in the future is a helper to create a block with a certain clique signature seal, otherwise we currently have to manually put together the extraData which can be a bit tedious. For now this might only be useful for making test data/cases (we do this in test/blockchain/clique.spec.ts:addNextBlock) until we start producing signed blocks in the client.

Yes, I thought this as well and already wanted to add, just couldn't come up with an API I was satisfied with so I put this aside. Think we should add this rather sooner than later, I would assume this is a somewhat common use case also useful beside our own signed block creation one.

just thinking about this, how about a new BlockOption:

/**
 * Provide a [Address, privateKey] to seal clique block with this signer.
 * Will throw if provided on non-clique chains.
 */
cliqueSigner?: [Address, Buffer]

then I can create a helper method that would easily concat it with the passed-in (or blank) extraData.

(edit: actually looks like we only need the privateKey, not the address)

I think that's an option, at some point I also thought about adding an option to the constructor, do you think this is ok from a security PoV to request the private key from the signer? This was my main reason to hesitate, but rather based on some blurry feeling than some more solid analysis here.

P.S.: not sure if this was clever - at least it gives a neutral perspective - just to let you know that I answered these questions here before looking at your follow-up clique PR #1074.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants