
About process_deposit

Proof generation

Daejun Park [12:58 PM] @danny.ryan @carl.beekhuizen I'm a bit confused by process_deposit. Suppose a block contains multiple Deposits. Then process_deposit processes each of them, verifying the Merkle proof by executing the following code:

    # Verify the Merkle branch
    assert is_valid_merkle_branch(
        leaf=hash_tree_root(deposit.data),
        branch=deposit.proof,
        depth=DEPOSIT_CONTRACT_TREE_DEPTH + 1,  # Add 1 for the `List` length mix-in
        index=state.eth1_deposit_index,
        root=state.eth1_data.deposit_root,
    )

But here, the same root is used for different deposits, whereas different deposits should have different roots. (That is, leaf and branch come from deposit, but root is provided by state.eth1_data, which does not change across multiple process_deposit executions.) What am I missing here?
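For reference, the check being performed is is_valid_merkle_branch, which folds the branch into the leaf and compares the result against the single root (paraphrased roughly from the phase 0 spec of the time; hash is the spec's SHA-256 wrapper):

    def is_valid_merkle_branch(leaf, branch, depth, index, root) -> bool:
        """Check if ``leaf`` at ``index`` verifies against the Merkle ``root`` and ``branch``."""
        value = leaf
        for i in range(depth):
            if index // (2**i) % 2:
                value = hash(branch[i] + value)
            else:
                value = hash(value + branch[i])
        return value == root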

Carl Beekhuizen [1:13 PM] The proofs are not the proofs from when the deposits were made. The proofs in these deposits are recalculated based on the state of the deposit Merkle tree at the point where its root is state.eth1_data.deposit_root.

To include deposits in a block, the proposer must look at which tree state state.eth1_data.deposit_root corresponds to and calculate the siblings at each height in that tree. That constitutes the proof. (In the same way one normally needs to maintain the witness for a Merkle tree.)
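A minimal sketch of that recomputation, assuming the proposer holds the list of deposit leaves (hash_tree_root(deposit.data) for each deposit) up to the snapshot's deposit_count. The names compute_deposit_proof, hash_pair, and ZERO_HASHES are illustrative, not spec functions:

    from hashlib import sha256

    DEPOSIT_CONTRACT_TREE_DEPTH = 32

    def hash_pair(a: bytes, b: bytes) -> bytes:
        return sha256(a + b).digest()

    # ZERO_HASHES[h] is the root of an all-zero subtree of height h.
    ZERO_HASHES = [b"\x00" * 32]
    for h in range(DEPOSIT_CONTRACT_TREE_DEPTH):
        ZERO_HASHES.append(hash_pair(ZERO_HASHES[h], ZERO_HASHES[h]))

    def compute_deposit_proof(leaves, index):
        """Branch (33 hashes) for leaves[index] against the tree whose first
        len(leaves) leaves are `leaves` and whose List length mix-in is len(leaves)."""
        proof = []
        layer = list(leaves)
        node_index = index
        for height in range(DEPOSIT_CONTRACT_TREE_DEPTH):
            # Sibling at this height; empty positions are zero subtrees.
            sibling_index = node_index ^ 1
            if sibling_index < len(layer):
                proof.append(layer[sibling_index])
            else:
                proof.append(ZERO_HASHES[height])
            # Move one level up, padding an odd-length layer with an empty subtree.
            if len(layer) % 2 == 1:
                layer.append(ZERO_HASHES[height])
            layer = [hash_pair(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
            node_index //= 2
        # Final element: the List length mix-in (deposit count, little-endian, 32 bytes).
        proof.append(len(leaves).to_bytes(32, "little"))
        return proof

Rebuilding every layer from scratch like this costs work proportional to the number of leaves per proof, which is exactly the concern raised next; the caching scheme discussed further down avoids it.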

Daejun Park [1:40 PM] @carl.beekhuizen I see. But then, wouldn't it be too much effort to reconstruct the proof? In the worst case, the block proposer would need to reconstruct the whole Merkle tree of height 32, which is quite large.

Carl Beekhuizen [1:43 PM] MAX_DEPOSITS=16 should help to keep it bounded, but even in the worst case, it is only 2*num_deposits hashes. And by worst case here, I mean recalculating the entire tree, something that shouldn't have to happen.

Daejun Park [1:49 PM] I don't think MAX_DEPOSITS helps here. If the latest Merkle tree has advanced a lot (receiving many deposits recently), the branch kept by the deposit contract is quite different from the needed proof, so you will have to reconstruct the whole tree again. And, yes, it will be just 2 * num_deposits hashes, but num_deposits could be as large as 2^32 (of course, in practice it should be much less than that). So, you think that 2^32 hash computations is not a big deal for the block proposer, right?

Carl Beekhuizen [1:53 PM] If a client just keeps the tree of new (unprocessed) deposits cached, then a new deposit requires 32 hashes to be updated. When a client sees that a new eth1 data is going to be updated into BeaconState, it can use the deposit_count to find the index of that node; from there it can partition the tree and hash in empty siblings at a cost of 32 hashes. Proofs then require a lookup of 32 hashes per new deposit processed. Based on the amount of circulating Ether (~2^27), even if every deposit were made at once, it would "only" be 2^23 hashes.
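A minimal sketch of that kind of caching, modeled on the deposit contract's incremental Merkle tree: keep one frontier node per height so that inserting a new leaf costs at most 32 hashes. IncrementalDepositTree, hash_pair, and ZERO_HASHES are illustrative names, and the partitioning/per-proof lookup Carl mentions is omitted:

    from hashlib import sha256

    DEPOSIT_CONTRACT_TREE_DEPTH = 32

    def hash_pair(a: bytes, b: bytes) -> bytes:
        return sha256(a + b).digest()

    # ZERO_HASHES[h] is the root of an all-zero subtree of height h.
    ZERO_HASHES = [b"\x00" * 32]
    for h in range(DEPOSIT_CONTRACT_TREE_DEPTH):
        ZERO_HASHES.append(hash_pair(ZERO_HASHES[h], ZERO_HASHES[h]))

    class IncrementalDepositTree:
        def __init__(self):
            # One cached node per height (the "frontier" of the partially filled tree).
            self.branch = [ZERO_HASHES[h] for h in range(DEPOSIT_CONTRACT_TREE_DEPTH)]
            self.deposit_count = 0

        def insert(self, leaf: bytes) -> None:
            """Add a new deposit leaf; touches at most 32 nodes."""
            index = self.deposit_count
            self.deposit_count += 1
            node = leaf
            for height in range(DEPOSIT_CONTRACT_TREE_DEPTH):
                if index % 2 == 0:
                    # Left child: cache it; its right sibling is still empty.
                    self.branch[height] = node
                    return
                # Right child: combine with the cached left sibling and move up.
                node = hash_pair(self.branch[height], node)
                index //= 2

        def root(self) -> bytes:
            """Root with the deposit count mixed in, as in eth1_data.deposit_root."""
            node = b"\x00" * 32
            size = self.deposit_count
            for height in range(DEPOSIT_CONTRACT_TREE_DEPTH):
                if size % 2 == 1:
                    node = hash_pair(self.branch[height], node)
                else:
                    node = hash_pair(node, ZERO_HASHES[height])
                size //= 2
            # Mix in the deposit count (little-endian, 32 bytes), as in the SSZ List root.
            return hash_pair(node, self.deposit_count.to_bytes(32, "little"))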

Daejun Park [1:58 PM] I agree there should be a smart way of constructing the proof time-efficiently by having clients maintain some previous states (branches) locally. Is it described in the client implementation guide? Or is it implementation-dependent?

Carl Beekhuizen [2:00 PM] That was something off the top of my head. As far as I know, there is no recommended efficient implementation. I think this is one of those instances where we want clients to come up with their own implementations for the sake of diversity.

Forcing deposit updates

Daejun Park [2:13 PM] @carl.beekhuizen Another question. In process_operations, we have the following assertion:

    # Verify that outstanding deposits are processed up to the maximum number of deposits
    assert len(body.deposits) == min(MAX_DEPOSITS, state.eth1_data.deposit_count - state.eth1_deposit_index)

I have a concern about clients' usability here. I think the usability could be improved by having <= instead of ==.

Since process_eth1_data is executed before process_operations, and process_eth1_data may update state.eth1_data (when it has enough votes), block proposers also need to consider whether state.eth1_data will be updated while the proposed block is processed, in order to pass the above assertion. That is, if they propose a block by simply referring to the current state.eth1_data and collecting deposits based on it, but state.eth1_data happens to be updated while the proposed block is processed and new deposits were made between the old and new state.eth1_data, then the above assertion will fail.

So, the question is: would it be problematic if we had <= instead of == in the above assertion?

Carl Beekhuizen [2:26 PM] One fear many people have regarding Proof of Stake systems is that validators censor new validators and that a cartel forms that won't allow any new validators to join. In particular on Eth2, because a validator's rewards decrease as a function of the number of other validators, there is an incentive not to add new validators. As an attempt to mitigate this, the assertion requires that the maximum number of new deposits be added for a block to be valid. I agree that this may place some strange pressure on a validator and that it could be difficult for a validator to construct a valid block. Alternatively, a validator could create two blocks locally, one with and one without the deposits (covering both cases), and only sign the one that is ultimately correct.
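One way to see the proposer-side bookkeeping: before process_operations runs, the proposer must predict which eth1_data will be in force and size the block's deposit list against it. A minimal sketch, where will_update_eth1_data is a hypothetical helper that checks whether the block's own vote pushes the tally over the threshold in process_eth1_data (not spec code):

    def expected_deposit_count(state, block_eth1_data) -> int:
        # Predict the eth1_data that will be in effect when process_operations runs.
        if will_update_eth1_data(state, block_eth1_data):  # hypothetical helper
            eth1_data = block_eth1_data   # the block's vote takes effect first
        else:
            eth1_data = state.eth1_data   # the current data stays in effect
        # This is the exact number of deposits the block must contain.
        return min(MAX_DEPOSITS, eth1_data.deposit_count - state.eth1_deposit_index)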

Daejun Park [2:53 PM] I see. That's very good to know. In that case, how about swapping the order, that is, having process_operations before process_eth1_data in process_block? Or keeping the current order, but having the above assert refer to the previous (if just-updated) state.eth1_data? I can think of many benign but failing cases at the boundary, when the eth1 votes have accumulated almost enough to update the eth1_data.

Carl Beekhuizen [2:35 AM] That definitely would help clients build blocks, but I'd argue that a block would then sometimes be in a weird state where the deposits lag behind the deposit root. I think that because the block proposer is the same person behind both the eth1 votes and the deposits, it's not unfair to expect them to handle this, nor do I think it is particularly difficult. I think this is something to watch out for when the testnets go live. If block proposers miss their blocks because of it, then I think we should revisit it.

Daejun Park [11:10 AM] @carl.beekhuizen The situation where deposits lag behind the deposit root can already happen in the current setting, due to the MAX_DEPOSITS cap (i.e., when more than MAX_DEPOSITS deposits are made between state.eth1_data updates), right? So I don't think it can be a big concern. Also, the lagging-behind situation does not hurt our objective, which is to "force" validators to "deterministically" add new deposits.

So, I still believe that the above assertion could be updated to improve validators' experience, unless there are other critical concerns. But I admit that, due to my limited experience with and understanding of client development, I do not have a good idea of how crucial this improvement is. So, I agree with you that we can set this issue aside for now and discuss it again when the testnets go live.
