Validate note commitment trees in non-finalized state #2425

dconnolly · 2021-06-30T23:12:27Z

Is your feature request related to a problem?

In order to maintain shielded note commitment trees, we will need to update our implementation of Zebra state for non-finalized chains, keep those trees up to date in the non-finalized state as we validate blocks, and save down the serialized trees to finalized state for the tip.

Questions:

Unanswered questions about the finalized and non-finalized state:

What keys do we need to be able to look up quickly? For transaction validation? For treestate updates?

This changes how we index treestates in the state.

The state already has sprout and sapling anchors, and sprout, sapling, and orchard nullifiers.

Are there any rules about which treestates can be used for which transactions?

This might change how we maintain (Sprout?) treestates in the non-finalized state.

For example, can Sprout, Sapling, and Orchard transactions use the previous block's treestate?
(There are a lot of references to "finalized" treestates in the rest of the RFC, but I'm not sure if they are the same as the "finalized" state.)

The rest of the RFC mentions that Sprout transactions can use the treestate of any previous transaction in the block.

Why do we need to store a separate Sprout treestate for every block? Can we skip blocks without Sprout transactions? Are there parts of the treestate data that don't change?

This impacts the indexes, storage, and lookups for those treestates.

What are the details of a Sprout treestate update?

This might change how we maintain (Sprout?) treestates in the non-finalized or finalized state.

Influences from the UTXO design

Referencing a root that we haven't seen before, because when we're verifying blocks in parallel, is that ok? Cryptographically ?

about using roots that we haven't produced ourselves yet, if you use an invalid root, the proof verification would fail, because the witness path in the tree would fail to match the root

Non-finalized state

Maintaining note commitment trees

Do the sprout implementations for the non-finalized state
Add (sprout|sapling|orchard)_note_commitment_tree: (sprout|sapling|orchard)::tree::NoteCommitmentTree data field to the Chain type
- this stores the note commitment tree for each non-finalized tip
When adding a block to a non-finalized tip:
- if there are any (Sprout|Sapling|Orchard) note commitments exposed by JoinSplits or Outputs or Actions in the transactions in the block, they must be appended to the respective note commitment tree in-order
- append to trees using NoteCommitmentTree.append()
When forking a new side-chain:
- clone the note commitment tree from the fork block
- add from there, do not do any explicit rollback logic, just rebuild from known-good tree

Update `Chain.update_chain_state_with()`

Ensuring that transactions are processed in-block-order, then:
Update self.update_chain_state_with(joinsplit_data) to append any sprout::NoteCommitments
self.update_chain_state_with(sapling_shielded_data_per_spend_anchor) to append any sapling::NoteCommitments
self.update_chain_state_with(sapling_shielded_data_shared_anchor) to append any sapling::NoteCommitments
self.update_chain_state_with(orchard_shielded_data) to append any orchard::NoteCommitments

Finalized state

`FinalizedBlock`

Do the sprout implementations for the finalized state
Add (sprout|sapling|orchard)_note_commitment_tree fields
- These should be pre-computed by non-finalized state/ Chain, the output treestates as of the last transaction in the block

Database format:

Add a new RocksDB column family for each of Sprout, Sapling, and Orchard serialized note commitment trees for the finalized tip:
- | sprout_incremental | BE32(height) | sprout::tree::NoteCommitmentTree | Delete |
- | sapling_incremental | BE32(height) | sapling::tree::NoteCommitmentTree | Delete |
- | orchard_incremental | BE32(height) | orchard::tree::NoteCommitmentTree | Delete |
Increment Zebra's state version

Additional context

See #2091 (comment) for further details

The text was updated successfully, but these errors were encountered:

conradoplg · 2021-07-16T20:25:30Z

After studying this and #2458 I see two options for the overall flow of the finalized state:

When finalizing a block, get the finalized tree from the DB (or a cached value?), add the note commitments from the block and then update the finalized tree and store the new anchor in the state. (This requires "redoing" the work on the tree since it was already done in the Chain, but we no longer have that treestate since the Chain only keeps the treestate of the chain tip)
Store each treestate (and matching anchor) in each PreparedBlock of the Chain, so that when a block is finalized we already have the data we need.

I think the first approach is better since it wastes less memory and the additional computation it requires is small. But it may lead to some code repetition / refactoring since we need to recalculate stuff that was already calculated in the Chain. I'll explore the first option for now.

Additional questions:

Should the trees be stored only when applicable? E.g. when commiting the genesis block, should we store empty Sprout, Sapling and Orchard trees, or should we first store them when applicable?

teor2345 · 2021-07-19T01:17:05Z

After studying this and #2458 I see two options for the overall flow of the finalized state:
* When finalizing a block, get the finalized tree from the DB (or a cached value?), add the note commitments from the block and then update the finalized tree and store the new anchor in the state. (This requires "redoing" the work on the tree since it was already done in the Chain, but we no longer have that treestate since the Chain only keeps the treestate of the chain tip)

* Store each treestate (and matching anchor) in each PreparedBlock of the Chain, so that when a block is finalized we already have the data we need.
I think the first approach is better since it wastes less memory and the additional computation it requires is small. But it may lead to some code repetition / refactoring since we need to recalculate stuff that was already calculated in the Chain. I'll explore the first option for now.

I think the first option is fine - feel free to open a ticket for later optimisations!

Additional questions:

Should the trees be stored only when applicable? E.g. when commiting the genesis block, should we store empty Sprout, Sapling and Orchard trees, or should we first store them when applicable?

I'm not sure, here are some benefits and drawbacks of waiting until Sprout, Sapling, or NU5 activation:

Benefits of waiting until activation:

empty trees and their anchors are not available until the shielded pool is activated, so Zebra can't accidentally approve invalid transactions that depend on empty anchors
less CPU and fewer database writes

Drawbacks of waiting until activation:

Zebra might panic if it tries to verify a pool transfer before that shielded pool is activated (but that code would be a bug)
return types are more complex - they'll need to add an Option or Result layer

Based on those tradeoffs, I would prefer waiting - because verification correctness is a high priority.
What do you think?
Am I missing any benefits or drawbacks?

conradoplg · 2021-07-19T23:33:05Z

I'm not sure, here are some benefits and drawbacks of waiting until Sprout, Sapling, or NU5 activation:

Benefits of waiting until activation:

empty trees and their anchors are not available until the shielded pool is activated, so Zebra can't accidentally approve invalid transactions that depend on empty anchors

less CPU and fewer database writes

Drawbacks of waiting until activation:

Zebra might panic if it tries to verify a pool transfer before that shielded pool is activated (but that code would be a bug)

return types are more complex - they'll need to add an Option or Result layer

Based on those tradeoffs, I would prefer waiting - because verification correctness is a high priority.
What do you think?
Am I missing any benefits or drawbacks?

The code becomes a bit awkward yes, and I think it will make testing a bit difficult since it will be more difficult to "fake" some stuff. For example, with the finalized state, we will need to create a full chain with Sapling and Orchard "fake" activations in order to test it properly. I'm not sure if the current tests already do something similar, I need to check.

Other than that, I think it's a good idea. I've started to use Option for the trees in Chain. @dconnolly can you provide any feedback on this, if you can?

teor2345 · 2021-07-19T23:53:07Z

For example, with the finalized state, we will need to create a full chain with Sapling and Orchard "fake" activations in order to test it properly. I'm not sure if the current tests already do something similar, I need to check.

It might be worth looking at the nullifier tests from #2497, and the UTXO tests in one of my upcoming PRs. (Which has been delayed because I have to work out how to make other tests pass the UTXO checks.)

The nullifier tests commit the genesis block, then modify blocks 1 & 2 so they contain the nullifiers I want.

For the UTXO tests, I'll need to add test methods to the finalized and non-finalized states which add fake UTXOs.

We should be able to do something similar for the anchors, and just have test methods which add fake state?
As long as the non-test anchor initialisation code waits for the network upgrade, we should be fine in production. (We have existing integration tests which test that code in production.)

conradoplg · 2021-07-21T19:27:40Z

@teor2345: my concern is, for example, if we only create the Sapling tree in the Sapling activation height, then in order to test it in the finalized state then we will need to create more than 100K blocks which is unfeasible. From what I understand in the nullifiers test this doesn't happen because there is no check anywhere the test flow regarding the block height.

Regarding using Option, the big question is: when to change it from None to Some(Default::default()) (i.e. the empty tree). I see the following options:

Don't use Option. Always create the empty trees when instantiating objects (FinalizedState, Chain). This is the easiest. Drawbacks:
- As @teor2345 pointed out this could hide a possible bug that attempts to use the trees before they're allowed to. However I think that's easier to validate elsewhere (e.g. don't allow Orchard transactions before Orchard activation).
- More DB writes. This can de dealt with separately, though (e.g. only write the tree if it has changed).
Use Option and change from None to Some lazily (i.e. when the first note is added to the tree). Drawbacks:
- Transactions that anchor on the empty tree won't work because the tree doesn't exist yet and its anchor wasn't added before. This would require special-case handling.
Use Option and change from None to Some on the correct activation height. Drawbacks:
- Can make testing difficult since the test will need to generate blocks with the expected heights (seems impossible), or we will need to change the activation heights only for the tests.

After trying to use Option in the PR I'm more inclined to just not using it (first option).

teor2345 · 2021-07-22T00:03:57Z

@teor2345: my concern is, for example, if we only create the Sapling tree in the Sapling activation height, then in order to test it in the finalized state then we will need to create more than 100K blocks which is unfeasible. From what I understand in the nullifiers test this doesn't happen because there is no check anywhere the test flow regarding the block height.

There is a check that the heights are in order, which is why all the tests that use the state service or finalized state start at a genesis block.

But you're right, there's no check for any absolute activation block height.

Regarding using Option, the big question is: when to change it from None to Some(Default::default()) (i.e. the empty tree). I see the following options:

Don't use Option. Always create the empty trees when instantiating objects (FinalizedState, Chain). This is the easiest. Drawbacks:

As @teor2345 pointed out this could hide a possible bug that attempts to use the trees before they're allowed to. However I think that's easier to validate elsewhere (e.g. don't allow Orchard transactions before Orchard activation).

We already validate orchard activation in the transaction::Verifier, so we don't need to repeat it in the state:

zebra/zebra-consensus/src/transaction.rs

Lines 336 to 343 in 6df17ff

    
               /// Verifies if a V5 `transaction` is supported by `network_upgrade`. 
        
               fn verify_v5_transaction_network_upgrade( 
        
                   transaction: &Transaction, 
        
                   network_upgrade: NetworkUpgrade, 
        
               ) -> Result<(), TransactionError> { 
        
                   match network_upgrade { 
        
                       // Supports V5 transactions 
        
                       NetworkUpgrade::Nu5 => Ok(()),

Zebra's V5 transactions are the only transaction version that contains orchard actions, so we won't be asking for orchard roots until the v5 transaction check passes.

Sprout was active from genesis, and sapling is active as of Zebra's Canopy checkpoint. So we don't need to do any activation validation for them.

* More DB writes. This can de dealt with separately, though (e.g. only write the tree if it has changed).

The DB is cached in RAM, so we don't need to limit writes in this PR. (Unless it's really easy.)
We can just open a follow-up ticket.

Use Option and change from None to Some lazily (i.e. when the first note is added to the tree). Drawbacks:

Transactions that anchor on the empty tree won't work because the tree doesn't exist yet and its anchor wasn't added before. This would require special-case handling.

I don't think special-case handling is a good idea - we'd have to get it right for both block validation and mempool validation. And test it for both cases.

Use Option and change from None to Some on the correct activation height. Drawbacks:

Can make testing difficult since the test will need to generate blocks with the expected heights (seems impossible), or we will need to change the activation heights only for the tests.

The correct activation height is actually the commit of the block before the activation block, because the anchors need to be available to mempool transactions and transactions in the activation block:
https://discord.com/channels/809218587167293450/811004767043977288/867179258525515838

This needs a spec clarification.

After trying to use Option in the PR I'm more inclined to just not using it (first option).

I agree, I think we'd be effectively duplicating validation we've done already in the transaction::Verifier. And it gets really tricky handling mempool and block transactions.

dconnolly · 2021-07-22T05:38:59Z

@teor2345: my concern is, for example, if we only create the Sapling tree in the Sapling activation height, then in order to test it in the finalized state then we will need to create more than 100K blocks which is unfeasible. From what I understand in the nullifiers test this doesn't happen because there is no check anywhere the test flow regarding the block height.

Regarding using Option, the big question is: when to change it from None to Some(Default::default()) (i.e. the empty tree). I see the following options:

Don't use Option. Always create the empty trees when instantiating objects (FinalizedState, Chain). This is the easiest. Drawbacks:

As @teor2345 pointed out this could hide a possible bug that attempts to use the trees before they're allowed to. However I think that's easier to validate elsewhere (e.g. don't allow Orchard transactions before Orchard activation).

More DB writes. This can de dealt with separately, though (e.g. only write the tree if it has changed).

Use Option and change from None to Some lazily (i.e. when the first note is added to the tree). Drawbacks:

Transactions that anchor on the empty tree won't work because the tree doesn't exist yet and its anchor wasn't added before. This would require special-case handling.

Use Option and change from None to Some on the correct activation height. Drawbacks:

Can make testing difficult since the test will need to generate blocks with the expected heights (seems impossible), or we will need to change the activation heights only for the tests.

After trying to use Option in the PR I'm more inclined to just not using it (first option).

I'm fine with instantiating all the trees from the beginning, as teor said we have several checks in place to ensure they will not be populated (besides the Default::default() / empty tree) until needed, and I think that's fine, if any accidental queries to them come back with a result besides the root of the empty tree, that should correctly fail, unless that is the first sprout/sapling/orchard transaction correctly looking for the root of the empty tree.

upbqdn · 2021-12-13T09:50:05Z

It looks like we have implemented all tasks in this issue.

dconnolly · 2021-12-13T19:51:58Z

Done!

dconnolly added C-design Category: Software design work S-needs-design Status: Needs a design decision C-enhancement Category: This is an improvement S-needs-triage Status: A bug report needs triage labels Jun 30, 2021

dconnolly self-assigned this Jun 30, 2021

dconnolly mentioned this issue Jun 30, 2021

Add sapling and orchard anchors to zebra-state #1320

Closed

4 tasks

dconnolly added A-consensus Area: Consensus rule updates A-rust Area: Updates to Rust code and removed C-enhancement Category: This is an improvement labels Jun 30, 2021

dconnolly mentioned this issue Jul 2, 2021

Add notes about note commitment trees storage #2259

Merged

1 task

mpguerra removed the S-needs-triage Status: A bug report needs triage label Jul 5, 2021

This was referenced Jul 5, 2021

Calculate incremental note commitment trees #2407

Merged

Track anchors and note commitment trees in zebra-state #2458

Merged

dconnolly mentioned this issue Jul 26, 2021

Validate Orchard and Sapling anchors in non-finalized state #2530

Closed

3 tasks

mpguerra added this to the 2021 Sprint 22 milestone Oct 13, 2021

teor2345 changed the title ~~Design: Validate note commitment trees in state~~ Validate note commitment trees in state Oct 13, 2021

mpguerra unassigned dconnolly Oct 14, 2021

mpguerra modified the milestones: 2021 Sprint 22, 2021 Sprint 23 Oct 14, 2021

teor2345 added the P-Medium label Nov 4, 2021

mpguerra assigned upbqdn and unassigned upbqdn Nov 8, 2021

teor2345 changed the title ~~Validate note commitment trees in state~~ Validate note commitment trees in non-finalized state Nov 8, 2021

mpguerra mentioned this issue Nov 9, 2021

Validate Sprout anchors in non-finalized state #3043

Closed

2 tasks

mpguerra assigned dconnolly and unassigned dconnolly Nov 15, 2021

dconnolly removed the S-needs-design Status: Needs a design decision label Nov 22, 2021

mpguerra modified the milestones: 2021 Sprint 23, 2021 Sprint 24 Nov 29, 2021

dconnolly closed this as completed Dec 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Validate note commitment trees in non-finalized state #2425

Validate note commitment trees in non-finalized state #2425

dconnolly commented Jun 30, 2021 •

edited by upbqdn

Loading

conradoplg commented Jul 16, 2021 •

edited

Loading

teor2345 commented Jul 19, 2021 •

edited

Loading

conradoplg commented Jul 19, 2021

teor2345 commented Jul 19, 2021

conradoplg commented Jul 21, 2021

teor2345 commented Jul 22, 2021 •

edited

Loading

dconnolly commented Jul 22, 2021

upbqdn commented Dec 13, 2021

dconnolly commented Dec 13, 2021

Validate note commitment trees in non-finalized state #2425

Validate note commitment trees in non-finalized state #2425

Comments

dconnolly commented Jun 30, 2021 • edited by upbqdn Loading

Is your feature request related to a problem?

Questions:

Non-finalized state

Maintaining note commitment trees

Update Chain.update_chain_state_with()

Finalized state

FinalizedBlock

conradoplg commented Jul 16, 2021 • edited Loading

teor2345 commented Jul 19, 2021 • edited Loading

conradoplg commented Jul 19, 2021

teor2345 commented Jul 19, 2021

conradoplg commented Jul 21, 2021

teor2345 commented Jul 22, 2021 • edited Loading

dconnolly commented Jul 22, 2021

upbqdn commented Dec 13, 2021

dconnolly commented Dec 13, 2021

dconnolly commented Jun 30, 2021 •

edited by upbqdn

Loading

Update `Chain.update_chain_state_with()`

`FinalizedBlock`

conradoplg commented Jul 16, 2021 •

edited

Loading

teor2345 commented Jul 19, 2021 •

edited

Loading

teor2345 commented Jul 22, 2021 •

edited

Loading