Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate note commitment trees in non-finalized state #2425

Closed
11 tasks done
dconnolly opened this issue Jun 30, 2021 · 9 comments
Closed
11 tasks done

Validate note commitment trees in non-finalized state #2425

dconnolly opened this issue Jun 30, 2021 · 9 comments
Assignees
Labels
A-consensus Area: Consensus rule updates A-rust Area: Updates to Rust code C-design Category: Software design work

Comments

@dconnolly
Copy link
Contributor

dconnolly commented Jun 30, 2021

Is your feature request related to a problem?

In order to maintain shielded note commitment trees, we will need to update our implementation of Zebra state for non-finalized chains, keep those trees up to date in the non-finalized state as we validate blocks, and save down the serialized trees to finalized state for the tip.

Questions:

Unanswered questions about the finalized and non-finalized state:

What keys do we need to be able to look up quickly? For transaction validation? For treestate updates?

This changes how we index treestates in the state.

The state already has sprout and sapling anchors, and sprout, sapling, and orchard nullifiers.

Are there any rules about which treestates can be used for which transactions?

This might change how we maintain (Sprout?) treestates in the non-finalized state.

For example, can Sprout, Sapling, and Orchard transactions use the previous block's treestate?
(There are a lot of references to "finalized" treestates in the rest of the RFC, but I'm not sure if they are the same as the "finalized" state.)

The rest of the RFC mentions that Sprout transactions can use the treestate of any previous transaction in the block.

Why do we need to store a separate Sprout treestate for every block? Can we skip blocks without Sprout transactions? Are there parts of the treestate data that don't change?

This impacts the indexes, storage, and lookups for those treestates.

What are the details of a Sprout treestate update?

This might change how we maintain (Sprout?) treestates in the non-finalized or finalized state.

Influences from the UTXO design

Referencing a root that we haven't seen before, because when we're verifying blocks in parallel, is that ok? Cryptographically ?

about using roots that we haven't produced ourselves yet, if you use an invalid root, the proof verification would fail, because the witness path in the tree would fail to match the root

Non-finalized state

Maintaining note commitment trees

  • Do the sprout implementations for the non-finalized state

  • Add (sprout|sapling|orchard)_note_commitment_tree: (sprout|sapling|orchard)::tree::NoteCommitmentTree data field to the Chain type

    • this stores the note commitment tree for each non-finalized tip
  • When adding a block to a non-finalized tip:

    • if there are any (Sprout|Sapling|Orchard) note commitments exposed by JoinSplits or Outputs or Actions in the transactions in the block, they must be appended to the respective note commitment tree in-order
    • append to trees using NoteCommitmentTree.append()
  • When forking a new side-chain:

    • clone the note commitment tree from the fork block
    • add from there, do not do any explicit rollback logic, just rebuild from known-good tree

Update Chain.update_chain_state_with()

  • Ensuring that transactions are processed in-block-order, then:
  • Update self.update_chain_state_with(joinsplit_data) to append any sprout::NoteCommitments
    self.update_chain_state_with(sapling_shielded_data_per_spend_anchor) to append any sapling::NoteCommitments
    self.update_chain_state_with(sapling_shielded_data_shared_anchor) to append any sapling::NoteCommitments
    self.update_chain_state_with(orchard_shielded_data) to append any orchard::NoteCommitments

Finalized state

FinalizedBlock

  • Do the sprout implementations for the finalized state

  • Add (sprout|sapling|orchard)_note_commitment_tree fields

    • These should be pre-computed by non-finalized state/ Chain, the output treestates as of the last transaction in the block

Database format:

  • Add a new RocksDB column family for each of Sprout, Sapling, and Orchard serialized note commitment trees for the finalized tip:
    • | sprout_incremental | BE32(height) | sprout::tree::NoteCommitmentTree | Delete |
    • | sapling_incremental | BE32(height) | sapling::tree::NoteCommitmentTree | Delete |
    • | orchard_incremental | BE32(height) | orchard::tree::NoteCommitmentTree | Delete |
  • Increment Zebra's state version

Additional context

See #2091 (comment) for further details

@dconnolly dconnolly added C-design Category: Software design work S-needs-design Status: Needs a design decision C-enhancement Category: This is an improvement S-needs-triage Status: A bug report needs triage labels Jun 30, 2021
@dconnolly dconnolly self-assigned this Jun 30, 2021
@dconnolly dconnolly added A-consensus Area: Consensus rule updates A-rust Area: Updates to Rust code and removed C-enhancement Category: This is an improvement labels Jun 30, 2021
@mpguerra mpguerra removed the S-needs-triage Status: A bug report needs triage label Jul 5, 2021
@conradoplg
Copy link
Collaborator

conradoplg commented Jul 16, 2021

After studying this and #2458 I see two options for the overall flow of the finalized state:

  • When finalizing a block, get the finalized tree from the DB (or a cached value?), add the note commitments from the block and then update the finalized tree and store the new anchor in the state. (This requires "redoing" the work on the tree since it was already done in the Chain, but we no longer have that treestate since the Chain only keeps the treestate of the chain tip)
  • Store each treestate (and matching anchor) in each PreparedBlock of the Chain, so that when a block is finalized we already have the data we need.

I think the first approach is better since it wastes less memory and the additional computation it requires is small. But it may lead to some code repetition / refactoring since we need to recalculate stuff that was already calculated in the Chain. I'll explore the first option for now.

Additional questions:

  • Should the trees be stored only when applicable? E.g. when commiting the genesis block, should we store empty Sprout, Sapling and Orchard trees, or should we first store them when applicable?

@teor2345
Copy link
Contributor

teor2345 commented Jul 19, 2021

After studying this and #2458 I see two options for the overall flow of the finalized state:

* When finalizing a block, get the finalized tree from the DB (or a cached value?), add the note commitments from the block and then update the finalized tree and store the new anchor in the state. (This requires "redoing" the work on the tree since it was already done in the Chain, but we no longer have that treestate since the Chain only keeps the treestate of the chain tip)

* Store each treestate (and matching anchor) in each PreparedBlock of the Chain, so that when a block is finalized we already have the data we need.

I think the first approach is better since it wastes less memory and the additional computation it requires is small. But it may lead to some code repetition / refactoring since we need to recalculate stuff that was already calculated in the Chain. I'll explore the first option for now.

I think the first option is fine - feel free to open a ticket for later optimisations!

Additional questions:

  • Should the trees be stored only when applicable? E.g. when commiting the genesis block, should we store empty Sprout, Sapling and Orchard trees, or should we first store them when applicable?

I'm not sure, here are some benefits and drawbacks of waiting until Sprout, Sapling, or NU5 activation:

Benefits of waiting until activation:

  • empty trees and their anchors are not available until the shielded pool is activated, so Zebra can't accidentally approve invalid transactions that depend on empty anchors
  • less CPU and fewer database writes

Drawbacks of waiting until activation:

  • Zebra might panic if it tries to verify a pool transfer before that shielded pool is activated (but that code would be a bug)
  • return types are more complex - they'll need to add an Option or Result layer

Based on those tradeoffs, I would prefer waiting - because verification correctness is a high priority.
What do you think?
Am I missing any benefits or drawbacks?

@conradoplg
Copy link
Collaborator

I'm not sure, here are some benefits and drawbacks of waiting until Sprout, Sapling, or NU5 activation:

Benefits of waiting until activation:

  • empty trees and their anchors are not available until the shielded pool is activated, so Zebra can't accidentally approve invalid transactions that depend on empty anchors
  • less CPU and fewer database writes

Drawbacks of waiting until activation:

  • Zebra might panic if it tries to verify a pool transfer before that shielded pool is activated (but that code would be a bug)
  • return types are more complex - they'll need to add an Option or Result layer

Based on those tradeoffs, I would prefer waiting - because verification correctness is a high priority.
What do you think?
Am I missing any benefits or drawbacks?

The code becomes a bit awkward yes, and I think it will make testing a bit difficult since it will be more difficult to "fake" some stuff. For example, with the finalized state, we will need to create a full chain with Sapling and Orchard "fake" activations in order to test it properly. I'm not sure if the current tests already do something similar, I need to check.

Other than that, I think it's a good idea. I've started to use Option for the trees in Chain. @dconnolly can you provide any feedback on this, if you can?

@teor2345
Copy link
Contributor

For example, with the finalized state, we will need to create a full chain with Sapling and Orchard "fake" activations in order to test it properly. I'm not sure if the current tests already do something similar, I need to check.

It might be worth looking at the nullifier tests from #2497, and the UTXO tests in one of my upcoming PRs. (Which has been delayed because I have to work out how to make other tests pass the UTXO checks.)

The nullifier tests commit the genesis block, then modify blocks 1 & 2 so they contain the nullifiers I want.

For the UTXO tests, I'll need to add test methods to the finalized and non-finalized states which add fake UTXOs.

We should be able to do something similar for the anchors, and just have test methods which add fake state?
As long as the non-test anchor initialisation code waits for the network upgrade, we should be fine in production. (We have existing integration tests which test that code in production.)

@conradoplg
Copy link
Collaborator

@teor2345: my concern is, for example, if we only create the Sapling tree in the Sapling activation height, then in order to test it in the finalized state then we will need to create more than 100K blocks which is unfeasible. From what I understand in the nullifiers test this doesn't happen because there is no check anywhere the test flow regarding the block height.


Regarding using Option, the big question is: when to change it from None to Some(Default::default()) (i.e. the empty tree). I see the following options:

  • Don't use Option. Always create the empty trees when instantiating objects (FinalizedState, Chain). This is the easiest. Drawbacks:
    • As @teor2345 pointed out this could hide a possible bug that attempts to use the trees before they're allowed to. However I think that's easier to validate elsewhere (e.g. don't allow Orchard transactions before Orchard activation).
    • More DB writes. This can de dealt with separately, though (e.g. only write the tree if it has changed).
  • Use Option and change from None to Some lazily (i.e. when the first note is added to the tree). Drawbacks:
    • Transactions that anchor on the empty tree won't work because the tree doesn't exist yet and its anchor wasn't added before. This would require special-case handling.
  • Use Option and change from None to Some on the correct activation height. Drawbacks:
    • Can make testing difficult since the test will need to generate blocks with the expected heights (seems impossible), or we will need to change the activation heights only for the tests.

After trying to use Option in the PR I'm more inclined to just not using it (first option).

@teor2345
Copy link
Contributor

teor2345 commented Jul 22, 2021

@teor2345: my concern is, for example, if we only create the Sapling tree in the Sapling activation height, then in order to test it in the finalized state then we will need to create more than 100K blocks which is unfeasible. From what I understand in the nullifiers test this doesn't happen because there is no check anywhere the test flow regarding the block height.

There is a check that the heights are in order, which is why all the tests that use the state service or finalized state start at a genesis block.

But you're right, there's no check for any absolute activation block height.

Regarding using Option, the big question is: when to change it from None to Some(Default::default()) (i.e. the empty tree). I see the following options:

  • Don't use Option. Always create the empty trees when instantiating objects (FinalizedState, Chain). This is the easiest. Drawbacks:

    • As @teor2345 pointed out this could hide a possible bug that attempts to use the trees before they're allowed to. However I think that's easier to validate elsewhere (e.g. don't allow Orchard transactions before Orchard activation).

We already validate orchard activation in the transaction::Verifier, so we don't need to repeat it in the state:

/// Verifies if a V5 `transaction` is supported by `network_upgrade`.
fn verify_v5_transaction_network_upgrade(
transaction: &Transaction,
network_upgrade: NetworkUpgrade,
) -> Result<(), TransactionError> {
match network_upgrade {
// Supports V5 transactions
NetworkUpgrade::Nu5 => Ok(()),

Zebra's V5 transactions are the only transaction version that contains orchard actions, so we won't be asking for orchard roots until the v5 transaction check passes.

Sprout was active from genesis, and sapling is active as of Zebra's Canopy checkpoint. So we don't need to do any activation validation for them.

* More DB writes. This can de dealt with separately, though (e.g. only write the tree if it has changed).

The DB is cached in RAM, so we don't need to limit writes in this PR. (Unless it's really easy.)
We can just open a follow-up ticket.

  • Use Option and change from None to Some lazily (i.e. when the first note is added to the tree). Drawbacks:

    • Transactions that anchor on the empty tree won't work because the tree doesn't exist yet and its anchor wasn't added before. This would require special-case handling.

I don't think special-case handling is a good idea - we'd have to get it right for both block validation and mempool validation. And test it for both cases.

  • Use Option and change from None to Some on the correct activation height. Drawbacks:

    • Can make testing difficult since the test will need to generate blocks with the expected heights (seems impossible), or we will need to change the activation heights only for the tests.

The correct activation height is actually the commit of the block before the activation block, because the anchors need to be available to mempool transactions and transactions in the activation block:
https://discord.com/channels/809218587167293450/811004767043977288/867179258525515838

This needs a spec clarification.

After trying to use Option in the PR I'm more inclined to just not using it (first option).

I agree, I think we'd be effectively duplicating validation we've done already in the transaction::Verifier. And it gets really tricky handling mempool and block transactions.

@dconnolly
Copy link
Contributor Author

@teor2345: my concern is, for example, if we only create the Sapling tree in the Sapling activation height, then in order to test it in the finalized state then we will need to create more than 100K blocks which is unfeasible. From what I understand in the nullifiers test this doesn't happen because there is no check anywhere the test flow regarding the block height.

Regarding using Option, the big question is: when to change it from None to Some(Default::default()) (i.e. the empty tree). I see the following options:

  • Don't use Option. Always create the empty trees when instantiating objects (FinalizedState, Chain). This is the easiest. Drawbacks:

    • As @teor2345 pointed out this could hide a possible bug that attempts to use the trees before they're allowed to. However I think that's easier to validate elsewhere (e.g. don't allow Orchard transactions before Orchard activation).
    • More DB writes. This can de dealt with separately, though (e.g. only write the tree if it has changed).
  • Use Option and change from None to Some lazily (i.e. when the first note is added to the tree). Drawbacks:

    • Transactions that anchor on the empty tree won't work because the tree doesn't exist yet and its anchor wasn't added before. This would require special-case handling.
  • Use Option and change from None to Some on the correct activation height. Drawbacks:

    • Can make testing difficult since the test will need to generate blocks with the expected heights (seems impossible), or we will need to change the activation heights only for the tests.

After trying to use Option in the PR I'm more inclined to just not using it (first option).

I'm fine with instantiating all the trees from the beginning, as teor said we have several checks in place to ensure they will not be populated (besides the Default::default() / empty tree) until needed, and I think that's fine, if any accidental queries to them come back with a result besides the root of the empty tree, that should correctly fail, unless that is the first sprout/sapling/orchard transaction correctly looking for the root of the empty tree.

@mpguerra mpguerra added this to the 2021 Sprint 22 milestone Oct 13, 2021
@teor2345 teor2345 changed the title Design: Validate note commitment trees in state Validate note commitment trees in state Oct 13, 2021
@mpguerra mpguerra assigned upbqdn and unassigned upbqdn Nov 8, 2021
@teor2345 teor2345 changed the title Validate note commitment trees in state Validate note commitment trees in non-finalized state Nov 8, 2021
@mpguerra mpguerra assigned dconnolly and unassigned dconnolly Nov 15, 2021
@dconnolly dconnolly removed the S-needs-design Status: Needs a design decision label Nov 22, 2021
@upbqdn
Copy link
Member

upbqdn commented Dec 13, 2021

It looks like we have implemented all tasks in this issue.

@dconnolly
Copy link
Contributor Author

Done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-consensus Area: Consensus rule updates A-rust Area: Updates to Rust code C-design Category: Software design work
Projects
None yet
Development

No branches or pull requests

5 participants