Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compute notes root #64

Merged
merged 116 commits into from
Nov 29, 2023
Merged

Compute notes root #64

merged 116 commits into from
Nov 29, 2023

Conversation

plafer
Copy link
Contributor

@plafer plafer commented Nov 14, 2023

Closes: #67

Builds on #55

This builds the notes tree as discussed in 0xPolygonMiden/crypto#220, i.e. as a tree of depth 20, where batches' created notes tree roots are inserted at depth 8.

@plafer plafer marked this pull request as ready for review November 16, 2023 19:24
@plafer plafer requested a review from bobbinth November 16, 2023 19:24
@bobbinth bobbinth changed the base branch from main to plafer-block-builder November 20, 2023 22:36
Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Looks good! Not a full review yet, but I left a few comments inline - the main one is about the structure of leaf nodes in the batch created note SMT.

block-producer/src/block_builder/errors.rs Outdated Show resolved Hide resolved
block-producer/src/batch_builder/mod.rs Outdated Show resolved Hide resolved
block-producer/src/batch_builder/mod.rs Outdated Show resolved Hide resolved
Comment on lines 60 to 80
let created_notes_smt = {
let created_notes: Vec<_> = txs
.iter()
.flat_map(|tx| tx.created_notes())
.map(|note_envelope| note_envelope.note_hash())
.collect();

if created_notes.len() > MAX_NUM_CREATED_NOTES_PER_BATCH {
return Err(BuildBatchError::TooManyNotesCreated(created_notes.len()));
}

SimpleSmt::with_leaves(
CREATED_NOTES_SMT_DEPTH,
created_notes.into_iter().enumerate().map(|(idx, note_hash)| {
(
idx.try_into().expect("already checked not for too many notes"),
note_hash.into(),
)
}),
)?
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is slightly off. I believe the leaves in created note SMT are computed as hash(note_hash || note_metadata). This also means that the length of the path from not root to note hash is 21.

Also, nit: I'd probably move this logic into a separate helper function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will adjust. But why aren't we instead using a tree of depth 20, but where the leaf nodes' hash is computed as hash(note_hash || note_metadata) (easily computed from a NoteEnvelope)?

The current structure is a bit awkward, since

  • the depth of the tree is not linked with the number of notes anymore
    • i.e. 2^21 != max number of notes
  • leaves always come in pair, as opposed to being independent (as we typically think of leaves)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why aren't we instead using a tree of depth 20, but where the leaf nodes' hash is computed as hash(note_hash || note_metadata) (easily computed from a NoteEnvelope)?

Good question. Let's discuss this - maybe indeed using a tree of depth 20 will simplify things.

Copy link
Contributor Author

@plafer plafer Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - the batches now have depth 13, and each NoteEnvelope is stored as 2 nodes in the tree.

Also, nit: I'd probably move this logic into a separate helper function.

I personally disagree with this. IMO, unless a piece of code is used more than once (or is really long/complex), it shouldn't be its own function. Mainly because naming is hard; so you're unlikely to find good names for the helper functions, and ultimately as a reader of the code, I need to scroll out of context to go check the details of the function.

I actually find that Rust making code blocks expressions (such that you can define vars as let v = { ... }; is one of Rust's best "small" features, exactly because you no longer need to define many helper functions - the indentation helps visually separate the logic necessary to define v, and temporary variables are not accessible outside the block. This is IMO an improvement on javascript's self-executing functions, where I would write var v = (function() { ... })(); to achieve the same result (i.e. avoiding the need to name helper functions, temporary variable cleanup, and keeping the logic near its context).

Would love your thoughts on this - ideally, we'd adopt a consistent convention across our code bases (i.e. use of blocks vs use of helper functions for cases like this one).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - interesting! I haven't thought about it this way before.

I usually try to move code out into dedicated functions when it encompass some non-trivial well-encapsulated logic. I think this improves readability because it clearly defines interfaces and reduces the number of things someone needs to understand when reading code.

In the above example, understanding how notes SMT is built is not material to understanding how a transaction batch is instantiated (we just need to know that we need to build a note SMT) - so, in my mind it made sense to move it to a separate function. On the other hand, the code is simple enough that we can probably keep it as is.

Overall, I try not to have functions which span more than 50 - 60 lines - though, it doesn't apply to this case either as the whole function is just 40.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this improves readability because it clearly defines interfaces and reduces the number of things someone needs to understand when reading code.

Right, this is probably the best counterpoint to my argument, and yes that's fair.

On the other hand, the code is simple enough that we can probably keep it as is.

Right, it's now quite simple. But I agree that if it were a bit more complex, I would move to a helper function, and I'll adopt that convention moving forward.

@plafer
Copy link
Contributor Author

plafer commented Nov 22, 2023

Update: I inlined the masm function mtree_8_set as it was trivial and actually made the code a bit harder to understand

@plafer plafer mentioned this pull request Nov 22, 2023
3 tasks
Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you! I left a couple of very minor comments inline.

Comment on lines 60 to 80
let created_notes_smt = {
let created_notes: Vec<_> = txs
.iter()
.flat_map(|tx| tx.created_notes())
.map(|note_envelope| note_envelope.note_hash())
.collect();

if created_notes.len() > MAX_NUM_CREATED_NOTES_PER_BATCH {
return Err(BuildBatchError::TooManyNotesCreated(created_notes.len()));
}

SimpleSmt::with_leaves(
CREATED_NOTES_SMT_DEPTH,
created_notes.into_iter().enumerate().map(|(idx, note_hash)| {
(
idx.try_into().expect("already checked not for too many notes"),
note_hash.into(),
)
}),
)?
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why aren't we instead using a tree of depth 20, but where the leaf nodes' hash is computed as hash(note_hash || note_metadata) (easily computed from a NoteEnvelope)?

Good question. Let's discuss this - maybe indeed using a tree of depth 20 will simplify things.

block-producer/src/batch_builder/mod.rs Show resolved Hide resolved
Comment on lines 103 to 114
# Stack: [<account root inputs>, <note root inputs>]
proc.main.2
exec.compute_account_root loc_storew.0 dropw
#=> [<note root inputs>]

exec.compute_note_root loc_storew.1 dropw
#=> [ ]

# Load output on stack
loc_loadw.1 padw loc_loadw.0
#=> [ ACCOUNT_ROOT, NOTE_ROOT]
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to have a dedicated main function rather than put the same code into begin ... end section? The only thing that would need to change is that we'd use global memory (i.e., via mem_storew) rather than local procedure memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked to let the assembler figure out where to store my objects as opposed to me needing to define my own constants. Basically, I prefer reading loc_storew.0 than mem_storew.ACCOUNT_ROOT_PTR, especially when used as one liners:

// less verbose
loc_loadw.1 padw loc_loadw.0

vs

// more verbose, but at the same time, a bit more clear...
mem_loadw.NOTE_ROOT_PTR padw mem_loadw.ACCOUNT_ROOT_PTR

This is not a strong preference though; I'm happy to change it to use global memory instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use inline constants with mem_storew as well. Basically, to re-write the main procedure as begin ... end block, we'd need to do this:

begin
    exec.compute_account_root mem_storew.0 dropw
    #=> [<note root inputs>]

    exec.compute_note_root mem_storew.1 dropw
    #=> [ ]

    # Load output on stack
    mem_loadw.1 padw mem_loadw.0
    #=> [ ACCOUNT_ROOT, NOTE_ROOT]
end

Btw, a slightly more efficient way to write this would be:

begin
    exec.compute_account_root mem_storew.0 dropw
    #=> [<note root inputs>]

    exec.compute_note_root
    #=> [NOTE_ROOT]

    # Load output on stack
    padw mem_loadw.0
    #=> [ ACCOUNT_ROOT, NOTE_ROOT]
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use inline constants with mem_storew as well.

ah I didn't know you could do that!

Fixed

block-producer/src/block_builder/prover/mod.rs Outdated Show resolved Hide resolved
block-producer/src/block_builder/prover/mod.rs Outdated Show resolved Hide resolved
Base automatically changed from plafer-block-builder to main November 27, 2023 15:28
@plafer plafer merged commit 43ab472 into main Nov 29, 2023
@bobbinth bobbinth deleted the plafer-note-root branch December 24, 2023 09:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Block Kernel: Compute notes root
2 participants