-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compute notes root #64
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Looks good! Not a full review yet, but I left a few comments inline - the main one is about the structure of leaf nodes in the batch created note SMT.
let created_notes_smt = { | ||
let created_notes: Vec<_> = txs | ||
.iter() | ||
.flat_map(|tx| tx.created_notes()) | ||
.map(|note_envelope| note_envelope.note_hash()) | ||
.collect(); | ||
|
||
if created_notes.len() > MAX_NUM_CREATED_NOTES_PER_BATCH { | ||
return Err(BuildBatchError::TooManyNotesCreated(created_notes.len())); | ||
} | ||
|
||
SimpleSmt::with_leaves( | ||
CREATED_NOTES_SMT_DEPTH, | ||
created_notes.into_iter().enumerate().map(|(idx, note_hash)| { | ||
( | ||
idx.try_into().expect("already checked not for too many notes"), | ||
note_hash.into(), | ||
) | ||
}), | ||
)? | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is slightly off. I believe the leaves in created note SMT are computed as hash(note_hash || note_metadata)
. This also means that the length of the path from not root to note hash is 21.
Also, nit: I'd probably move this logic into a separate helper function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will adjust. But why aren't we instead using a tree of depth 20, but where the leaf nodes' hash is computed as hash(note_hash || note_metadata)
(easily computed from a NoteEnvelope
)?
The current structure is a bit awkward, since
- the depth of the tree is not linked with the number of notes anymore
- i.e.
2^21 != max number of notes
- i.e.
- leaves always come in pair, as opposed to being independent (as we typically think of leaves)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why aren't we instead using a tree of depth 20, but where the leaf nodes' hash is computed as
hash(note_hash || note_metadata)
(easily computed from aNoteEnvelope
)?
Good question. Let's discuss this - maybe indeed using a tree of depth 20 will simplify things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed - the batches now have depth 13, and each NoteEnvelope
is stored as 2 nodes in the tree.
Also, nit: I'd probably move this logic into a separate helper function.
I personally disagree with this. IMO, unless a piece of code is used more than once (or is really long/complex), it shouldn't be its own function. Mainly because naming is hard; so you're unlikely to find good names for the helper functions, and ultimately as a reader of the code, I need to scroll out of context to go check the details of the function.
I actually find that Rust making code blocks expressions (such that you can define vars as let v = { ... };
is one of Rust's best "small" features, exactly because you no longer need to define many helper functions - the indentation helps visually separate the logic necessary to define v
, and temporary variables are not accessible outside the block. This is IMO an improvement on javascript's self-executing functions, where I would write var v = (function() { ... })();
to achieve the same result (i.e. avoiding the need to name helper functions, temporary variable cleanup, and keeping the logic near its context).
Would love your thoughts on this - ideally, we'd adopt a consistent convention across our code bases (i.e. use of blocks vs use of helper functions for cases like this one).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - interesting! I haven't thought about it this way before.
I usually try to move code out into dedicated functions when it encompass some non-trivial well-encapsulated logic. I think this improves readability because it clearly defines interfaces and reduces the number of things someone needs to understand when reading code.
In the above example, understanding how notes SMT is built is not material to understanding how a transaction batch is instantiated (we just need to know that we need to build a note SMT) - so, in my mind it made sense to move it to a separate function. On the other hand, the code is simple enough that we can probably keep it as is.
Overall, I try not to have functions which span more than 50 - 60 lines - though, it doesn't apply to this case either as the whole function is just 40.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this improves readability because it clearly defines interfaces and reduces the number of things someone needs to understand when reading code.
Right, this is probably the best counterpoint to my argument, and yes that's fair.
On the other hand, the code is simple enough that we can probably keep it as is.
Right, it's now quite simple. But I agree that if it were a bit more complex, I would move to a helper function, and I'll adopt that convention moving forward.
Update: I inlined the masm function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thank you! I left a couple of very minor comments inline.
let created_notes_smt = { | ||
let created_notes: Vec<_> = txs | ||
.iter() | ||
.flat_map(|tx| tx.created_notes()) | ||
.map(|note_envelope| note_envelope.note_hash()) | ||
.collect(); | ||
|
||
if created_notes.len() > MAX_NUM_CREATED_NOTES_PER_BATCH { | ||
return Err(BuildBatchError::TooManyNotesCreated(created_notes.len())); | ||
} | ||
|
||
SimpleSmt::with_leaves( | ||
CREATED_NOTES_SMT_DEPTH, | ||
created_notes.into_iter().enumerate().map(|(idx, note_hash)| { | ||
( | ||
idx.try_into().expect("already checked not for too many notes"), | ||
note_hash.into(), | ||
) | ||
}), | ||
)? | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why aren't we instead using a tree of depth 20, but where the leaf nodes' hash is computed as
hash(note_hash || note_metadata)
(easily computed from aNoteEnvelope
)?
Good question. Let's discuss this - maybe indeed using a tree of depth 20 will simplify things.
# Stack: [<account root inputs>, <note root inputs>] | ||
proc.main.2 | ||
exec.compute_account_root loc_storew.0 dropw | ||
#=> [<note root inputs>] | ||
|
||
exec.compute_note_root loc_storew.1 dropw | ||
#=> [ ] | ||
|
||
# Load output on stack | ||
loc_loadw.1 padw loc_loadw.0 | ||
#=> [ ACCOUNT_ROOT, NOTE_ROOT] | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to have a dedicated main
function rather than put the same code into begin ... end
section? The only thing that would need to change is that we'd use global memory (i.e., via mem_storew
) rather than local procedure memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked to let the assembler figure out where to store my objects as opposed to me needing to define my own constants. Basically, I prefer reading loc_storew.0
than mem_storew.ACCOUNT_ROOT_PTR
, especially when used as one liners:
// less verbose
loc_loadw.1 padw loc_loadw.0
vs
// more verbose, but at the same time, a bit more clear...
mem_loadw.NOTE_ROOT_PTR padw mem_loadw.ACCOUNT_ROOT_PTR
This is not a strong preference though; I'm happy to change it to use global memory instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use inline constants with mem_storew
as well. Basically, to re-write the main
procedure as begin ... end
block, we'd need to do this:
begin
exec.compute_account_root mem_storew.0 dropw
#=> [<note root inputs>]
exec.compute_note_root mem_storew.1 dropw
#=> [ ]
# Load output on stack
mem_loadw.1 padw mem_loadw.0
#=> [ ACCOUNT_ROOT, NOTE_ROOT]
end
Btw, a slightly more efficient way to write this would be:
begin
exec.compute_account_root mem_storew.0 dropw
#=> [<note root inputs>]
exec.compute_note_root
#=> [NOTE_ROOT]
# Load output on stack
padw mem_loadw.0
#=> [ ACCOUNT_ROOT, NOTE_ROOT]
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use inline constants with mem_storew as well.
ah I didn't know you could do that!
Fixed
Closes: #67
Builds on #55
This builds the notes tree as discussed in 0xPolygonMiden/crypto#220, i.e. as a tree of depth 20, where batches' created notes tree roots are inserted at depth 8.