feat: Add `prehash_compare_key` to allow proving nonexistence in sparse trees #136

plaidfinch · 2023-02-22T21:45:48Z

Motivation

There are several important merkle trees that hash keys and sparsely store values at positions indicated by the value of the hash, such as the Cosmos Sparse Merkle Tree and Libra/Diem/Aptos/Penumbra's Jellyfish Merkle Tree. We would like this type of tree to be compatible with ICS23.

For existence proofs, this is already the case; however, currently, when verifying an ICS23 non-existence proof, keys are compared based on the preimage of the hash function used to prehash them, even when prehash_key is set in the ProofSpec. This means that when a ProofSpec uses prehash_key, nonexistence proofs cannot be verified even if a client correctly generates them as per the "spirit" of the ICS23 nonexistence proof format. We believe that this is a bug, but this change is framed backwards-compatibly, as noted below, so that it can be applied universally without friction.

See here for other discussion of the impact of this issue:

Support removal of key/value pairs penumbra-zone/jmt#24 (comment) (Note: We think that ibc-go can't make any assumption about the keys being hashed or unhashed, since that's not part of its domain; that's the domain of the proof spec. However, this description of the issue is pointing at the same problem that this PR resolves.)
Key comparison in proof generated from SMT #83 is the corresponding issue for the SMT (Note: In the current SMT implementation, keys are prehashed entirely externally to the proof and its spec, and the relationship between key and hashed key is proven externally to the proof spec. We believe that this ends up adding implementation burden for counterparty chains, who have to check the additional constraint outside of ICS23 verification that the key being proven matches the hash claimed. Instead of this, the JMT proof spec could be adapted to the SMT, allowing the SMT to internalize the prehashing and avoid the difficulty of managing prehashing externally to the proof spec. @cwgoes: does this seem like something you would want to do?)

Summary of changes

This PR introduces a single new boolean parameter to the top-level ProofSpec: prehash_compare_key. When set to true, this flag causes keys to be consistently compared lexicographically according to their hashes within nonexistence proof verification, using the same hash function as specified by the already-extant prehash_key field.
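As a rough illustration of what the flag changes, the sketch below orders the same keys two ways: by their raw bytes and by their hashes. The `spec` struct, `comparisonKey`, and `sortedBy` are hypothetical stand-ins written for this example (they are not the ICS23 library's API), and SHA-256 is assumed as the prehash function.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
)

// spec is a simplified, hypothetical stand-in for the relevant ProofSpec fields.
type spec struct {
	prehashSHA256     bool // stands in for prehash_key = SHA256 in the leaf spec
	prehashCompareKey bool // stands in for the new opt-in flag
}

// comparisonKey returns the bytes used when ordering keys.
// The key is hashed only if the flag is set and a prehash is configured.
func comparisonKey(s spec, key []byte) []byte {
	if !s.prehashCompareKey || !s.prehashSHA256 {
		return key
	}
	sum := sha256.Sum256(key)
	return sum[:]
}

// sortedBy returns a copy of keys sorted by their comparison keys.
func sortedBy(s spec, keys []string) []string {
	out := append([]string(nil), keys...)
	sort.Slice(out, func(i, j int) bool {
		a := comparisonKey(s, []byte(out[i]))
		b := comparisonKey(s, []byte(out[j]))
		return bytes.Compare(a, b) < 0
	})
	return out
}

func main() {
	keys := []string{"apple", "banana", "cherry", "date"}
	fmt.Println("flag unset:", sortedBy(spec{prehashSHA256: true}, keys))
	fmt.Println("flag set:  ", sortedBy(spec{prehashSHA256: true, prehashCompareKey: true}, keys))
}
```

A tree that stores leaves at positions derived from the hash keeps its neighbors in the second order, which is why a verifier that compares in the first order cannot check the neighbors of a missing key.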

This is intended as an alternative to #88

We believe this change is the minimum necessary to unblock nonexistence proofs on key-prehashed structures. #88 also attempts to solve this problem, namely that for nonexistence proofs only, any tree that uses prehashing cannot prove nonexistence because the nonexistence proof verifier compares the keys lexically by their preimage, ignoring the prehash_key field of the specification.

However, as currently implemented, #88 does not accomplish this goal. There are 3 issues with #88 as a solution to this problem, which this PR addresses:

1. It does not actually change the verification procedure for nonexistence proofs to lexically compare all keys by their hash; rather, it lexically compares only the hash of the input key to unhashed neighbor keys. This will not work correctly for any general cryptographic hash function, because comparing a cryptographic hash to an element of its preimage is, effectively, random.
2. It doesn't merely supply a flag opting into prehashed key comparison; it allows you to specify an entirely different hash function to use for the comparison. We believe this is not representative of any known use case: if keys are prehashed using hash function H, then a different hash function H' can't be used to compute a meaningful comparison on keys, so this is a more complex specification than necessary.
3. It also introduces a prehash_compared_value field, which is not necessary to fix this specific issue, and like its prehash_compared_key field, this too is a specification of an arbitrary hash function, which, for the same reasons as above, we believe is overly general.
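The first issue above can be made concrete with a small experiment. The sketch below is an assumption-laden toy, not code from either PR: `neighbors`, `consistentCheck`, and `mixedCheck` are hypothetical names, SHA-256 is assumed as the prehash, and the "tree" is just a list of keys ordered by hash. It picks the true hash-order neighbors of a missing key, then checks them once with hashes on every side and once with the hashed key against unhashed neighbors.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
)

// h returns the SHA-256 digest of key.
func h(key string) []byte {
	sum := sha256.Sum256([]byte(key))
	return sum[:]
}

// neighbors returns the stored keys whose hashes bracket H(missing).
// An empty string means there is no neighbor on that side.
// It assumes missing is not itself one of the stored keys.
func neighbors(stored []string, missing string) (left, right string) {
	sorted := append([]string(nil), stored...)
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(h(sorted[i]), h(sorted[j])) < 0
	})
	target := h(missing)
	for _, k := range sorted {
		if bytes.Compare(h(k), target) < 0 {
			left = k
		} else {
			right = k
			break
		}
	}
	return left, right
}

// consistentCheck compares hashes on every side of the comparison.
func consistentCheck(left, key, right string) bool {
	okLeft := left == "" || bytes.Compare(h(left), h(key)) < 0
	okRight := right == "" || bytes.Compare(h(key), h(right)) < 0
	return okLeft && okRight
}

// mixedCheck compares the hashed key against unhashed neighbor keys.
func mixedCheck(left, key, right string) bool {
	okLeft := left == "" || bytes.Compare([]byte(left), h(key)) < 0
	okRight := right == "" || bytes.Compare(h(key), []byte(right)) < 0
	return okLeft && okRight
}

func main() {
	stored := []string{"alice", "bob", "carol", "dave", "erin"}
	const trials = 100
	mixedAccepted := 0
	for i := 0; i < trials; i++ {
		missing := fmt.Sprintf("missing-%d", i)
		l, r := neighbors(stored, missing)
		if !consistentCheck(l, missing, r) {
			panic("consistent check rejected genuine neighbors")
		}
		if mixedCheck(l, missing, r) {
			mixedAccepted++
		}
	}
	fmt.Printf("consistent check accepted %d/%d, mixed check accepted %d/%d\n",
		trials, trials, mixedAccepted, trials)
}
```

The consistent check accepts every genuine pair of neighbors by construction. The mixed check's answer depends only on how an arbitrary digest happens to sort against ASCII key bytes, so it carries no information about the tree.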

Backwards-compatibility

This is a backwards-compatible change, as it requires opt-in via setting the prehash_compare_key flag to true in the ProofSpec. All existing ProofSpecs will continue to behave identically.

Thanks

We would like to thank everyone who participated in discussion and development of #88 and other work towards solutions to this issue. We feel very confident that this is the right way forward, but we want to ensure that our contribution does not make anyone feel like their work is unappreciated; to the contrary, the discussion and work leading up to this PR, by everyone, has been necessary to clarify our understanding of the issue. Thanks everyone for your help!

prehash_compare_key indicates whether to compare the keys lexicographically according to their _hashed_ values (implied by the hash function given by prehash_key). This is required for nonexistence proofs in proof specs that use prehashing.

AdityaSripal · 2023-02-23T11:21:20Z

Thank you for including the justifications to include this change over #88, all of them make sense to me.

Will review today!

AdityaSripal

Agreed with this direction!! But it looks like this PR is incomplete?

My understanding is that the key in ExistenceProof.Key will still be the preimage, hence why you also hash the rightKey and leftKey before comparison.

However, you are not hashing the passed-in key for existence proofs before comparison?

There is a check on ics23.go that is changed in #88 that seems like it would also need to be changed here.

473d9a5#diff-39a55415fc38a90b85c05989f293fda3b7ee126010cfe63838fd9b8441e47ed1R39

473d9a5#diff-39a55415fc38a90b85c05989f293fda3b7ee126010cfe63838fd9b8441e47ed1R59

I've also suggested a different naming for the field that at least me and Colin found more intuitive

AdityaSripal · 2023-02-23T11:26:38Z

proto/cosmos/ics23/v1/proofs.proto

+  // prehash_compare_key is a flag that indicates whether to use the
+  // prehash_key specified by LeafOp to compare lexical ordering of keys for
+  // non-existence proofs.
+  bool prehash_compare_key = 5;


I think a clearer name might be helpful here similar to our review of #88.

So effectively the database is providing an interface to the application: store.Set(key, value)

and underneath the hood it is merklizing this pair by storing the hash of the application-provided key as the key in the merkle tree.

Thus, we liked the nomenclature: appKey and treeKey.

The appKey is the key known to the application, the treeKey is the key stored in the tree.

In SMT, the treeKey is the hash of the appKey, while in iAVL they are the same.

So I think prehash_app_key would be clearer here as a name. And we can use the docs here to explain there may be a difference between appKey and treeKey for some trees
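To make the appKey/treeKey distinction concrete, here is a minimal sketch of a store whose interface takes application keys while entries are recorded under tree keys. All names (`store`, `treeKeyFunc`, `identityTreeKey`, `hashedTreeKey`) are hypothetical, a Go map stands in for the Merkle tree, and SHA-256 is assumed for the hashed case.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// treeKeyFunc maps an application key (appKey) to the key stored in the tree (treeKey).
type treeKeyFunc func(appKey []byte) []byte

// identityTreeKey models trees where appKey and treeKey are the same (IAVL-like).
func identityTreeKey(appKey []byte) []byte { return appKey }

// hashedTreeKey models trees where the treeKey is the hash of the appKey (SMT-like).
func hashedTreeKey(appKey []byte) []byte {
	sum := sha256.Sum256(appKey)
	return sum[:]
}

// store is a toy key-value store; a map stands in for the Merkle tree.
type store struct {
	toTreeKey treeKeyFunc
	entries   map[string][]byte // indexed by treeKey
}

func newStore(f treeKeyFunc) *store {
	return &store{toTreeKey: f, entries: map[string][]byte{}}
}

// Set records value under the treeKey derived from appKey.
func (s *store) Set(appKey, value []byte) {
	s.entries[string(s.toTreeKey(appKey))] = value
}

// Get looks up the value by deriving the treeKey from appKey.
func (s *store) Get(appKey []byte) ([]byte, bool) {
	v, ok := s.entries[string(s.toTreeKey(appKey))]
	return v, ok
}

func main() {
	appKey := []byte("balances/alice")
	for name, f := range map[string]treeKeyFunc{
		"identity": identityTreeKey,
		"hashed":   hashedTreeKey,
	} {
		s := newStore(f)
		s.Set(appKey, []byte("42"))
		v, _ := s.Get(appKey)
		fmt.Printf("%s: treeKey=%x value=%s\n", name, s.toTreeKey(appKey), v)
	}
}
```

The application sees the same `Set`/`Get` interface in both cases; only the ordering of entries inside the tree differs, which is exactly what nonexistence proofs depend on.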

This is a bit of a bikeshed, but if there's going to be a distinction between app_key and tree_key, wouldn't it be even clearer to call the field compare_tree_key? Then compare_tree_key = false means it compares the app key, and compare_tree_key = true means it compares the tree key.

Ok now that I understand more of this field.

Perhaps we can call it: prehash_key_before_comparison

The way I read the current field name at the moment, I expect it to be a HashOp type not a boolean.

cc: @colin-axner

AdityaSripal · 2023-02-23T11:29:29Z

go/proof.go

@@ -204,6 +204,14 @@ func (p *ExistenceProof) CheckAgainstSpec(spec *ProofSpec) error {
 	return nil
 }

+func keyForComparison(spec *ProofSpec, key []byte) []byte {


You also will have to use this function on getExistProofForKey and getNonexistProofForKey correct?

473d9a5#diff-39a55415fc38a90b85c05989f293fda3b7ee126010cfe63838fd9b8441e47ed1R39

Yes, good catch

but only on getNonexistProof, because the lexical comparison is only relevant for non-existence proofs

plaidfinch · 2023-02-23T22:34:31Z

Thanks for the careful review @AdityaSripal!

My understanding is that the key in ExistenceProof.Key will still be the preimage, hence why you also hash the rightKey and leftKey before comparison.

However, you are not hashing the passed-in key for existence proofs before comparison?

In existence proofs, the only comparison is for equality, not for lexical ordering, which means that comparing either the preimage of the hashed key, or the hash, will work equally well (because we can assume that H(x) == H(y) implies x == y up to computational intractability). To ensure that this change is minimal, we do not alter anything about existence proof verification, because it is not necessary.

By contrast, nonexistence proofs require lexical ordering comparison on keys (or their hashes), which this PR implements, for nonexistence proofs only.
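The difference between the two kinds of comparison can be checked directly. In the sketch below (hypothetical helper names, SHA-256 assumed as H), equality on preimages and equality on hashes always give the same answer, while the ordering of two preimages need not match the ordering of their hashes.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// h returns the SHA-256 digest of b.
func h(b []byte) []byte {
	sum := sha256.Sum256(b)
	return sum[:]
}

// equalityAgrees reports whether x == y and H(x) == H(y) give the same answer.
// Barring a hash collision, this is always true.
func equalityAgrees(x, y []byte) bool {
	return bytes.Equal(x, y) == bytes.Equal(h(x), h(y))
}

// orderingAgrees reports whether x and y sort the same way as H(x) and H(y).
func orderingAgrees(x, y []byte) bool {
	return bytes.Compare(x, y) == bytes.Compare(h(x), h(y))
}

func main() {
	keys := [][]byte{[]byte("a"), []byte("b"), []byte("c"), []byte("d")}
	pairs, orderMismatches := 0, 0
	for i := range keys {
		for j := range keys {
			if !equalityAgrees(keys[i], keys[j]) {
				panic("equality on preimages and hashes disagreed")
			}
			if i < j {
				pairs++
				if !orderingAgrees(keys[i], keys[j]) {
					orderMismatches++
				}
			}
		}
	}
	fmt.Printf("equality agreed on all pairs; ordering disagreed on %d of %d pairs\n",
		orderMismatches, pairs)
}
```

This is why existence verification can be left untouched: it only ever asks the equality question.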

There is a check on ics23.go that is changed in #88 that seems like it would also need to be changed here.

Only one of the two links is relevant to the changes we propose here: we do not change anything about existence proofs, so the change noted from #88 to existence proofs is not relevant:

473d9a5#diff-39a55415fc38a90b85c05989f293fda3b7ee126010cfe63838fd9b8441e47ed1R39

With regard to the second piece, the below line of code adds an extra layer of hashing to the key at the top level, which is neither necessary nor sufficient to verify nonexistence proofs:

473d9a5#diff-39a55415fc38a90b85c05989f293fda3b7ee126010cfe63838fd9b8441e47ed1R59

Instead, our implementation exhibits the same prehashing behavior in Go as the original Rust implementation: it changes the lexical ordering comparisons isLeft and isRight to operate according to the hashing described by the prehash_key field, if and only if the opt-in flag in the top-level proof spec is set. Good catch noticing that we missed this part of the Go implementation: this is now fixed, per the diff linked above in this paragraph.

I've also suggested a different naming for the field that at least me and Colin found more intuitive

We're fine with whatever naming works well for you, provided it is well-documented. We are partial to compare_tree_key, as suggested by @hdevalence, but de gustibus non est disputandum.

A high-level note on where some of this confusion may originate: there are two notions of "prehashed key comparison", as below:

As in Add new parameters to compare the given key and value for SMT proof #88, you assume that the top-level key in the proof has an extra layer of hashing already applied to it externally to the proof and proof spec. This is not actually useful for making nonexistence proofs work correctly for sparse merkle trees, for the reasons noted in the original issue.
As in this PR, you don't make any exogenous assumptions about the hashing that has been already applied to the top-level key outside of the proof and proof spec; rather, you incorporate the already-specified prehashing function into the verification of nonexistence proofs, by causing lexical comparison of keys within verifying those proofs to operate on the hash of keys rather than the keys themselves. The fact that this wasn't already the case for nonexistence proofs is, we posit, a bug: this change can be seen not as a true piece of new functionality for ICS23, but a rectification of a gap between the intent of the system design and its implementation. For example, existence proofs correctly respect the prehash_key field; for this to be inconsistently applied in different places throughout verification is not useful in any context.

It's worth keeping in mind that you can still add something like #88 on top of this change, but it's not necessary to do so in order to unblock the ability to use ICS23 nonexistence proofs with sparse merkle trees, just as you already can correctly use ICS23 existence proofs with sparse merkle trees. We suspect that #88 is not as useful once this PR is merged, because we think (but are not certain) that its originating motivation came from an attempt to work around the very bug that this PR resolves.

AdityaSripal · 2023-02-24T17:12:36Z

proto/cosmos/ics23/v1/proofs.proto

+  // prehash_compare_key is a flag that indicates whether to use the
+  // prehash_key specified by LeafOp to compare lexical ordering of keys for
+  // non-existence proofs.
+  bool prehash_compare_key = 5;


Ok now that I understand more of this field.

Perhaps we can call it: prehash_key_before_comparison

The way I read the current field name at the moment, I expect it to be a HashOp type not a boolean.

cc: @colin-axner

AdityaSripal · 2023-02-24T17:13:50Z

go/proof.go

+	if !spec.PrehashCompareKey {
+		return key
+	}
+	hash, _ := doHashOrNoop(spec.LeafSpec.PrehashKey, key)


Since this is using the PrehashKey defined in the leafspec and the key in the existenceProof is supposed to now be unhashed for SMT proofs, I think the SMT spec needs to be changed to have PrehashKey=SHA_256 instead of NO_HASH

I think the SMT should mirror the JMT's proof spec:

https://github.com/penumbra-zone/jmt/blob/main/src/tree/ics23_impl.rs#L219

Right, so the SMT proof spec in this repo needs to be changed

The code that generates the proofs will also need to be updated following the spec change

ghost · 2023-02-28T10:34:24Z

Backwards compatibility notwithstanding, it seems like applications (if they can) should always set prehash_key_before_comparison to true (even in the absence of actual prehashing) because the false behavior is (as described) inconsistent/not intuitive and/or "buggy"; if so, perhaps this can be documented in ProofSpec.

cwgoes · 2023-02-28T13:23:06Z

@plaidfinch Thanks for this! I can tell you guys have thought this through. I think that this solution should work for us as well.

AdityaSripal · 2023-02-28T14:06:44Z

Backwards compatibility notwithstanding, it seems like applications (if they can) should always set prehash_key_before_comparison to true (even in the absence of actual prehashing) because the false behavior is (as described) inconsistent/not intuitive and/or "buggy"; if so, perhaps this can be documented in ProofSpec.

I don't see how this is true. If the key is being hashed in order to create the LeafHash, but the key is ordered lexicographically on the key itself (IAVL, Trie, etc), then prehash_key_before_comparison should be false

AdityaSripal

Ack pending test fixes.

Great work to everyone involved!!

avahowell · 2023-02-28T22:31:32Z

Backwards compatibility notwithstanding, it seems like applications (if they can) should always set prehash_key_before_comparison to true (even in the absence of actual prehashing) because the false behavior is (as described) inconsistent/not intuitive and/or "buggy"; if so, perhaps this can be documented in ProofSpec.

Yes, the only reason to have a flag in this case is to maintain strict backwards-compatibility in my opinion. If you have prehash_key_before_comparison: true on a tree with no prehashing, it's the same thing as having prehash_key_before_comparison: false, since this change just inherits the hashing specified by prehash_key.

Where the change would be breaking without the flag is in the case where you use prehash_key, but don't want to compare keys according to the hashing algorithm specified by prehash_key. As we mentioned in the OP, this condition seems like a bug to us. But we kept the flag to maintain strict backwards compatibility
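The combinations discussed in this exchange can be enumerated with a short sketch. It assumes a two-valued `hashOp` (no hash or SHA-256) and a `comparisonKey` helper modeled loosely on the behavior described in this thread; both are illustrative names, not the library's definitions.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// hashOp is a hypothetical, reduced stand-in for the spec's hash operation enum.
type hashOp int

const (
	noHash hashOp = iota
	sha256Op
)

// applyHash hashes key with op, or returns it unchanged for noHash.
func applyHash(op hashOp, key []byte) []byte {
	if op == sha256Op {
		sum := sha256.Sum256(key)
		return sum[:]
	}
	return key
}

// comparisonKey returns the bytes used for ordering comparisons:
// the raw key unless the flag is set, in which case the prehash is applied.
func comparisonKey(prehashKey hashOp, flag bool, key []byte) []byte {
	if !flag {
		return key
	}
	return applyHash(prehashKey, key)
}

func main() {
	key := []byte("some/app/key")
	names := map[hashOp]string{noHash: "NO_HASH", sha256Op: "SHA256"}
	for _, op := range []hashOp{noHash, sha256Op} {
		for _, flag := range []bool{false, true} {
			usesRaw := bytes.Equal(comparisonKey(op, flag, key), key)
			fmt.Printf("prehash_key=%-7s flag=%-5t compares raw key: %t\n",
				names[op], flag, usesRaw)
		}
	}
}
```

Only the SHA256-with-flag case changes the comparison. With no prehash the flag is a no-op, and a tree that hashes keys for the leaf hash but orders them by raw key (the IAVL-style case raised above) keeps the flag false.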

avahowell · 2023-03-21T01:24:28Z

I've updated this code to include the changes for the smt_spec required for the SMT. The next step is to update the SMT's proof-generating code itself to be compatible with this new spec.

codecov · 2023-03-27T10:48:10Z

Codecov Report

Patch coverage: 56.56% and project coverage change: +11.19 🎉

Comparison is base (f4deb05) 39.35% compared to head (64a5c0e) 50.54%.

Additional details and impacted files

@@             Coverage Diff             @@
##           master     #136       +/-   ##
===========================================
+ Coverage   39.35%   50.54%   +11.19%     
===========================================
  Files          16       23        +7     
  Lines        6286     8034     +1748     
  Branches       85       86        +1     
===========================================
+ Hits         2474     4061     +1587     
- Misses       3456     3616      +160     
- Partials      356      357        +1

Flag	Coverage Δ
go	`38.26% <27.45%> (+0.15%)`	⬆️
rust	`92.15% <65.43%> (?)`
typescript	`42.03% <68.18%> (+0.21%)`	⬆️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
go/proofs.pb.go	`31.65% <0.00%> (+0.13%)`	⬆️
rust/src/cosmos.ics23.v1.rs	`17.18% <0.00%> (ø)`
js/src/generated/codecimpl.js	`25.29% <58.00%> (-0.02%)`	⬇️
go/proof.go	`59.34% <75.00%> (+0.91%)`	⬆️
rust/src/verify.rs	`94.88% <86.66%> (ø)`
rust/src/api.rs	`97.11% <95.12%> (ø)`
go/ics23.go	`88.11% <100.00%> (ø)`
js/src/ics23.ts	`71.42% <100.00%> (+0.59%)`	⬆️
js/src/proofs.ts	`78.50% <100.00%> (+1.05%)`	⬆️
js/src/testvectors.spec.ts	`99.04% <100.00%> (+0.05%)`	⬆️
... and 1 more

... and 3 files with indirect coverage changes


colin-axner

ACK for the concept and the go code changes. I like this solution a lot! Excellent work y'all :)

go/proof.go

go/ics23.go

plaidfinch · 2023-03-28T21:33:22Z

@AdityaSripal We've fixed the SMT proof spec in this PR, and we are happy to fix the SMT test vectors, but we are not sure how these test vectors are being generated. Could you shed some light on that so we can finish fixing the tests and push this PR over the finish line?

As of now, we think we've tracked down how they're being generated to here. Can you confirm that this was the method used?

avahowell · 2023-03-28T23:18:18Z

I updated the SMT test vectors so that they are correctly generated per this change, and they should verify now. I generated the test vectors using a modified version of the SMT in the store/v2 branch of the Cosmos SDK, which I can PR if desired. Here's the branch; the change required for the SMT on the generation side is very small, since it already stores a PreimageMap:

avahowell/cosmos-sdk@a3a049b

I believe that with the most recent changes, all the review comments have been addressed

AdityaSripal · 2023-03-29T11:44:03Z

Hi yes, please make a PR to the SDK to update the proofgen code there

colin-axner

Wahoo! Fantastic work to everyone involved! 🎉

Nice job updating the test vectors! I wasn't sure either how the test data was generated

(approving for go changes)

colin-axner · 2023-03-30T15:43:56Z

Would it be possible to eventually have test data vectors from Penumbra's JMT? It appears the referenced code generation has been removed from the SDK and I'm not sure the SMT implementation is being maintained

plaidfinch · 2023-03-30T15:58:10Z

Would it be possible to eventually have test data vectors from Penumbra's JMT? It appears the referenced code generation has been removed from the SDK and I'm not sure the SMT implementation is being maintained.

We could make some test vectors, yeah. If we match the ad-hoc JSON serialization format used by the SMT generation code, then I think it'd be as simple as checking proofs against them using the JMT spec instead of the SMT spec.

Would you want these test vectors included in this PR before merging it, or should we make a separate PR with them? It'd require a bit of implementation work to add the code to the JMT to make it spit out the test vectors, so my personal preference would be to merge this PR and make a separate PR later to swap in the JMT test vectors for the SMT ones.

romac

Rust version looks good! I just left one question regarding the removal of the derived Eq instance and a small nit, feel free to ignore.

rust/codegen/src/main.rs

rust/src/api.rs

colin-axner · 2023-04-05T15:36:29Z

Would you want these test vectors included in this PR before merging it, or should we make a separate PR with them? It'd require a bit of implementation work to add the code to the JMT to make it spit out the test vectors, so my personal preference would be to merge this PR and make a separate PR later to swap in the JMT test vectors for the SMT ones.

Let's do a separate pr 👍 I don't see a rush in adding the test vectors, just want to make sure that long term ics23 has an up to date testing framework/tests 😄


plaidfinch · 2023-04-07T21:20:32Z

The failing check appears to be a spurious service error for the Go code coverage tool. Could someone re-run it please? If I understand correctly, this branch is ready to be merged, and a new release cut, at this point. Anything else we can help out with?

romac · 2023-04-08T16:32:04Z

All good on the Rust side! I can do a release of the Rust crate on Tuesday if nobody beats me to it.

plaidfinch · 2023-04-08T17:47:38Z

All good on the Rust side! I can do a release of the Rust crate on Tuesday if nobody beats me to it.

Fantastic! Thanks all for your help bringing this to the finish line! 🎉

hdevalence · 2023-04-10T18:26:13Z

Hey, just checking in on this -- are we still good to merge this PR and cut a release?

Olshansk · 2023-04-13T02:55:15Z

Support for SMTs is going to go a long way. Thanks to everyone involved here!

add prehash_compare_key

6d7c1d5

prehash_compare_key indicates whether to compare the keys lexicographically according to their _hashed_ values (implied by the hash function given by prehash_key). This is required for nonexistence proofs in proof specs that use prehashing.

avahowell mentioned this pull request Feb 22, 2023

Add new parameters to compare the given key and value for SMT proof #88

Closed

plaidfinch changed the title ~~Add prehash_compare_key to allow proving nonexistence in sparse trees~~ feat: Add prehash_compare_key to allow proving nonexistence in sparse trees Feb 22, 2023

AdityaSripal requested changes Feb 23, 2023

use keyForComparison in getNonExistProofForKey

5b13544

apply keyForComparison in typescript verifyNonExistence

03cd990

hdevalence mentioned this pull request Feb 24, 2023

ICS23 support penumbra-zone/jmt#18

Closed

AdityaSripal requested changes Feb 24, 2023

avahowell added 2 commits February 27, 2023 13:40

update smt spec to use prehash_sha256

048493a

rename prehash_compare_key -> prehash_key_before_comparison

603f973

AdityaSripal reviewed Feb 28, 2023

colin-axner reviewed Mar 27, 2023

go/proof.go (outdated)

go/ics23.go

plaidfinch added 3 commits March 28, 2023 17:09

fix: Make it compile in no_std by removing naming of Vec

c9a1984

doc: Add comment describing keyForComparison/key_for_comparison

ef2958b

doc: Elaborate more on why prehashing before comparison

0d29478

use updated smt test vectors

cc3a76d

AdityaSripal approved these changes Mar 29, 2023

colin-axner approved these changes Mar 30, 2023

colin-axner mentioned this pull request Mar 30, 2023

Add documentation on how to generate new test data vectors #138

Open

romac approved these changes Apr 4, 2023

rust/codegen/src/main.rs

rust/src/api.rs (outdated)

plaidfinch and others added 4 commits April 5, 2023 11:52

Avoid cloning leaf_spec when prehashing keys for comparison

c997325

co-authored-by: @romac Co-authored-by: Romain Ruetschi <romain.ruetschi@gmail.com>

Fix missing Eq derive on prost build (reverting)

69a38e8

Revert "Fix missing Eq derive on prost build (reverting)"

b549e62

This reverts commit 69a38e8.

Fix fmt issue in CI, fix generated code to contain Eq impl

b97ba6f

romac approved these changes Apr 6, 2023

add testing for SMT proofs in js

64a5c0e

conorsch mentioned this pull request Apr 7, 2023

Tracking issue: IBC Implementation penumbra-zone/penumbra#454

Closed

32 tasks

crodriguezvega merged commit cea74ba into cosmos:master Apr 10, 2023

kevinji mentioned this pull request Apr 19, 2023

Bump ics23 to v0.10 to get prehash_key_before_comparison cosmos/ibc-rs#640

Closed

nicolaslara mentioned this pull request Sep 21, 2023

IBC minor version bump for penumbra support? osmosis-labs/osmosis#6482

Closed

crodriguezvega mentioned this pull request Sep 22, 2023

Key comparison in proof generated from SMT #83

Closed

crodriguezvega added a commit that referenced this pull request Sep 25, 2023

cherrypick #136

2405973

This was referenced Sep 25, 2023

Release v4.5 cosmos/ibc-go#4710

Closed

feat: cherrypick #136 #199

Merged

crodriguezvega added a commit that referenced this pull request Sep 28, 2023

cherrypick #136 (#199)

a532519
