Refactor Public Inputs and add them to the transcript #134

davidnevadoc · 2022-04-14T11:18:54Z

In this PR we have added the Public Inputs to the transcript as the first step of the prove and verify algorithms. They have been added as the evaluations of the PI polynomial over the used domain (n field elements).

Close #133

Update `construct_dense_pi_vec` so it returns the evaluation representation of the PI polynomial over the used domain. This means that the result vector will always have a power of two length.

Co-authored-by: Joshua Fitzgerald <joshuabfitzgerald@gmail.com>

davidnevadoc · 2022-04-14T11:28:16Z

This method may not be best way of solving the issue so discussions on alternative solutions are welcome!

plonk-core/src/proof_system/proof.rs

plonk-core/src/proof_system/prover.rs

ghost · 2022-04-14T23:03:13Z

Halo 2 also includes the circuit description in the transcript, which I think is another good level of defense in depth. In general, my understanding of the Trail of Bits recommendation is that pretty much everything the verifier has access to should go in the transcript.

ghost · 2022-04-15T06:06:28Z

Per Alistair's suggestion on ZK study club telegram, it's not a bad idea to add the first element of the KZG SRS to the transcript as well. This is a bit tricky as it's not generic over the poly commit scheme

Update `construct_dense_pi_vec` so it returns the evaluation representation of the PI polynomial over the used domain. This means that the result vector will always have a power of two length.

Co-authored-by: Joshua Fitzgerald <joshuabfitzgerald@gmail.com>

…o add-pi-transcript

The new implmenentation uses BTreeMap to ensure no duplicates are introduced and to easily get an ordered representation. This is particualrly helpuful to get a unique representation to add to the transcript.

CPerezz · 2022-04-28T10:45:19Z

@joebebel makes sense.

Note that for the ordering, the hash will take care too. As we store the PI in a BTreeSet so therefore the order is preserved and therefore, ir prevents any order maleability.

lopeetall · 2022-04-28T13:25:31Z

The only issue I see with serializing the sparse form of the PI is that we would need to provide a spec for the serialization so that other implementations of the Verifier serialize PI in the same way.

Instead, why don't we just use the SRS to get a binding commitment the PI polynomial and add it to the transcript using the existing Merlin API wrapper? It may not be the most efficient solution as it requires doing another MSM, but we can still take advantage of the sparseness of the PI by skipping all the zeros. And since all we need is a binding commitment we won't need to use an FFT before MSM.

This method would use our existing tools without modification and is consistent with the rest of the protocol.

bhgomes · 2022-04-28T14:11:20Z

For example, there is another problem with the code (besides the Box::leak distraction) - it violates the property that no label is a prefix of another label.

Why is prefixing an issue? The documentation on merlin says that it's a requirement of the transcript being "parseable". What does this mean? We aren't parsing anything here.

CPerezz · 2022-04-28T14:55:13Z

The only issue I see with serializing the sparse form of the PI is that we would need to provide a spec for the serialization so that other implementations of the Verifier serialize PI in the same way.

In the verifier (Usually a Smart Contract) will probably be much more expensive to do the MSM than the hash. That's why I suggested that @lopeetall.

For now, we could start with something easy that solves the bug. And then move on into a better design.

Something like this as purposed by @davidnevadoc in private.

// For each PI != 0
transcript.append(b"pi_pos", &F::from(*pos));
transcript.append(b"pi_val", &F::from(val));

This is not the most optimal. But works for now. And we can discuss the approach we want to go for in a follow-up issue.

lopeetall · 2022-04-28T16:25:39Z

Can we reuse those labels pi_pos and pi_val though?

From the Merlin website:

The labels should be fixed, not defined at runtime, as runtime data should be in the message body, while the labels are part of the protocol definition. A sufficient condition for the transcript to be parseable is that the labels should be distinct and none should be a prefix of any other.

ghost · 2022-04-28T22:59:46Z

Why is prefixing an issue? The documentation on merlin says that it's a requirement of the transcript being "parseable". What does this mean? We aren't parsing anything here.

The absolute requirement is that the transcript is binding to the messages in it, that is, it is not feasible to find a different transcript with different labels/messages that gives the same challenges. The property that merlin promises is that if labels are prefix free, then the transcript is unambiguously parseable (a mathematical definition - there is no computation to parse the transcript), and if the transcript is parseable then it is message binding.

It could be that the transcript is message binding even if the labels are not prefix-free, but this requires an additional argument and can't be assumed from the guarantees that merlin provides. It's just much simpler to defer to merlin's requirements.

Instead, why don't we just use the SRS to get a binding commitment the PI polynomial and add it to the transcript using the existing Merlin API wrapper? It may not be the most efficient solution as it requires doing another MSM, but we can still take advantage of the sparseness of the PI by skipping all the zeros. And since all we need is a binding commitment we won't need to use an FFT before MSM.

Or, just serialize the PI polynomial? Though I think your original suggestion of serializing the PI directly is probably fine as well. I think the MSM is expensive.

bhgomes · 2022-04-28T23:49:37Z

Why is prefixing an issue? The documentation on merlin says that it's a requirement of the transcript being "parseable". What does this mean? We aren't parsing anything here.

The absolute requirement is that the transcript is binding to the messages in it, that is, it is not feasible to find a different transcript with different labels/messages that gives the same challenges. The property that merlin promises is that if labels are prefix free, then the transcript is unambiguously parseable (a mathematical definition - there is no computation to parse the transcript), and if the transcript is parseable then it is message binding.

It could be that the transcript is message binding even if the labels are not prefix-free, but this requires an additional argument and can't be assumed from the guarantees that merlin provides. It's just much simpler to defer to merlin's requirements.

If this is the case then why are we letting the user specify any labeling scheme at all, just choose a prefix-free sequence of labels and use each one in turn. From an engineering perspective, I don't want to have to think about these safety requirements if there exists a protocol that takes care of it for me, especially when these labels don't actually help me accomplish anything inside of my application (in this case PLONK). This is the API I actually care about:

trait Transcript {
    fn append<M>(&mut self, message: M);
    fn challenge<C>(&mut self) -> C;
}

In which scenarios is knowing or specifying the labels manually an important feature to have?

ghost · 2022-04-29T00:22:22Z

In which scenarios is knowing or specifying the labels manually an important feature to have?

Yeah, you are absolutely correct that from an implementation perspective, it is entirely possible to do without labels, or without 'static lifetimes on the labels, etc etc. It's probably entirely possible to use the exact "pi{}" format labels too.

The reason it's important, though, is because "Frozen Heart" is just the latest in a long history of Fiat-Shamir implementation mistakes (mostly of the "omit data" kind). The idea is that we should write down - very explicitly - the exact Plonk interactive protocol being used, and then duplicate it exactly with merlin, down to the labels and everything. The goal is not only 0 mistakes in converting the interactive proof to non-interactive, but not even the possibility of mistakes.

Public Inputs are serialized and added to the transcript under the label "pi". The struct PI is responsible of guaranteeing a unique representation of each possible public input vector.

davidnevadoc · 2022-04-29T17:44:36Z

After reading the discussion and learning about the correct usage of labels in merlin I believe that we should open a different PR to tackle this issue, as there are several parts of the code where they are misused. For example we are adding the commitment for the permutation with the label z and also adding the challenge used for lookup table RLC with the label zeta, which breaks the prefix rule. We are also using Box::leak() in the same way I did in the first version of the PR when adding the evaluation of custom gate constraints.

Having the merlin requirements in mind I think the simplest solution is to serialize PI and add it to the transcript with the label "pi". This struct is responsible of holding a one-to-one representation of each possible public input vector. In particular we use BTreeMap to guarantee the inputs are stored in order and we only store non-zero values, as zeros are implicit when no value is provided.

hdevalence · 2022-05-01T02:14:08Z

After reading the discussion and learning about the correct usage of labels in merlin I believe that we should open a different PR to tackle this issue, as there are several parts of the code where they are misused.

The docs on Transcript Protocols might be helpful; the recommended way to use the library is for each protocol to define a protocol-specific extension trait that maps the protocol-level transcript actions ("do a domain separation", "append a public key", "append a commitment to whatever", etc) operating on higher-level types into the underlying calls to the byte-oriented Merlin API. So in this case, for instance, it might make sense to have a transcript protocol that handled the serialization and labeling, and presented higher-level methods to the rest of the crate.

bhgomes

Just a few points, and a possible bug, otherwise LGTM.

plonk-core/src/circuit.rs

plonk-core/src/constraint_system/composer.rs

plonk-core/src/constraint_system/helper.rs

plonk-core/src/proof_system/pi.rs

plonk-core/src/proof_system/proof.rs

CPerezz

LGTM.

I like that refactor of the Public Inputs TBH. Although I think we can do more in the future with that and improve them much more.

As for the concerns of @bhgomes , they're legit. But at the same time, panicking due to a PI re-assignation is not something crazy. As it means you're not building a correct circuit. Which is not worth then compiling or making proofs for.

Would like the approval of @bhgomes and maybe one more before we merge TBH as this topic lead to a broad discussion and lots of members were involved.

LukePearson1

LGTM. Feel free to merge

davidnevadoc and others added 2 commits April 14, 2022 12:57

Modify construct_dense_pi_vec

b9f59d2

Update `construct_dense_pi_vec` so it returns the evaluation representation of the PI polynomial over the used domain. This means that the result vector will always have a power of two length.

Add PI to the transcript

5e54d13

Co-authored-by: Joshua Fitzgerald <joshuabfitzgerald@gmail.com>

davidnevadoc assigned lopeetall and davidnevadoc Apr 14, 2022

davidnevadoc added the T-bug Type: bug label Apr 14, 2022

davidnevadoc requested review from CPerezz and iquerejeta April 14, 2022 12:17

CPerezz requested changes Apr 14, 2022

View reviewed changes

plonk-core/src/proof_system/proof.rs Outdated Show resolved Hide resolved

plonk-core/src/proof_system/prover.rs Outdated Show resolved Hide resolved

davidnevadoc and others added 9 commits April 25, 2022 20:58

Modify construct_dense_pi_vec

21e379f

Update `construct_dense_pi_vec` so it returns the evaluation representation of the PI polynomial over the used domain. This means that the result vector will always have a power of two length.

Add PI to the transcript

eb3d6d8

Co-authored-by: Joshua Fitzgerald <joshuabfitzgerald@gmail.com>

Merge branch 'add-pi-transcript' of github.com:rust-zkp/ark-plonk int…

3aade1a

…o add-pi-transcript

New public inputs implementation

7fa4eda

Change PI implementation

bd2b3a8

The new implmenentation uses BTreeMap to ensure no duplicates are introduced and to easily get an ordered representation. This is particualrly helpuful to get a unique representation to add to the transcript.

New error type InvalidPublicInputValue

f718a86

Remove leftover

c894342

Add non-zero public inputs to the transcript

62d7ade

Adapt to new PI struct

d1a3737

davidnevadoc changed the base branch from master to MockProver April 26, 2022 21:16

davidnevadoc changed the base branch from MockProver to master April 26, 2022 21:16

davidnevadoc changed the title ~~Add Public Inputs to the Transcript~~ Refactor Public Inputs and add them to the transcript Apr 26, 2022

davidnevadoc marked this pull request as ready for review April 26, 2022 21:21

davidnevadoc force-pushed the add-pi-transcript branch 2 times, most recently from cbc7e97 to b7bc1a4 Compare April 26, 2022 21:59

davidnevadoc requested review from CPerezz, bhgomes and LukePearson1 April 26, 2022 22:05

Fix intradoc links & lint checks

56e947a

Add serialized PI to the transcript.

e9969e8

Public Inputs are serialized and added to the transcript under the label "pi". The struct PI is responsible of guaranteeing a unique representation of each possible public input vector.

davidnevadoc requested a review from a user May 6, 2022 07:27

davidnevadoc added 3 commits May 6, 2022 09:40

Remove Deref from PI

a2bbc62

Replace PrimeField with FftField.

cb7512a

Add basic tests

2ed6f87

davidnevadoc force-pushed the add-pi-transcript branch from 92e8557 to 2ed6f87 Compare May 6, 2022 08:14

bhgomes requested changes May 7, 2022

View reviewed changes

davidnevadoc added 6 commits May 9, 2022 14:14

Stray comments

2bfcc7f

Use derivative::Derivative instead of derive

d5bfc17

Drop unnecessary comment

8e96b40

Rename PI to PublicInputs

a4853bb

Fix extend

0ec0bb5

Fix intradoc links

d764967

davidnevadoc requested a review from lopeetall May 11, 2022 14:09

CPerezz approved these changes May 11, 2022

View reviewed changes

bhgomes approved these changes May 11, 2022

View reviewed changes

LukePearson1 approved these changes May 12, 2022

View reviewed changes

davidnevadoc merged commit 5323cd5 into master May 13, 2022

davidnevadoc deleted the add-pi-transcript branch May 13, 2022 07:10

davidnevadoc mentioned this pull request May 18, 2022

Proper transcript handling #145

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor Public Inputs and add them to the transcript #134

Refactor Public Inputs and add them to the transcript #134

davidnevadoc commented Apr 14, 2022

davidnevadoc commented Apr 14, 2022

ghost commented Apr 14, 2022

ghost commented Apr 15, 2022

CPerezz commented Apr 28, 2022

lopeetall commented Apr 28, 2022 •

edited

Loading

bhgomes commented Apr 28, 2022 •

edited

Loading

CPerezz commented Apr 28, 2022 •

edited

Loading

lopeetall commented Apr 28, 2022

ghost commented Apr 28, 2022

bhgomes commented Apr 28, 2022 •

edited

Loading

ghost commented Apr 29, 2022

davidnevadoc commented Apr 29, 2022 •

edited

Loading

hdevalence commented May 1, 2022

bhgomes left a comment

CPerezz left a comment

LukePearson1 left a comment

Refactor Public Inputs and add them to the transcript #134

Refactor Public Inputs and add them to the transcript #134

Conversation

davidnevadoc commented Apr 14, 2022

davidnevadoc commented Apr 14, 2022

ghost commented Apr 14, 2022

ghost commented Apr 15, 2022

CPerezz commented Apr 28, 2022

lopeetall commented Apr 28, 2022 • edited Loading

bhgomes commented Apr 28, 2022 • edited Loading

CPerezz commented Apr 28, 2022 • edited Loading

lopeetall commented Apr 28, 2022

ghost commented Apr 28, 2022

bhgomes commented Apr 28, 2022 • edited Loading

ghost commented Apr 29, 2022

davidnevadoc commented Apr 29, 2022 • edited Loading

hdevalence commented May 1, 2022

bhgomes left a comment

Choose a reason for hiding this comment

CPerezz left a comment

Choose a reason for hiding this comment

LukePearson1 left a comment

Choose a reason for hiding this comment

lopeetall commented Apr 28, 2022 •

edited

Loading

bhgomes commented Apr 28, 2022 •

edited

Loading

CPerezz commented Apr 28, 2022 •

edited

Loading

bhgomes commented Apr 28, 2022 •

edited

Loading

davidnevadoc commented Apr 29, 2022 •

edited

Loading