perf: replace BN254 final exp by a class equivalence check #1143

Merged
merged 18 commits into from
Jul 3, 2024

@yelhousni yelhousni commented May 24, 2024

Description

For pairing checks of the form ∏ e(Pi, Qi) == 1 we replace the final exponentiation by a class equivalence check, as described in https://eprint.iacr.org/2024/640.pdf (Section 4). We compute the residue witness c in a hint and check in-circuit that f * w == c^λ, where f = ∏ MillerLoop(Pi, Qi), w is a hinted scalar that ensures f*w is a cubic residue, and λ = 6u+2+q^3-q^2+q is the optimal exponent (used instead of r), with u the curve seed. Exponentiation by 6u+2 is done using an optimized addition chain.
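As a rough sanity check of the exponent above (a sketch, not code from this PR), the following computes λ from the standard BN polynomials and confirms that r divides it, which is what lets λ stand in for r. The seed value and the BN formulas for q and r are assumptions taken from the usual BN254 parameterization, not from this thread; the quantity d it prints is the one the remarks further down say equals 3 for BN254.

```python
from math import gcd

# BN254 seed (assumed value; verify against gnark-crypto before relying on it).
u = 4965661367192848881

# Standard BN parameterization of the base field size q and subgroup order r.
q = 36 * u**4 + 36 * u**3 + 24 * u**2 + 6 * u + 1
r = 36 * u**4 + 36 * u**3 + 18 * u**2 + 6 * u + 1

# Optimal exponent from the description: lambda = 6u+2 + q^3 - q^2 + q.
lam = 6 * u + 2 + q**3 - q**2 + q

# Exponent of the usual final exponentiation, h = (q^12 - 1) / r.
h, rem = divmod(q**12 - 1, r)

# m is the cofactor of r in lambda; d measures how far m is from being
# invertible modulo h, i.e. which extra root (here a cube root) is needed.
m = lam // r
d = gcd(m, h)

print("r divides q^12 - 1:", rem == 0)
print("r divides lambda:  ", lam % r == 0)
print("d = gcd(lambda / r, h) =", d)
```

The divisibility of λ by r is a polynomial identity in u, so it holds for any BN seed; only the concrete value of d depends on the curve.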

The paper suggests including c^(6u+2) in the multi-Miller loop computation so that the mutualized squarings in the loop also cover the squarings needed for c^(6u+2), but I don't see how this can be done since we need f to compute c. One thing we could do later is push f and all the hint outputs to the torus, or at least to the cyclotomic subgroup, by doing the easy part of the final exponentiation, so that the FE elimination trick can be carried out with torus-based arithmetic over Fp6 (or at least with cyclotomic squarings over Fp12).

Edit: the squaring mutualization is possible (see #1143 (comment)). Cyclotomic subgroup / torus push becomes inefficient now that squarings are for "free".

Small typo in Alg. 4 of the paper (the modified Tonelli-Shanks): it should read p^k-1 = 3^n * s instead of p-1 = 3^r * s.
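To make the corrected decomposition concrete, here is a small sketch that splits p^k - 1 into 3^n * s with s coprime to 3, for k = 12. The helper name `split_power_of_three` is hypothetical, and the BN254 seed and field-size formula are assumed from the usual parameterization rather than taken from this thread.

```python
def split_power_of_three(x):
    """Return (n, s) such that x == 3**n * s and 3 does not divide s."""
    n = 0
    while x % 3 == 0:
        x //= 3
        n += 1
    return n, x


# Assumed BN254 parameters (standard BN polynomial for the base field size).
u = 4965661367192848881
p = 36 * u**4 + 36 * u**3 + 24 * u**2 + 6 * u + 1
k = 12  # embedding degree, so the cube root is taken in Fp12

n, s = split_power_of_three(p**k - 1)
print("n =", n)
print("bit length of s =", s.bit_length())
```

The decomposition has to be taken over the order of the multiplicative group of the extension field, p^k - 1, because that is the group in which the cube root is extracted.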

Type of change

  • New feature (non-breaking change which adds functionality)

How has this been tested?

The existing TestPairingCheckTestSolve works for this.

How has this been benchmarked?

Compared to the previous torus-based final exponentiation, this PR saves in the ECPair precompile:

  • 1,378,371 scs (initially 807,034 scs, before the squaring mutualization described below saved a further 571,337 scs)

Checklist:

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I did not modify files generated from templates
  • golangci-lint does not output errors locally
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@yelhousni yelhousni self-assigned this May 24, 2024
@yelhousni yelhousni changed the title Perf/eliminate final exp perf: replace final exp by a class equivalence check May 24, 2024
@zilong-dai

cool

@yelhousni yelhousni changed the title perf: replace final exp by a class equivalence check perf: replace BN254 final exp by a class equivalence check Jun 3, 2024

yelhousni commented Jun 5, 2024

> The paper suggests including c^(6u+2) in the multi-Miller loop computation so that the mutualized squarings in the loop also cover the squarings needed for c^(6u+2), but I don't see how this can be done since we need f to compute c.

Actually, we can feed the hint with the points instead of the Miller function to compute the residue witness. This means we compute the Miller loop out-of-circuit (in the hint), derive the residue witness c from it, and then, in-circuit, recompute the Miller loop alongside c^(6u+2). We do this by initializing the in-circuit Miller loop accumulator to 1/c instead of 1, multiplying it by 1/c when the loop bit is 1 and by c when the bit is -1. Both c and 1/c are provided by the hint. Finally, we check that w * f * (1/c)^(q^3-q^2+q) == 1, only this time f = ∏ MillerLoop(Pi, Qi) * (1/c)^(6u+2).
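The accumulator trick can be illustrated in isolation with a toy sketch (not the gnark implementation). It works modulo a prime instead of in Fp12, omits the line evaluations entirely, and uses a generic NAF signed-digit expansion of 6u+2, which is an assumption; the actual loop counter used in-circuit may differ. The function names are hypothetical. It shows only that starting the accumulator at 1/c and multiplying by 1/c or c on nonzero digits yields (1/c)^(6u+2) with the same squarings the loop already performs.

```python
def naf(e):
    """Signed-digit (NAF) expansion of e > 0, least-significant digit first."""
    digits = []
    while e > 0:
        if e & 1:
            d = 2 - (e % 4)  # +1 if e = 1 mod 4, -1 if e = 3 mod 4
            e -= d
        else:
            d = 0
        digits.append(d)
        e >>= 1
    return digits


def inv_pow_shared_squarings(c, e, p):
    """Return (1/c)^e mod p with one squaring per loop iteration."""
    c_inv = pow(c, -1, p)
    digits = naf(e)          # the top digit of a NAF is always +1
    acc = c_inv              # accumulator starts at 1/c instead of 1
    for d in reversed(digits[:-1]):
        acc = acc * acc % p  # the squaring a Miller loop performs anyway
        if d == 1:
            acc = acc * c_inv % p
        elif d == -1:
            acc = acc * c % p
    return acc


# Toy inputs: assumed BN254 seed, base field prime as modulus, arbitrary c.
u = 4965661367192848881
p = 36 * u**4 + 36 * u**3 + 24 * u**2 + 6 * u + 1
e = 6 * u + 2
c = 123456789
result = inv_pow_shared_squarings(c, e, p)
print(result == pow(c, -e, p))
```

In the real loop the same accumulator would also absorb the line evaluations at each step, which is why the extra exponentiation costs only the multiplications by c and 1/c.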

However, the out-of-circuit Miller loop must match the in-circuit version exactly to yield the same residue witness. Previously, gnark-crypto used doubleStep and addStep for the affine pairing, while here we use doubleAndAddStep (2P+Q). This resulted in a mismatch of lines (line through 2P instead of P+Q), which is now fixed in Consensys/gnark-crypto#506.

This saves an additional 571,337 scs in the ECPAIR precompile.


yelhousni commented Jun 12, 2024

Some remarks:

  • The efficiency of the technique requires d to be small. For BN254 d=3 (requiring a cube root) and for BW6-761 it's even better, d=1 (perf: replace BW6-761 final exp by a class equivalence check #1155), but for BLS families d=u-1 where u is the curve seed (64 bits for BLS12).
  • The technique is efficient only when the pairing product is one, i.e. ∏ e(Pi, Qi) == 1. When it is not, we end up with some residue w and need to check in-circuit that w^r == 1. Of course, most applications need the ==1 check, but some future applications might require the !=1 check, e.g. some kind of fraud proofs.
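A toy analogue of the equivalence check may help here (a sketch under simplifying assumptions, not the in-circuit logic). It uses the multiplicative group of F_43 with p - 1 = r * h, r = 7 and h = 6, so the exponent is just r and no extra factor is needed, loosely mirroring the d=1 case. The function names are hypothetical. It shows that a witness c with c^r == f exists exactly when the "final exponentiation" f^h equals 1, which is why the witness check only succeeds for the ==1 case.

```python
# Toy parameters: the group F_43^* has order 42 = r * h with gcd(r, h) = 1.
p, r, h = 43, 7, 6


def passes_final_exp(f):
    """The classical check: f^h == 1 in F_p."""
    return pow(f, h, p) == 1


def residue_witness(f):
    """Return c with c^r == f (mod p) if such a c exists, else None."""
    # Candidate r-th root, computed as f^(1/r mod h); valid since gcd(r, h) = 1.
    c = pow(f, pow(r, -1, h), p)
    return c if pow(c, r, p) == f else None


for f in range(1, p):
    c = residue_witness(f)
    status = "witness c = %d" % c if c is not None else "no witness"
    print(f, passes_final_exp(f), status)
```

Computing the witness costs an exponentiation, but that work happens in the hint; the verifier side only raises c to the power r and compares.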

@feltroidprime

> Some remarks:
>
> * The efficiency of the technique requires `d` to be small. For BN254 `d=3` (requiring a cube root) and for BW6-761 it's even better `d=1` ([perf: replace BW6-761 final exp by a class equivalence check #1155](https://github.com/Consensys/gnark/pull/1155)) but for BLS families `d=u-1` where `u` is the curve seed (64 bits for BLS12).

Moreover 1/m' mod h doesn't exist for BLS12-381 ...

@yelhousni

> Moreover 1/m' mod h doesn't exist for BLS12-381 ...

yes gcd(m',h)=u-1


@ivokub ivokub left a comment

Looks good. I got the idea of having last ML and FinalExp in the same call.

@yelhousni

> Looks good. I got the idea of having last ML and FinalExp in the same call.

We can actually do this for the whole multi-pairing, but for the precompile we need fixed circuits regardless of the number of pairs, so I applied the trick only to the last Miller loop.

@yelhousni yelhousni merged commit d94368b into master Jul 3, 2024
7 checks passed
@yelhousni yelhousni deleted the perf/eliminate-finalExp branch July 3, 2024 14:24