Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[plinth][plc][api] Draft implementation of 'modularExponentiation' builtin #6348

Merged
merged 1 commit into from
Aug 14, 2024

Conversation

bezirg
Copy link
Contributor

@bezirg bezirg commented Jul 26, 2024

This is a draft implementation.
Supporting GHC8.10 for this builtin needs extra work; so we discussed with @zliu41 and postpone it for a bit later.
So as not to make the whole CI not build anymore with GHC8.10, the implementation of this builtin always fails for GHC8.10. The builtin is not enabled to run yet on the chain.

Pre-submit checklist:

  • Branch
    • Tests are provided (if possible)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
    • Changelog fragments have been written (if appropriate)
    • Relevant tickets are mentioned in commit messages
    • Formatting, PNG optimization, etc. are updated
  • PR
    • (For external contributions) Corresponding issue exists and is linked in the description
    • Targeting master unless this is a cherry-pick backport
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested

@bezirg bezirg changed the title Bezirg/modexp [plinth][plc][api] Draft implementation of 'modularExponentiation' builtin Jul 26, 2024
@bezirg bezirg force-pushed the bezirg/modexp branch 2 times, most recently from 1a9f8ce to 5cfb8de Compare July 26, 2024 15:15
@bezirg bezirg self-assigned this Jul 26, 2024
@bezirg bezirg marked this pull request as ready for review July 26, 2024 15:24
Copy link
Contributor

@effectfully effectfully left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good.

Copy link
Member

@zliu41 zliu41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable. Regarding costing, make sure to follow the process outlined in #5874.

modularExponentiation b e m =
case integerPowMod# b e m of
(# n | #) -> pure n
(# | () #) -> fail "modularExponentiation: failure"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@effectfully is it possible for a builtin function to return Maybe, or only builtin types can be returned?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on what we want Nothing to mean. If we want it to fail evaluation, then we can do that. If we want Nothing to become some kind of UPLC value, then no, unless we add Maybe to the universe or treat [] as Nothing (and [a] as Just x) or teach builtins about SOPs or...

I'd be against making Nothing mean failure, not only because it'd be weird to treat a common Haskell value as a UPLC exception, but also because every failure must come with an error message and Maybe doesn't have that functionality. So yeah, wanna fail a builtin -- you have to use BuiltinResult as your monad.

@@ -110,9 +109,7 @@ instance Semigroup PluginRecompile where

instance Monoid PluginRecompile where
mempty = NoForceRecompile
#if __GLASGOW_HASKELL__ < 840
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me what the GHC stub stuff is for?

Copy link
Contributor Author

@bezirg bezirg Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also dnt know what the ghc stub is for. It seems to be sth for cross-compiling.

I did this change since the number 840 points to ghc 8.40.X which is not a ghc version ever released. Most likely the original author wanted to say 804 which is ghc 8.4.X. Further, since we don't even support ghc<8.4 , I thought the check is redundant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After talking to @hamishmack , ghc stub was used for ghcjs cross-compilation. I don't think we need it anymore (except for windows plutustx dev perhaps)

Copy link
Contributor

@kwxm kwxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks OK, although I'd be inclined to get rid of Natural and maybe give the function a less cumbersome name.

plutus-core/plutus-core/src/PlutusCore/Default/Builtins.hs Outdated Show resolved Hide resolved
@@ -31,6 +31,7 @@ import Data.Word
import GHC.Exts (Int (I#))
import GHC.Integer
import GHC.Integer.Logarithms
import GHC.Natural
Copy link
Contributor

@kwxm kwxm Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm dubious about introducing Natural into the types here. We already have a number of builtins that require inputs in a particular range: see here, here, and here for example. Just doing a similar check for the modulus would seem simpler than introducing a new type into the machinery along with the extra code to deal with it in various places that we'll also have to maintain.

Furthermore I think we should probably restrict the modulus/base (ie, the final argument) to be at least 2. If we use Natural to enforce bounds on the base then 0 and 1 are both allowed, and they're kind of edge cases which aren't very meaningful and which may depend on the implementation. For example, GHC's integerPowMod# appears to fail for modulus 0, but cryptonite's expFast/expSafe always returns 0 (although neither library documents their behaviour for modulus 0 or 1). There's also a good mathematical case to be made that exponentiation modulo 0 should be the usual exponentiation in ℤ. To improve stability we should probably avoid these ambiguous cases by requiring modulus >= 2, but then if we do that there's little point in using Natural in the first place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding and maintaining a bunch of code for Natural is perfectly reasonable and I wanted to do that long before this PR, I just didn't care enough about it. That being said, after reading Kenneth's reasoning I agree that we should probably pass the Integer in and check that it's >= 1 at least. Why >= 2 BTW? mod 1 seems to be uncontroversial?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why >= 2 BTW? mod 1 seems to be uncontroversial?

Well mod 1 is identically zero (if you divide something by 1 then you have a remainder of 0), so it's kind of pointless. It seems like the kind of thing where an implementer might choose to raise an error or something instead, so you might get unexpected behaviour.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding and maintaining a bunch of code for Natural is perfectly reasonable

I'm scared that we might end up with a natural type in PLC and then we'd have to duplicate all of the arithmetic and comparison functions...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well mod 1 is identically zero (if you divide something by 1 then you have a remainder of 0), so it's kind of pointless.

Sure, but somebody may have some general code and they may rely on x mod 1 being 0. It's like saying that Semigroup for () is kind of pointless: sure it is, but you still want to have it in case it gets triggered in some general code, except in case of mod it's much more important, since you'll get a runtime error for some perfectly reasonable code, not even a type one.

So I'd be much in favor of doing reasonable things regardless of how useless they are, because somebody may rely on reasonable things working.

I'm scared that we might end up with a natural type in PLC and then we'd have to duplicate all of the arithmetic and comparison functions...

No, I don't think that would make any sense and I'd be against doing it. However, Natural does fit very well in the Universe module where we also have all the IntN and WordN types, which we don't add to the universe either.

I'm not saying we should add handling of Natural to the builtins machinery, let alone in this PR (as I agree with your reasoning on the 0 edge case), I'm only saying that I don't think that would be problematic or have high maintenance cost.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kwxm I do not agree about (2) : not allowing mod=1 because it is trivial, similar to what @effectfully said that we allow div 1 and mod 1

Copy link
Contributor Author

@bezirg bezirg Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About what is the proper mathematical definition of modular exponentiation, I am not a mathematician so I am following what was described in CIP-109: https://developers.cardano.org/docs/governance/cardano-improvement-proposals/cip-0109/ . Maybe @perturbing can comment about what mathematical definition we actually want here.

I can only comment about the specific interfaces/implementations we have in hand.

There are 2 fast interfaces in Haskell (labelled here as OLD and NEW) that provide some kind of "modular exponentation" operation.
Both of these interfaces utilise underneath the gmp library.

OLD (powModInteger)

powModInteger :: Integer -> Integer -> Integer -> Integer

As you can see this has everywhere Integers, which does not make complete sense for the 3rd argument and the result;
it would be better to type them as Naturals.

There are 2 different behaviors of the powModInteger , using the same interface.

Behavior1 of powModInteger with GHC<=8.10:

-- mod: 0
ghci810> powModInteger 1 1 0
ghc: libraries/integer-gmp/cbits/wrappers.c:671: integer_gmp_powm1: Assertion `m0' failed.
Aborted (core dumped)

-- negative exponent with non-prime mod
ghci> powModInteger 2 (-5) 4
Floating point exception (core dumped)

As you can see this powModInteger+<=ghc8.10 is very dangerous: it fails catastrophically without a way
to catch the exception.
If we want to use this interface, we would then need to validate before-hand the
input ourselves before calling the function. Specifically, I think we would need
to validate that the mod arg is a prime number (which further implies >=0).

Behavior2 of powModInteger with GHC>=9:

-- mod: 0
ghci810> powModInteger 1 1 0
In the use of ‘powModInteger’
(imported from GHC.Integer.GMP.Internals):
Deprecated: "Use integerPowMod# instead"
0

-- negative exponent with non-prime mod
ghci> powModInteger 2 (-5) 4
In the use of ‘powModInteger’
(imported from GHC.Integer.GMP.Internals):
Deprecated: "Use integerPowMod# instead"
0

Well, how come this work? Well, this powModInteger (deprecated)
has changed implementations and points
now underneath to the new implementation (the implementation used also by the NEW interface). Instead of failing
catastrophically, it instead maps all the
those values that don't have a modular-exponentation result to 0 (arbitrarily):

{-# DEPRECATED powModInteger "Use integerPowMod# instead" #-}
powModInteger :: Integer -> Integer -> Integer -> Integer
powModInteger b e m = case I.integerPowMod# b e (I.integerToNatural m) of
   (# r  | #) -> I.integerFromNatural r
   (# | () #) -> 0

I am not a mathematician, but returning 0 for all the results that cannot be computed
does not sound right to me. But there might be some reasoning behind it.

NEW (integerPowMod#)

Is provided only for GHC>=9.

integerPowMod# :: Integer -> Integer -> Natural -> (# Natural | () #)

As you can see it types the 3rd mod-arg and the result better.
Second, it never fails catastrophically, even when mod-arg == 0 :: Natural.
So we don't have to do sth like nonZeroSecondArg that we do for our other builtins.

ghci9> integerPowMod# 1 1 0
(# | () #)
ghci9> integerPowMod 2 (-5) 4
(# | () #)

What about Cryptonite's expFast function?

It is just an alias to the OLD powModInteger interface described above.
Which means that cryptonite+GHC<=8.10 fails catastrophically,
whereas cryptonite+GHC>=9 returns 0.

To complicate things, cryptonite will fallback to a pure-Haskell implementation
when it cannot find gmp lib. Which might lead to 2 different behaviors
even under the same GHC version:

-- ghc9,WITH-gmp,cryptonite-0.30
> expFast 2 (-5) 4
0
-- ghc9,WITHOUT-gmp,cryptonite-0.30
> expFast 2 (-5) 4
*** Exception: stack overflow

What about Cryptonite's expSafe function?

This is an adaptation that avoids a side-channel attack on the gmp implementation.
I don't think we need it, but if you want to know more about it / discuss it , we can do that.

TLDR:

IMO: powModInteger is unusable, cryptonite is messy, and the only sane choice
is integerPowMod# , which unfortunately requires GHC>=9.

My idea was to kind of manually "backport" what integerPowMod# does to GHC8.10
iff deemed necessary, since we might drop support for GHC8.10 soon.

Copy link
Contributor Author

@bezirg bezirg Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, I don't know what should be the proper result when mod-arg is 0, but for reference, here is the behaviour of modular exponentation in python3's stdlib:

$python
Python 3.12.4 (main, Jun  7 2024, 06:33:07) [GCC 14.1.1 20240522] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> help(pow)
pow(base, exp, mod=None)
    Equivalent to base**exp with 2 arguments or base**exp % mod with 3 arguments

    Some types, such as ints, are able to use a more efficient algorithm when
    invoked using the three argument form.
>>> pow(1,1,0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: pow() 3rd argument cannot be 0
>>> pow(2,-5,4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: base is not invertible for the given modulus
>>> pow(1,1,1)
0

, which matches the behavior of integerPowMod# by mapping every raised python error to Haskell's ()

Copy link
Contributor Author

@bezirg bezirg Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CIP-109 https://developers.cardano.org/docs/governance/cardano-improvement-proposals/cip-0109/ says about this:

$b$ is an integer, the exponent $e$ a non-negative integer and the modulus $m$ a positive integer.
[...]
We propose to also include this extension [negative exponent e ]to the plutus built-in, for optimized inversion when this is possible. This is inversion is not guaranteed to exist for all numbers $b$, only when the modulus $m$ is a prime number this is guaranteed.

We also propose that the built-in fails if this inverse does not exist, and if a modulus is provided that is smaller than one.

So from what I understand from the CIP-109, the interface of the builtin looks like this:

  • base : integer
  • exponent : integer
  • modulo : integer
  • result: integer

If the modulo is <= 0 , fail.
If the exponent is < 0 , SOMETIMES fail.

So I think the above expected results are matched by integerPowMod# and not by powModInteger, nor cryptonite as a whole.

Copy link
Contributor Author

@bezirg bezirg Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kwxm @effectfully

After discussing with @effectfully , I should have made it more explicit:

integerPowMod# returns () when the mod-arg is 0
=>
which makes this PR's builtin implementation to fail, as expected by CIP-109.

nonZeroThirdArg --- same in spirit as nonZeroSecondArg --- is not necessary in this case.

plutus-core/plutus-core/src/PlutusCore/Crypto/ModExp.hs Outdated Show resolved Hide resolved
plutus-tx/src/PlutusTx/Builtins/Internal.hs Outdated Show resolved Hide resolved
Copy link
Contributor

@effectfully effectfully left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to withdraw my approval after reading Kenneth's comments, I think he has great points. I don't disapprove the PR either though.

Copy link
Contributor

@effectfully effectfully left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nikos very helpfully explained to me that the builtin already behaves like I want it to. Conversion of the last argument to Natural filters out negative integers and integerPowMod# returns an explicit failure when the last argument is 0. So by my standards it's all good, I'd only like to see unit tests demonstrating all this behavior.

@bezirg could you please add unit tests to Evaluation.Builtins.Definition similar to all the other unit tests that we have there (like those in test_Integer, test_Crypto etc)?

@bezirg
Copy link
Contributor Author

bezirg commented Jul 29, 2024

Nikos very helpfully explained to me that the builtin already behaves like I want it to. Conversion of the last argument to Natural filters out negative integers and integerPowMod# returns an explicit failure when the last argument is 0. So by my standards it's all good, I'd only like to see unit tests demonstrating all this behavior.

@bezirg could you please add unit tests to Evaluation.Builtins.Definition similar to all the other unit tests that we have there (like those in test_Integer, test_Crypto etc)?

I was planning to add tests in another PR (to keep the PRs smaller), but I regret it;
I should have done it in this PR, then the behavior of the builtin would be shown.

I will add the tests here, and I will ask for a second round of review. Thank you all for your feedback!

@kwxm kwxm disabled auto-merge July 30, 2024 03:08
@kwxm
Copy link
Contributor

kwxm commented Jul 30, 2024

Nikos very helpfully explained to me that the builtin already behaves like I want it to. Conversion of the last argument to Natural filters out negative integers and integerPowMod# returns an explicit failure when the last argument is 0. So by my standards it's all good, I'd only like to see unit tests demonstrating all this behavior.

I'd prefer to put an explicit check for modulus > 0 in the builtin. integerPowMod# does this itself (although you only discover that by looking at its source code), but let's be sure in case something changes. I've looked at a number of implementations in different languages and libraries and so far I've found three different behaviours when the modulus is zero. There's no commonly-accepted agreement on what the behaviour should be, and none of the things I've looked at actually document what the behaviour is: you have to use trial and error or examine the code to find out. This is really fragile, so let's just ban it outright! I think we should accept m=1 always giving you a^b mod m = 0 since that seems to be pretty universal , but we should still add a test.

I'm still a bit sceptical about using Natural to enforce bounds for us. We already have multiple builtins that do their own bound checks, but now we have two ways of checking n>=0, which could be confusing. Maybe we should use Natural everywhere we do that though.

@effectfully
Copy link
Contributor

I'd prefer to put an explicit check for modulus > 0 in the builtin.

Just to be clear, I don't mind that at all. Only saying that for me both the options work, so if you feel like we should really make it an explicit check, we can invoke the "whoever has the strongest opinion wins" clause.

This is really fragile

I don't think it is! We should cover all edge cases with tests. We can even make both unit and golden tests just to be dead sure that the behavior isn't gonna change. And I don't feel like making the check explicit will contribute much towards stability of behavior, given proper testing.

I think we should accept m=1 always giving you a^b mod m = 0 since that seems to be pretty universal , but we should still add a test.

👍

I'm still a bit sceptical about using Natural to enforce bounds for us. We already have multiple builtins that do their own bound checks, but now we have two ways of checking n>=0, which could be confusing. Maybe we should use Natural everywhere we do that though.

The underlying primitive base function has Natural in its type, so we just conveniently wrap it directly instead of checking the bounds manually and converting the Integer argument manually to Natural in order to pass it to the function. Less boilerplate and it's still pretty clear what's going on, since we do the same for Int64, Word8 etc etc. Again, no strong inclinations from me, but I feel like using Natural is pretty reasonable and follows existing conventions.

I do agree that there are two ways to check bounds (explicitly and implicitly) and it may be confusing, but both are used extensively, are we going to implement bounds checking manually for all the various integral types just so that all bounds checking is done explicitly?

@bezirg bezirg force-pushed the bezirg/modexp branch 6 times, most recently from 13bbfb4 to bb81861 Compare August 1, 2024 13:09
Comment on lines +24 to +31
-- FIXME: fugly stub implementation to make the various test-suites/CI pass for GHC8.10.
-- This means that we cannot provide random testing for expMod at the moment.
expMod _ _ 0 = fail "Cannot divide by zero"
expMod 500 0 500 = pure 1
expMod 500 5 500 = pure 0
expMod 1 (-3) 4 = pure 1
expMod 2 (-3) 3 = pure 2
expMod 4 (-5) 9 = pure 4
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super ugly hack to make things happy for GHC8.10 , without disabling building for ghc8.10 for now.

@bezirg
Copy link
Contributor Author

bezirg commented Aug 1, 2024

I added some unit tests. I didn't add random testing of the builtin because of the stub code i have for ghc8.10

@bezirg bezirg requested review from kwxm and effectfully August 1, 2024 13:12
@bezirg bezirg force-pushed the bezirg/modexp branch 5 times, most recently from 6923430 to da5ebc4 Compare August 12, 2024 09:10
@bezirg bezirg force-pushed the bezirg/modexp branch 8 times, most recently from e1c9cda to 8e77b21 Compare August 14, 2024 19:49
@bezirg bezirg enabled auto-merge (squash) August 14, 2024 19:52
@bezirg bezirg disabled auto-merge August 14, 2024 19:52
@bezirg bezirg enabled auto-merge (squash) August 14, 2024 19:54
@bezirg bezirg merged commit cf5dfb6 into master Aug 14, 2024
6 checks passed
@bezirg bezirg deleted the bezirg/modexp branch August 14, 2024 20:37
test_ExpModInteger

test_ExpModInteger :: TestNested
test_ExpModInteger = testNestedM "ExpMod" $ do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not objecting to these tests (more tests are better!) but maybe there's a discussion to be had about whether we need unit tests like this (which we have for lots of other builtins as well) when we also have the conformance tests. I suppose these get run more often than the conformance tests though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants