Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port ZArith from Coq for performance #3976

Closed
sbp opened this issue Aug 4, 2017 · 3 comments
Closed

Port ZArith from Coq for performance #3976

sbp opened this issue Aug 4, 2017 · 3 comments

Comments

@sbp
Copy link

sbp commented Aug 4, 2017

Using Nat for arithmetic on large numbers in Idris causes significant performance degradation when type checking, compiling, evaluating, and executing code. Consider the example of taking the remainder of dividing the current unixtime by the number of seconds in a day:

main : IO ()
main = do printLn (show (modNatNZ 1501857320 (S 86399) SIsNotZ))

This file takes so long to type check or to compile that the duration would have to be extrapolated from smaller examples. Instead, we can use %freeze to lift the slow terms to the top level:

n1501857320 : Nat
n1501857320 = 1501857320

n86399 : Nat
n86399 = 86399

%freeze n1501857320
%freeze n86399

main : IO ()
main = do printLn (show (modNatNZ n1501857320 (S n86399) SIsNotZ))

This file is reasonably fast to type check and to compile, but now the runtime takes so long to execute that once again the actual duration could only be extrapolated from smaller examples. The code is also extremely ugly.

Instead, it would be useful if Idris provided an implementation of ZArith from Coq. An element of ZArith is already mentioned several times in the Idris literature:

There's also a different element of it in Data.ZZ. This issue proposes that all of ZArith be ported to Idris for its performance gains.

This is somewhat related to Issue #3516, The cost of computing Nat equality proofs at type check time, but doesn't solve that issue directly since here a new type is proposed to sidestep the issue altogether. Indeed, @edwinb suggested an orthogonal solution there:

You know, I think I'm going to take back that comment about "little point in hard coding things for Nat" because realistically that's the biggest problem we're going to encounter at compile time, and given that we say that Nat is for unbounded unsigned things, we probably ought to be a bit cleverer about it.

Proof of concept

To see what performance gains may be made, I created a small but usable port of a subset of ZArith to Idris, called Bi:

https://github.com/sbp/idris-bi

This code is based on Coq.NArith.BinPos and ZArith.Zdiv. The tutorial in VFA explains why such arithmetic is necessary and how it is constructed for anybody missing the background. The main types are as follows:

data Bip = U | O Bip | I Bip
data Bin = BZero | BPos Bip
data Bi = BNat Bin | BNeg Bip

This is slightly different from Coq, which has a single Z case covering negative, zero, and positive numbers instead of both Bin and Bi. The extended three-level tower is probably more useful.

We can now rewrite the poorly-performing example above as:

import Bi

main : IO ()
main = do printLn (show (the Integer (cast (binMod 1501857320 86400))))

Which compiles in reasonable time:

$ time idris Performance.idr -o Performance
idris Performance.idr -o Performance
   5.09s user 0.48s system 108% cpu 5.147 total

And runs with blazing speed:

$ time ./Performance
"52520"
./Performance
   0.00s user 0.00s system 40% cpu 0.013 total

None of the Coq proofs have been ported, so the port may contain bugs. As a proof of concept, however, it shows that if Idris used ZArith like Coq then there would be significant performance gains at all stages of code development involving large numbers.

I asked on IRC before submitting this issue whether there was any prior art or other reason not to submit, but received no reply. I am sbp on #idris on freenode if anybody wants to discuss this issue there.

@sbp
Copy link
Author

sbp commented Aug 4, 2017

Coq does have rough equivalents of Bip, Bin, and Bi called, respectivelly, positive, N, and Z per Coq.Numbers.BinNums. The way that Z is structured is different however: it's defined only in terms of positive, not in terms of N.

@sbp
Copy link
Author

sbp commented Aug 4, 2017

@ahmadsalim suggested on IRC that the most likely way that this feature request would be accepted is if I were to submit my own PR. I would be happy to do that, as long as I am able to bring idris-bi to some reasonable level of completion. I am already working on making the interface more consistent with Coq, and documenting what work remains to be done. The datatypes will now be:

data Bip = U | O Bip | I Bip
data Bin = BinZ | BinP Bip
data Biz = BizZ | BizP Bip | BizM Bip

These use the -P suffix for Plus and -M for Minus, to avoid confusion between N for Negative and N for Natural (i.e. ℕ). I was also going to use N for Nil instead of Z for Zero, but the likelihood of visual minim confusion between N and M was too great so I felt that it was probably better to stick with Z despite the possible confusion with ℤ.

@ahmadsalim
Copy link

I guess this is taken care of in idris-bi, right? 😄

I will close the issue here then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants