Port ZArith from Coq for performance #3976

sbp · 2017-08-04T15:17:42Z

Using Nat for arithmetic on large numbers in Idris causes significant performance degradation when type checking, compiling, evaluating, and executing code. Consider the example of taking the remainder of dividing the current unixtime by the number of seconds in a day:

main : IO ()
main = do printLn (show (modNatNZ 1501857320 (S 86399) SIsNotZ))

This file takes so long to type check or to compile that the duration would have to be extrapolated from smaller examples. Instead, we can use %freeze to lift the slow terms to the top level:

n1501857320 : Nat
n1501857320 = 1501857320

n86399 : Nat
n86399 = 86399

%freeze n1501857320
%freeze n86399

main : IO ()
main = do printLn (show (modNatNZ n1501857320 (S n86399) SIsNotZ))

This file is reasonably fast to type check and to compile, but now the runtime takes so long to execute that once again the actual duration could only be extrapolated from smaller examples. The code is also extremely ugly.

Instead, it would be useful if Idris provided an implementation of ZArith from Coq. An element of ZArith is already mentioned several times in the Idris literature:

There's also a different element of it in Data.ZZ. This issue proposes that all of ZArith be ported to Idris for its performance gains.

This is somewhat related to Issue #3516, The cost of computing Nat equality proofs at type check time, but doesn't solve that issue directly since here a new type is proposed to sidestep the issue altogether. Indeed, @edwinb suggested an orthogonal solution there:

You know, I think I'm going to take back that comment about "little point in hard coding things for Nat" because realistically that's the biggest problem we're going to encounter at compile time, and given that we say that Nat is for unbounded unsigned things, we probably ought to be a bit cleverer about it.

Proof of concept

To see what performance gains may be made, I created a small but usable port of a subset of ZArith to Idris, called Bi:

https://github.com/sbp/idris-bi

This code is based on Coq.NArith.BinPos and ZArith.Zdiv. The tutorial in VFA explains why such arithmetic is necessary and how it is constructed for anybody missing the background. The main types are as follows:

data Bip = U | O Bip | I Bip
data Bin = BZero | BPos Bip
data Bi = BNat Bin | BNeg Bip

This is slightly different from Coq, which has a single Z case covering negative, zero, and positive numbers instead of both Bin and Bi. The extended three-level tower is probably more useful.

We can now rewrite the poorly-performing example above as:

import Bi

main : IO ()
main = do printLn (show (the Integer (cast (binMod 1501857320 86400))))

Which compiles in reasonable time:

$ time idris Performance.idr -o Performance
idris Performance.idr -o Performance
   5.09s user 0.48s system 108% cpu 5.147 total

And runs with blazing speed:

$ time ./Performance
"52520"
./Performance
   0.00s user 0.00s system 40% cpu 0.013 total

None of the Coq proofs have been ported, so the port may contain bugs. As a proof of concept, however, it shows that if Idris used ZArith like Coq then there would be significant performance gains at all stages of code development involving large numbers.

I asked on IRC before submitting this issue whether there was any prior art or other reason not to submit, but received no reply. I am sbp on #idris on freenode if anybody wants to discuss this issue there.

The text was updated successfully, but these errors were encountered:

sbp · 2017-08-04T18:24:58Z

Coq does have rough equivalents of Bip, Bin, and Bi called, respectivelly, positive, N, and Z per Coq.Numbers.BinNums. The way that Z is structured is different however: it's defined only in terms of positive, not in terms of N.

sbp · 2017-08-04T20:34:27Z

@ahmadsalim suggested on IRC that the most likely way that this feature request would be accepted is if I were to submit my own PR. I would be happy to do that, as long as I am able to bring idris-bi to some reasonable level of completion. I am already working on making the interface more consistent with Coq, and documenting what work remains to be done. The datatypes will now be:

data Bip = U | O Bip | I Bip
data Bin = BinZ | BinP Bip
data Biz = BizZ | BizP Bip | BizM Bip

These use the -P suffix for Plus and -M for Minus, to avoid confusion between N for Negative and N for Natural (i.e. ℕ). I was also going to use N for Nil instead of Z for Zero, but the likelihood of visual minim confusion between N and M was too great so I felt that it was probably better to stick with Z despite the possible confusion with ℤ.

ahmadsalim · 2017-10-23T15:28:03Z

I guess this is taken care of in idris-bi, right? 😄

I will close the issue here then.

ahmadsalim added A-Paper Cut C-Moderate Effort U-Library Design Z-Feature Request labels Aug 5, 2017

ahmadsalim closed this as completed Oct 23, 2017

steshaw mentioned this issue Jan 17, 2018

List construction works very slow in REPL if I don't explicitly define its type as the (List Int) #4290

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Port ZArith from Coq for performance #3976

Port ZArith from Coq for performance #3976

sbp commented Aug 4, 2017 •

edited

Loading

sbp commented Aug 4, 2017

sbp commented Aug 4, 2017

ahmadsalim commented Oct 23, 2017

Port ZArith from Coq for performance #3976

Port ZArith from Coq for performance #3976

Comments

sbp commented Aug 4, 2017 • edited Loading

Proof of concept

sbp commented Aug 4, 2017

sbp commented Aug 4, 2017

ahmadsalim commented Oct 23, 2017

sbp commented Aug 4, 2017 •

edited

Loading