254-bit Prime Field Arithmetic Benchmark on GPU with CUDA Backend
$ DEVICE=gpu make cuda && ./run
Benchmark running on Tesla V100-SXM2-16GB
Addition on F(21888242871839275222246405745257275088548364400416034343698204186575808495617)
dimension      iterations   total            per op          ops/sec
128 x 128      1024         2234855 ns       0.133208 ns     7.50707e+09
256 x 256      1024         613577 ns        0.00914301 ns   1.09373e+11
512 x 512      1024         2278511 ns       0.00848811 ns   1.17812e+11
1024 x 1024    1024         8674787 ns       0.00807902 ns   1.23777e+11
Subtraction on F(21888242871839275222246405745257275088548364400416034343698204186575808495617)
dimension      iterations   total            per op          ops/sec
128 x 128      1024         229056 ns        0.0136528 ns    7.3245e+10
256 x 256      1024         691766 ns        0.0103081 ns    9.70109e+10
512 x 512      1024         2566303 ns       0.00956022 ns   1.046e+11
1024 x 1024    1024         10098982 ns      0.00940541 ns   1.06322e+11
Multiplication on F(21888242871839275222246405745257275088548364400416034343698204186575808495617)
dimension      iterations   total            per op          ops/sec
128 x 128      1024         35480343 ns      2.11479 ns      4.72859e+08
256 x 256      1024         168584024 ns     2.5121 ns       3.98074e+08
512 x 512      1024         1002255496 ns    3.73369 ns      2.67831e+08
1024 x 1024    1024         4120332867 ns    3.83736 ns      2.60596e+08
Division on F(21888242871839275222246405745257275088548364400416034343698204186575808495617)
dimension      iterations   total            per op          ops/sec
128 x 128      1024         197345977 ns     11.7627 ns      8.50142e+07
256 x 256      1024         703453235 ns     10.4823 ns      9.53992e+07
512 x 512      1024         2065334020 ns    7.69397 ns      1.29972e+08
1024 x 1024    1024         7218084706 ns    6.72237 ns      1.48757e+08
Inversion on F(21888242871839275222246405745257275088548364400416034343698204186575808495617)
dimension      iterations   total            per op          ops/sec
128 x 128      1024         174981181 ns     10.4297 ns      9.58801e+07
256 x 256      1024         451331865 ns     6.72537 ns      1.48691e+08
512 x 512      1024         1801308585 ns    6.7104 ns       1.49022e+08
1024 x 1024    1024         6754218367 ns    6.29036 ns      1.58974e+08
Exponentiation on F(21888242871839275222246405745257275088548364400416034343698204186575808495617)
dimension      iterations   total            per op          ops/sec
128 x 128      1024         704050844 ns     41.9647 ns      2.38296e+07
256 x 256      1024         2545503329 ns    37.931 ns       2.63637e+07
512 x 512      1024         8911049141 ns    33.1962 ns      3.01239e+07
1024 x 1024    1024         33079934464 ns   30.8081 ns      3.2459e+07
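
The per op and ops/sec columns appear to be derived from the other three. Every row above is consistent with N x N field operations per iteration, so that per op = total / (N * N * iterations) and ops/sec = 1e9 / per op. The following is a minimal Python sketch of that arithmetic, not code from the benchmark itself. The function names are hypothetical, and the N x N work-per-iteration model is an assumption inferred from the reported numbers.

```python
def per_op_ns(n: int, iterations: int, total_ns: float) -> float:
    """Average time of one field operation, in nanoseconds.

    Assumes each iteration performs n * n independent field operations
    (one per element of an n x n grid). This is inferred from the
    reported figures, not confirmed by the benchmark source.
    """
    return total_ns / (n * n * iterations)


def ops_per_sec(n: int, iterations: int, total_ns: float) -> float:
    """Throughput in field operations per second (1 s = 1e9 ns)."""
    return 1e9 / per_op_ns(n, iterations, total_ns)


if __name__ == "__main__":
    # First Addition row: 128 x 128, 1024 iterations, 2234855 ns total.
    print(f"{per_op_ns(128, 1024, 2234855):.6g} ns")
    print(f"{ops_per_sec(128, 1024, 2234855):.6g} ops/sec")
```

Feeding any other row's dimension, iteration count and total into the same two functions gives a quick way to cross-check its per op and ops/sec entries.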