Add small precision fast-path to Quo #98

aaronc · 2020-11-02T19:13:55Z

I've been doing some benchmarks to compare APD's performance vs other approaches we're considering (cosmos/cosmos-sdk#7772).

It seems that Quo operations perform pretty poorly compare to Add, Mul, etc.:

Benchmark/1_Ops:_apd.Decimal128.Add
Benchmark/1_Ops:_apd.Decimal128.Add-16         	 5285344	       218 ns/op
Benchmark/1_Ops:_apd.Decimal128.Sub
Benchmark/1_Ops:_apd.Decimal128.Sub-16         	 5665453	       208 ns/op
Benchmark/1_Ops:_apd.Decimal128.Mul
Benchmark/1_Ops:_apd.Decimal128.Mul-16         	 9138602	       138 ns/op
Benchmark/1_Ops:_apd.Decimal128.Quo
Benchmark/1_Ops:_apd.Decimal128.Quo-16         	  162537	      7790 ns/op

This benchmark is basically just perform a single operation on two random floating point numbers (different ones each round). I'm using a Context configured with decimal128 parameters (i.e. Precision = 34, etc.)

Is Quo inherently expected to be 40x slower than other operations or could there be a performance bottleneck?

The text was updated successfully, but these errors were encountered:

aaronc · 2020-11-03T17:08:02Z

So looking at the Quo code, there are a lot of for loops. It looks like it's trying to do things digit by digit. Is this algorithm designed around cases with very large precision? What if there were a fast path for precision <= 34 (decimal128, etc.)?

I benchmarked some decimal128 code based on https://software.intel.com/content/www/us/en/develop/articles/intel-decimal-floating-point-math-library.html by the way and division is comparable to multiplication there. So it must be possible for apd to have a more efficient Quo implementation for "low" precision.

nvanbenschoten · 2022-01-29T20:00:43Z

Hi @aaronc, thanks for the report. Quo performance has improved recently (it's about 50% faster on your benchmark), but it still lags the performance of other operations by quite a margin. As you identified, this has to do with the digit-by-digit loop in Quo. The algorithm is designed around cases with very large precision and does not include any form of fast-path for low precision values. I'd like to keep this issue open to track the introduction of such a fast path.

Fixes cockroachdb#98. This commit speeds up `Context.Quo`, replacing its per-digit long division with a call to `BigInt.QuoRem`. This avoids per-digit iteration. It also allows `Context.Quo` to benefit further from the inline fast-path of `BigInt.QuoRem` for small coefficient values that fit in a uint64. This is a partial revert of 1eddda3, at least in spirit. Before that commit, `Context.Quo` did try to use `big.Int.Quo`. Unfortunately, it was inaccurate for certain values because it was not scaling the coefficients correctly before dividing. It tried to compensate for this by using a very large precision (`c.Precision*2 + 8`), but this was insufficient on its own because it was not taking the size of the values into account. So the commit abandoned that approach. However, that commit, which was based on the description given on the GDA site, did demonstrate how to scale the coefficients in a way that would permit this kind of division. Specifically, it began adjusting the operands so that the coefficient of the dividend was greater than or equal to the coefficient of the divisor and was also less than ten times the coefficient of the divisor. This commit uses the coefficient adjustment introduced in 1eddda3 to revive the call into `BigInt.QuoRem`. With the two operands adjusted, it now is possible to scale the coefficient of the dividend by the desired precision and then perform a direct division on the operands.

aaronc mentioned this issue Nov 2, 2020

DNM: Decimal error measurements and benchmarks cosmos/cosmos-sdk#7772

Closed

9 tasks

aaronc mentioned this issue Jun 29, 2021

Is this maintained? #100

Closed

nvanbenschoten changed the title ~~Poor Quo performance~~ Add small precision fast-path to Quo Jan 29, 2022

nvanbenschoten mentioned this issue Jan 31, 2022

apd: optimize Quo, remove per-digit long division #114

Merged

nvanbenschoten closed this as completed in #114 Jan 31, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add small precision fast-path to Quo #98

Add small precision fast-path to Quo #98

aaronc commented Nov 2, 2020 •

edited

Loading

aaronc commented Nov 3, 2020

nvanbenschoten commented Jan 29, 2022

Add small precision fast-path to Quo #98

Add small precision fast-path to Quo #98

Comments

aaronc commented Nov 2, 2020 • edited Loading

aaronc commented Nov 3, 2020

nvanbenschoten commented Jan 29, 2022

aaronc commented Nov 2, 2020 •

edited

Loading