Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster constant-time division #643

Merged
merged 9 commits into from
Aug 18, 2024

Conversation

andrewwhitehead
Copy link
Contributor

@andrewwhitehead andrewwhitehead commented Aug 7, 2024

This replaces the div_rem method for Uint and BoxedUint with one based on the updated vartime division method.

Unlike the old method, this one panics if NonZero(0) is given as the divisor. The sqrt method relied on the old behavior and is updated.

The Default implementation of NonZero was also producing invalid NonZero(0) values, which affected the NonZero::map usage in checked_div. The trait implementation is updated to return Self::ONE as the default.

Some relevant benchmarks:

wrapping ops/div/rem, U256/U128, full size
                        time:   [312.41 ns 314.29 ns 317.18 ns]
                        change: [-81.412% -79.729% -78.416%] (p = 0.00 < 0.05)
                        Performance has improved.
wrapping ops/rem, U256/U128, full size
                        time:   [310.37 ns 312.08 ns 315.15 ns]
                        change: [-78.497% -78.170% -77.896%] (p = 0.00 < 0.05)
                        Performance has improved.
sqrt/sqrt, U256         time:   [3.1401 µs 3.1769 µs 3.2335 µs]
                        change: [-78.636% -78.245% -77.823%] (p = 0.00 < 0.05)
                        Performance has improved.
wrapping ops/boxed_div_rem
                        time:   [7.6462 µs 7.6800 µs 7.7228 µs]
                        change: [-99.614% -99.609% -99.605%] (p = 0.00 < 0.05)
                        Performance has improved.
wrapping ops/boxed_rem  time:   [7.6416 µs 7.7808 µs 7.9917 µs]
                        change: [-99.615% -99.608% -99.601%] (p = 0.00 < 0.05)
                        Performance has improved.
boxed_sqrt/boxed_sqrt, 4096
                        time:   [110.47 µs 111.58 µs 113.16 µs]
                        change: [-99.601% -99.593% -99.586%] (p = 0.00 < 0.05)
                        Performance has improved.

Signed-off-by: Andrew Whitehead <cywolf@gmail.com>
Signed-off-by: Andrew Whitehead <cywolf@gmail.com>
Signed-off-by: Andrew Whitehead <cywolf@gmail.com>
Signed-off-by: Andrew Whitehead <cywolf@gmail.com>
Signed-off-by: Andrew Whitehead <cywolf@gmail.com>
Signed-off-by: Andrew Whitehead <cywolf@gmail.com>
Signed-off-by: Andrew Whitehead <cywolf@gmail.com>
Signed-off-by: Andrew Whitehead <cywolf@gmail.com>
@andrewwhitehead andrewwhitehead force-pushed the feat/fast-const-div branch 2 times, most recently from 5811643 to 858ffff Compare August 16, 2024 00:19
Signed-off-by: Andrew Whitehead <cywolf@gmail.com>
@tarcieri tarcieri merged commit 8520fdb into RustCrypto:master Aug 18, 2024
18 checks passed
@andrewwhitehead
Copy link
Contributor Author

I saw one dudect failure after this was merged which had me concerned, but it seems to be passing since then? I might try to run it on an Intel machine today.

@andrewwhitehead andrewwhitehead deleted the feat/fast-const-div branch August 20, 2024 16:39
@tarcieri tarcieri mentioned this pull request Jan 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants