Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved checked_isqrt and isqrt methods #128166

Merged
merged 2 commits into from
Aug 29, 2024
Merged

Improved checked_isqrt and isqrt methods #128166

merged 2 commits into from
Aug 29, 2024

Conversation

ChaiTRex
Copy link
Contributor

@ChaiTRex ChaiTRex commented Jul 25, 2024

Improved tests of isqrt and checked_isqrt implementations

  • Inputs chosen more thoroughly and systematically.
  • Checks that isqrt and checked_isqrt have equivalent results for signed types, either equivalent numerically or equivalent as a panic and a None.
  • Checks that isqrt has numerically-equivalent results for unsigned types and their NonZero counterparts.

Added benchmarks for isqrt implementations

Greatly sped up checked_isqrt and isqrt methods

  • Uses a lookup table for 8-bit integers and then the Karatsuba square root algorithm for larger integers.
  • Includes optimization hints that give the compiler the exact numeric range of results.

Feature tracking issue

isqrt is an unstable feature tracked at #116226.

Benchmarked improvements

Command used to benchmark

./x bench library/core -- int_sqrt

Before

benchmarks:
    num::int_sqrt::i128::isqrt           439591.65/iter  +/- 6652.70
    num::int_sqrt::i16::isqrt              5302.97/iter   +/- 160.93
    num::int_sqrt::i32::isqrt             62999.11/iter  +/- 2022.05
    num::int_sqrt::i64::isqrt            125248.81/iter  +/- 1674.43
    num::int_sqrt::i8::isqrt                123.56/iter     +/- 1.87
    num::int_sqrt::isize::isqrt          125356.56/iter  +/- 1017.03
    num::int_sqrt::non_zero_u128::isqrt  437443.75/iter  +/- 3535.43
    num::int_sqrt::non_zero_u16::isqrt     8604.58/iter    +/- 94.76
    num::int_sqrt::non_zero_u32::isqrt    62933.33/iter   +/- 517.30
    num::int_sqrt::non_zero_u64::isqrt   125076.38/iter +/- 11340.61
    num::int_sqrt::non_zero_u8::isqrt       221.51/iter     +/- 1.58
    num::int_sqrt::non_zero_usize::isqrt 136005.21/iter  +/- 2020.35
    num::int_sqrt::u128::isqrt           439014.55/iter  +/- 3920.45
    num::int_sqrt::u16::isqrt              8575.08/iter   +/- 148.06
    num::int_sqrt::u32::isqrt             63008.89/iter   +/- 803.67
    num::int_sqrt::u64::isqrt            125088.09/iter   +/- 879.29
    num::int_sqrt::u8::isqrt                230.18/iter     +/- 2.04
    num::int_sqrt::usize::isqrt          125237.51/iter  +/- 4747.83

After

benchmarks:
    num::int_sqrt::i128::isqrt           105184.89/iter +/- 1171.38
    num::int_sqrt::i16::isqrt              1910.26/iter   +/- 78.50
    num::int_sqrt::i32::isqrt             34260.34/iter  +/- 960.84
    num::int_sqrt::i64::isqrt             45939.19/iter +/- 2525.65
    num::int_sqrt::i8::isqrt                 22.87/iter    +/- 0.45
    num::int_sqrt::isize::isqrt           45884.17/iter  +/- 595.49
    num::int_sqrt::non_zero_u128::isqrt  106344.27/iter  +/- 780.99
    num::int_sqrt::non_zero_u16::isqrt     2790.19/iter   +/- 53.43
    num::int_sqrt::non_zero_u32::isqrt    33613.99/iter  +/- 362.96
    num::int_sqrt::non_zero_u64::isqrt    46235.42/iter  +/- 429.69
    num::int_sqrt::non_zero_u8::isqrt        31.78/iter    +/- 0.75
    num::int_sqrt::non_zero_usize::isqrt  46208.75/iter  +/- 375.27
    num::int_sqrt::u128::isqrt           106385.94/iter +/- 1649.95
    num::int_sqrt::u16::isqrt              2747.69/iter   +/- 28.72
    num::int_sqrt::u32::isqrt             33627.09/iter  +/- 475.68
    num::int_sqrt::u64::isqrt             46182.29/iter  +/- 311.16
    num::int_sqrt::u8::isqrt                 33.10/iter    +/- 0.30
    num::int_sqrt::usize::isqrt           46165.00/iter  +/- 388.41

Tracking Issue for {u8,i8,...}::isqrt #116226

try-job: test-various

@rustbot
Copy link
Collaborator

rustbot commented Jul 25, 2024

r? @tgross35

rustbot has assigned @tgross35.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 25, 2024
@rust-log-analyzer

This comment has been minimized.

@tgross35
Copy link
Contributor

I'll take a closer look in a bit but FYI your last commit message has a typo (s/sped/speed)

@ChaiTRex
Copy link
Contributor Author

The last force push added a missing file.

I'll take a closer look in a bit but FYI your last commit message has a typo (s/sped/speed)

OK, thanks for taking a look. Oh, I was using 'sped' as a past tense of the verb form of 'speed'.

Copy link
Contributor

@tgross35 tgross35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a very superficial review, will need to take some time to double check the important parts (I think the proposed deduplication will make this a bit more straightforward). At a high level these changes are awesome, thanks for the PR!

library/core/src/num/int_sqrt.rs Outdated Show resolved Hide resolved
library/core/src/num/int_macros.rs Outdated Show resolved Hide resolved
library/core/src/num/int_sqrt.rs Outdated Show resolved Hide resolved
library/core/benches/num/int_sqrt/mod.rs Outdated Show resolved Hide resolved
library/core/src/num/int_macros.rs Outdated Show resolved Hide resolved
library/core/src/num/int_sqrt.rs Outdated Show resolved Hide resolved
@tgross35
Copy link
Contributor

I'll take a closer look in a bit but FYI your last commit message has a typo (s/sped/speed)

OK, thanks for taking a look. Oh, I was using 'sped' as a past tense of the verb form of 'speed'.

Ah, I see you did the same for the other commits too. Commit messages are usually in imperative mood to describe what the change does, i.e. "Improve test", "Add benchmarks" "Greatly speed up". Hence my confusion :)

(Rust has no commit message policy so no need to anything if you don't feel like it).

@tgross35
Copy link
Contributor

Looks pretty reasonable after a more thorough read through, awesome improvements to the tests. I think this should be good after combining some similar code, just to make things a bit easier to understand & maintain.

If you run benchmarks again, could you post your cpu model for a reference?

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 26, 2024
@ChaiTRex
Copy link
Contributor Author

ChaiTRex commented Aug 1, 2024

If you run benchmarks again, could you post your cpu model for a reference?

I was using aarch64-apple-darwin on a MacBook Air with an M1 processor under macOS 14.5.1 (now 14.6).

After I replaced the benchmarks with equivalents to ilog10's benchmarks (which use an exponential distribution for the random inputs), I get the following:

Benchmarks

Before this pull request

benchmarks:
    num::int_sqrt::u128_sqrt_predictable  108945.82/iter +/- 1673.86
    num::int_sqrt::u128_sqrt_random        16389.17/iter  +/- 231.44
    num::int_sqrt::u128_sqrt_random_small   1290.22/iter   +/- 41.90
    num::int_sqrt::u16_sqrt_predictable     1080.06/iter   +/- 43.44
    num::int_sqrt::u16_sqrt_random          1064.57/iter   +/- 20.79
    num::int_sqrt::u16_sqrt_random_small     579.85/iter   +/- 13.81
    num::int_sqrt::u32_sqrt_predictable     2979.65/iter   +/- 29.79
    num::int_sqrt::u32_sqrt_random          1718.20/iter   +/- 58.86
    num::int_sqrt::u32_sqrt_random_small     480.19/iter   +/- 26.59
    num::int_sqrt::u64_sqrt_predictable    13089.55/iter  +/- 484.72
    num::int_sqrt::u64_sqrt_random          3751.04/iter   +/- 79.66
    num::int_sqrt::u64_sqrt_random_small     498.58/iter    +/- 3.60
    num::int_sqrt::u8_sqrt_predictable       329.07/iter   +/- 12.64
    num::int_sqrt::u8_sqrt_random            576.37/iter   +/- 47.12
    num::int_sqrt::u8_sqrt_random_small      589.01/iter   +/- 22.53

When I first made this pull request

benchmarks:
    num::int_sqrt::u128_sqrt_predictable  30728.33/iter +/- 1093.31
    num::int_sqrt::u128_sqrt_random        4392.30/iter  +/- 149.35
    num::int_sqrt::u128_sqrt_random_small   323.58/iter    +/- 7.26
    num::int_sqrt::u16_sqrt_predictable     304.50/iter   +/- 10.60
    num::int_sqrt::u16_sqrt_random          334.78/iter   +/- 18.02
    num::int_sqrt::u16_sqrt_random_small    139.06/iter    +/- 2.50
    num::int_sqrt::u32_sqrt_predictable    1585.35/iter   +/- 59.03
    num::int_sqrt::u32_sqrt_random          972.40/iter   +/- 38.47
    num::int_sqrt::u32_sqrt_random_small    149.02/iter    +/- 4.50
    num::int_sqrt::u64_sqrt_predictable    5641.28/iter  +/- 136.16
    num::int_sqrt::u64_sqrt_random         1703.97/iter   +/- 43.75
    num::int_sqrt::u64_sqrt_random_small    197.42/iter    +/- 4.03
    num::int_sqrt::u8_sqrt_predictable       42.90/iter    +/- 1.39
    num::int_sqrt::u8_sqrt_random           133.55/iter    +/- 3.92
    num::int_sqrt::u8_sqrt_random_small     133.85/iter    +/- 2.36

With the changes I'm going to push soon

benchmarks:
    num::int_sqrt::u128_sqrt_predictable  26622.63/iter +/- 386.54
    num::int_sqrt::u128_sqrt_random        3584.59/iter +/- 131.42
    num::int_sqrt::u128_sqrt_random_small   904.23/iter  +/- 22.04
    num::int_sqrt::u16_sqrt_predictable     388.01/iter   +/- 7.85
    num::int_sqrt::u16_sqrt_random          480.00/iter  +/- 16.07
    num::int_sqrt::u16_sqrt_random_small    332.05/iter   +/- 7.46
    num::int_sqrt::u32_sqrt_predictable    1317.23/iter  +/- 25.71
    num::int_sqrt::u32_sqrt_random          783.17/iter   +/- 8.53
    num::int_sqrt::u32_sqrt_random_small    415.10/iter  +/- 14.06
    num::int_sqrt::u64_sqrt_predictable    4563.42/iter +/- 129.48
    num::int_sqrt::u64_sqrt_random         1378.32/iter  +/- 60.46
    num::int_sqrt::u64_sqrt_random_small    658.41/iter   +/- 8.93
    num::int_sqrt::u8_sqrt_predictable       42.90/iter   +/- 0.82
    num::int_sqrt::u8_sqrt_random           132.98/iter   +/- 1.53
    num::int_sqrt::u8_sqrt_random_small     132.94/iter   +/- 2.94

There's a tradeoff here. In the push I'm going to make soon, randomly chosen inputs from the whole input type range get a decent speedup from the initial pull request's code, but randomly chosen small inputs get about 3 times slower. Is that a good tradeoff? Is there another benchmark that should be added?

@ChaiTRex

This comment was marked as outdated.

@ChaiTRex
Copy link
Contributor Author

ChaiTRex commented Aug 1, 2024

With a uniform distribution, we get these results:

Uniform distribution benchmarks
original_u8             time:   [9.7782 ns 9.8095 ns 9.8501 ns]
karatsuba_u8            time:   [5.2377 ns 5.2448 ns 5.2531 ns]
karatsuba_2_u8          time:   [5.2304 ns 5.2347 ns 5.2399 ns]

original_u16            time:   [13.415 ns 13.437 ns 13.465 ns]
karatsuba_u16           time:   [6.7425 ns 6.7487 ns 6.7568 ns]
karatsuba_2_u16         time:   [6.3988 ns 6.4035 ns 6.4094 ns]

original_u32            time:   [19.016 ns 19.044 ns 19.078 ns]
karatsuba_u32           time:   [10.162 ns 10.182 ns 10.211 ns]
karatsuba_2_u32         time:   [8.8756 ns 8.8866 ns 8.8993 ns]

original_u64            time:   [44.361 ns 44.413 ns 44.480 ns]
karatsuba_u64           time:   [19.043 ns 19.144 ns 19.276 ns]
karatsuba_2_u64         time:   [16.330 ns 16.359 ns 16.392 ns]

original_u128           time:   [147.14 ns 147.49 ns 147.88 ns]
karatsuba_u128          time:   [51.335 ns 51.455 ns 51.602 ns]
karatsuba_2_u128        time:   [45.447 ns 45.507 ns 45.575 ns]

library/core/src/num/int_sqrt.rs Show resolved Hide resolved
library/core/src/num/int_sqrt.rs Show resolved Hide resolved
library/core/src/num/int_macros.rs Outdated Show resolved Hide resolved
library/core/src/num/nonzero.rs Show resolved Hide resolved
library/core/src/num/nonzero.rs Show resolved Hide resolved
library/core/src/num/int_sqrt.rs Outdated Show resolved Hide resolved
library/core/src/num/int_sqrt.rs Show resolved Hide resolved
library/core/src/num/int_sqrt.rs Outdated Show resolved Hide resolved
library/core/src/num/int_macros.rs Outdated Show resolved Hide resolved
library/core/src/num/int_sqrt.rs Outdated Show resolved Hide resolved
@tgross35
Copy link
Contributor

Thanks for the changes, this looks a lot tidier with the deduplication. I left a handful of new comments but those are just small things, nothing major.

I'm not sure whether you had been waiting on a review, but comment @rustbot ready whenever you are done. This will need a squash before merge but no need to do that until it is final.

@rustbot rustbot added the has-merge-commits PR has merge commits, merge with caution. label Aug 19, 2024
@rustbot

This comment was marked as outdated.

@rustbot rustbot removed the has-merge-commits PR has merge commits, merge with caution. label Aug 19, 2024
@ChaiTRex
Copy link
Contributor Author

With regard to the squash, would it be alright to have two final commits, one for the testing and benchmarking and the next for the implementation? This would allow the benchmarking at least to be run on the prior implementation for comparison.

@ChaiTRex
Copy link
Contributor Author

@rustbot ready

@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Aug 20, 2024
@tgross35
Copy link
Contributor

That makes more sense. I'll retry after PR check finishes.

@rust-log-analyzer
Copy link
Collaborator

The job test-various failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
file:.git/config remote.origin.url=https://github.com/rust-lang-ci/rust
file:.git/config remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
file:.git/config gc.auto=0
file:.git/config http.https://github.com/.extraheader=AUTHORIZATION: basic ***
file:.git/config branch.try.remote=origin
file:.git/config branch.try.merge=refs/heads/try
file:.git/config submodule.library/backtrace.url=https://github.com/rust-lang/backtrace-rs.git
file:.git/config submodule.library/stdarch.active=true
file:.git/config submodule.library/stdarch.url=https://github.com/rust-lang/stdarch.git
file:.git/config submodule.src/doc/book.active=true
---
test num::int_log::ilog10_of_0_panic ... ignored
Error: failed to run main module `/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-std/wasm32-wasip1/release/deps/coretests-725e4eb03cd34f4a.wasm`

Caused by:
    0: failed to invoke command default
    1: error while executing at wasm backtrace:
           0: 0x1889d5 - coretests-725e4eb03cd34f4a.wasm!__rust_start_panic
           1: 0x1884a7 - coretests-725e4eb03cd34f4a.wasm!rust_panic
           2: 0x188306 - coretests-725e4eb03cd34f4a.wasm!std::panicking::rust_panic_with_hook::h4b7cdc6a86811ab2
           3: 0x187153 - coretests-725e4eb03cd34f4a.wasm!std::panicking::begin_panic_handler::{{closure}}::h5d51873c650ae66a
           4: 0x187092 - coretests-725e4eb03cd34f4a.wasm!std::sys::backtrace::__rust_end_short_backtrace::h99bf0f7bbe9f5e25
           5: 0x187c42 - coretests-725e4eb03cd34f4a.wasm!rust_begin_unwind
           6: 0x195f5d - coretests-725e4eb03cd34f4a.wasm!core::panicking::panic_fmt::hf9f7de3dbe87d903
           7: 0x195fa1 - coretests-725e4eb03cd34f4a.wasm!core::num::int_sqrt::panic_for_negative_argument::hc1127e51220785cc
           8: 0x2073 - coretests-725e4eb03cd34f4a.wasm!std::panicking::try::do_call::h8729d35fde7e957c
           9: 0xc2cdd - coretests-725e4eb03cd34f4a.wasm!core::ops::function::FnOnce::call_once::hae34aa187e462869
          10: 0x179318 - coretests-725e4eb03cd34f4a.wasm!test::__rust_begin_short_backtrace::hed313d8f52171515
          11: 0x17917c - coretests-725e4eb03cd34f4a.wasm!test::types::RunnableTest::run::h7997ca9580bbaeb7
          12: 0x163115 - coretests-725e4eb03cd34f4a.wasm!test::run_test::h08c5263a618aa01e
          13: 0x15aec0 - coretests-725e4eb03cd34f4a.wasm!test::console::run_tests_console::h092bd764fe45d0b9
          14: 0x1794b6 - coretests-725e4eb03cd34f4a.wasm!test::test_main::h1b006f7d4c4b791a
          15: 0x17a16c - coretests-725e4eb03cd34f4a.wasm!test::test_main_static::h28974b6ae93adab2
          16: 0x1433f6 - coretests-725e4eb03cd34f4a.wasm!coretests::main::h5f5ac5fe77cac24f
          17: 0x1e55 - coretests-725e4eb03cd34f4a.wasm!std::sys::backtrace::__rust_begin_short_backtrace::h38b211e3e9744845
          18: 0x1e48 - coretests-725e4eb03cd34f4a.wasm!std::rt::lang_start::{{closure}}::h299149183f8c9350
          19: 0x181bbd - coretests-725e4eb03cd34f4a.wasm!std::rt::lang_start_internal::h15e4ecb84bb7fefe
          20: 0x14342e - coretests-725e4eb03cd34f4a.wasm!__main_void
          21: 0x1812 - coretests-725e4eb03cd34f4a.wasm!_start
       note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable may show more debugging information
    2: wasm trap: wasm `unreachable` instruction executed
test num::int_log::ilog10_u16 ... ok
test num::int_log::ilog10_u32 ... ok
test num::int_log::ilog10_u64 ... ok
test num::int_log::ilog10_u8 ... ok
test num::int_log::ilog10_u8 ... ok
test num::int_log::ilog1_of_1_panic ... ignored
test num::int_log::ilog2_of_0_panic ... ignored
test num::int_log::ilog3_of_0_panic ... ignored
error: test failed, to rerun pass `-p core --test coretests`

Caused by:
  process didn't exit successfully: `/wasmtime-v19.0.0-x86_64-linux/wasmtime run -C cache=n --dir . --env RUSTC_BOOTSTRAP /checkout/obj/build/x86_64-unknown-linux-gnu/stage1-std/wasm32-wasip1/release/deps/coretests-725e4eb03cd34f4a.wasm -Z unstable-options --format json` (exit status: 134)
note: test exited abnormally; to see the full output pass --nocapture to the harness.
  local time: Thu Aug 29 03:39:58 UTC 2024
  network time: Thu, 29 Aug 2024 03:39:58 GMT
##[error]Process completed with exit code 1.
Post job cleanup.

@tgross35
Copy link
Contributor

Yay

@bors try

@bors
Copy link
Contributor

bors commented Aug 29, 2024

⌛ Trying commit 7af8e21 with merge daf8e6a...

bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 29, 2024
Improved `checked_isqrt` and `isqrt` methods

### Improved tests of `isqrt` and `checked_isqrt` implementations

* Inputs chosen more thoroughly and systematically.
* Checks that `isqrt` and `checked_isqrt` have equivalent results for signed types, either equivalent numerically or equivalent as a panic and a `None`.
* Checks that `isqrt` has numerically-equivalent results for unsigned types and their `NonZero` counterparts.

### Added benchmarks for `isqrt` implementations

### Greatly sped up `checked_isqrt` and `isqrt` methods

* Uses a lookup table for 8-bit integers and then the Karatsuba square root algorithm for larger integers.
* Includes optimization hints that give the compiler the exact numeric range of results.

### Feature tracking issue

`isqrt` is an unstable feature tracked at rust-lang#116226.

<details><summary>Benchmarked improvements</summary>

### Command used to benchmark

    ./x bench library/core -- int_sqrt

### Before

    benchmarks:
        num::int_sqrt::i128::isqrt           439591.65/iter  +/- 6652.70
        num::int_sqrt::i16::isqrt              5302.97/iter   +/- 160.93
        num::int_sqrt::i32::isqrt             62999.11/iter  +/- 2022.05
        num::int_sqrt::i64::isqrt            125248.81/iter  +/- 1674.43
        num::int_sqrt::i8::isqrt                123.56/iter     +/- 1.87
        num::int_sqrt::isize::isqrt          125356.56/iter  +/- 1017.03
        num::int_sqrt::non_zero_u128::isqrt  437443.75/iter  +/- 3535.43
        num::int_sqrt::non_zero_u16::isqrt     8604.58/iter    +/- 94.76
        num::int_sqrt::non_zero_u32::isqrt    62933.33/iter   +/- 517.30
        num::int_sqrt::non_zero_u64::isqrt   125076.38/iter +/- 11340.61
        num::int_sqrt::non_zero_u8::isqrt       221.51/iter     +/- 1.58
        num::int_sqrt::non_zero_usize::isqrt 136005.21/iter  +/- 2020.35
        num::int_sqrt::u128::isqrt           439014.55/iter  +/- 3920.45
        num::int_sqrt::u16::isqrt              8575.08/iter   +/- 148.06
        num::int_sqrt::u32::isqrt             63008.89/iter   +/- 803.67
        num::int_sqrt::u64::isqrt            125088.09/iter   +/- 879.29
        num::int_sqrt::u8::isqrt                230.18/iter     +/- 2.04
        num::int_sqrt::usize::isqrt          125237.51/iter  +/- 4747.83
### After

    benchmarks:
        num::int_sqrt::i128::isqrt           105184.89/iter +/- 1171.38
        num::int_sqrt::i16::isqrt              1910.26/iter   +/- 78.50
        num::int_sqrt::i32::isqrt             34260.34/iter  +/- 960.84
        num::int_sqrt::i64::isqrt             45939.19/iter +/- 2525.65
        num::int_sqrt::i8::isqrt                 22.87/iter    +/- 0.45
        num::int_sqrt::isize::isqrt           45884.17/iter  +/- 595.49
        num::int_sqrt::non_zero_u128::isqrt  106344.27/iter  +/- 780.99
        num::int_sqrt::non_zero_u16::isqrt     2790.19/iter   +/- 53.43
        num::int_sqrt::non_zero_u32::isqrt    33613.99/iter  +/- 362.96
        num::int_sqrt::non_zero_u64::isqrt    46235.42/iter  +/- 429.69
        num::int_sqrt::non_zero_u8::isqrt        31.78/iter    +/- 0.75
        num::int_sqrt::non_zero_usize::isqrt  46208.75/iter  +/- 375.27
        num::int_sqrt::u128::isqrt           106385.94/iter +/- 1649.95
        num::int_sqrt::u16::isqrt              2747.69/iter   +/- 28.72
        num::int_sqrt::u32::isqrt             33627.09/iter  +/- 475.68
        num::int_sqrt::u64::isqrt             46182.29/iter  +/- 311.16
        num::int_sqrt::u8::isqrt                 33.10/iter    +/- 0.30
        num::int_sqrt::usize::isqrt           46165.00/iter  +/- 388.41

</details>

Tracking Issue for {u8,i8,...}::isqrt rust-lang#116226

try-job: test-various
@bors
Copy link
Contributor

bors commented Aug 29, 2024

☀️ Try build successful - checks-actions
Build commit: daf8e6a (daf8e6a755a4ccac4f08164a49fff14a2cd7e1d0)

@tgross35
Copy link
Contributor

Great! Thanks for the fix.

@bors r+

@bors
Copy link
Contributor

bors commented Aug 29, 2024

📌 Commit 7af8e21 has been approved by tgross35

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 29, 2024
@workingjubilee
Copy link
Member

Sorry to bother but is there a particular reason that #[should_panic] annotations on a separate test function is not appropriate? Like so: https://doc.rust-lang.org/rust-by-example/testing/unit_testing.html#testing-panics

@tgross35
Copy link
Contributor

I thought about that too - but it runs the tests on a range of numbers, so it's not really feasible. In theory we could spot check a few numbers or we probably don't really need to test this, but it seems pretty harmless.

@workingjubilee
Copy link
Member

hm, okay!

@workingjubilee
Copy link
Member

The most useful negative integers to test are usually -1, -2, -10, and i{N}::MIN.

The last often manages to do something clever.

workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Aug 29, 2024
Improved `checked_isqrt` and `isqrt` methods

### Improved tests of `isqrt` and `checked_isqrt` implementations

* Inputs chosen more thoroughly and systematically.
* Checks that `isqrt` and `checked_isqrt` have equivalent results for signed types, either equivalent numerically or equivalent as a panic and a `None`.
* Checks that `isqrt` has numerically-equivalent results for unsigned types and their `NonZero` counterparts.

### Added benchmarks for `isqrt` implementations

### Greatly sped up `checked_isqrt` and `isqrt` methods

* Uses a lookup table for 8-bit integers and then the Karatsuba square root algorithm for larger integers.
* Includes optimization hints that give the compiler the exact numeric range of results.

### Feature tracking issue

`isqrt` is an unstable feature tracked at rust-lang#116226.

<details><summary>Benchmarked improvements</summary>

### Command used to benchmark

    ./x bench library/core -- int_sqrt

### Before

    benchmarks:
        num::int_sqrt::i128::isqrt           439591.65/iter  +/- 6652.70
        num::int_sqrt::i16::isqrt              5302.97/iter   +/- 160.93
        num::int_sqrt::i32::isqrt             62999.11/iter  +/- 2022.05
        num::int_sqrt::i64::isqrt            125248.81/iter  +/- 1674.43
        num::int_sqrt::i8::isqrt                123.56/iter     +/- 1.87
        num::int_sqrt::isize::isqrt          125356.56/iter  +/- 1017.03
        num::int_sqrt::non_zero_u128::isqrt  437443.75/iter  +/- 3535.43
        num::int_sqrt::non_zero_u16::isqrt     8604.58/iter    +/- 94.76
        num::int_sqrt::non_zero_u32::isqrt    62933.33/iter   +/- 517.30
        num::int_sqrt::non_zero_u64::isqrt   125076.38/iter +/- 11340.61
        num::int_sqrt::non_zero_u8::isqrt       221.51/iter     +/- 1.58
        num::int_sqrt::non_zero_usize::isqrt 136005.21/iter  +/- 2020.35
        num::int_sqrt::u128::isqrt           439014.55/iter  +/- 3920.45
        num::int_sqrt::u16::isqrt              8575.08/iter   +/- 148.06
        num::int_sqrt::u32::isqrt             63008.89/iter   +/- 803.67
        num::int_sqrt::u64::isqrt            125088.09/iter   +/- 879.29
        num::int_sqrt::u8::isqrt                230.18/iter     +/- 2.04
        num::int_sqrt::usize::isqrt          125237.51/iter  +/- 4747.83
### After

    benchmarks:
        num::int_sqrt::i128::isqrt           105184.89/iter +/- 1171.38
        num::int_sqrt::i16::isqrt              1910.26/iter   +/- 78.50
        num::int_sqrt::i32::isqrt             34260.34/iter  +/- 960.84
        num::int_sqrt::i64::isqrt             45939.19/iter +/- 2525.65
        num::int_sqrt::i8::isqrt                 22.87/iter    +/- 0.45
        num::int_sqrt::isize::isqrt           45884.17/iter  +/- 595.49
        num::int_sqrt::non_zero_u128::isqrt  106344.27/iter  +/- 780.99
        num::int_sqrt::non_zero_u16::isqrt     2790.19/iter   +/- 53.43
        num::int_sqrt::non_zero_u32::isqrt    33613.99/iter  +/- 362.96
        num::int_sqrt::non_zero_u64::isqrt    46235.42/iter  +/- 429.69
        num::int_sqrt::non_zero_u8::isqrt        31.78/iter    +/- 0.75
        num::int_sqrt::non_zero_usize::isqrt  46208.75/iter  +/- 375.27
        num::int_sqrt::u128::isqrt           106385.94/iter +/- 1649.95
        num::int_sqrt::u16::isqrt              2747.69/iter   +/- 28.72
        num::int_sqrt::u32::isqrt             33627.09/iter  +/- 475.68
        num::int_sqrt::u64::isqrt             46182.29/iter  +/- 311.16
        num::int_sqrt::u8::isqrt                 33.10/iter    +/- 0.30
        num::int_sqrt::usize::isqrt           46165.00/iter  +/- 388.41

</details>

Tracking Issue for {u8,i8,...}::isqrt rust-lang#116226

try-job: test-various
@tgross35
Copy link
Contributor

Ah that's actually a good point - I thought we were checking a range but it looks like we are checking MIN https://github.com/rust-lang/rust/pull/128166/files#diff-eef7e7d7a0a129c86de4f1152c95cbbfcdf4dd594cfbf8d4a42c5b7f90746f4dR24.

@ChaiTRex am I missing something? If not, it would be good to add Jubilee's other suggested spot checks since the machinery is already there. It can just be a follow up since this is already in a rollup, and I don't really expect them to fail.

workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Aug 29, 2024
Improved `checked_isqrt` and `isqrt` methods

### Improved tests of `isqrt` and `checked_isqrt` implementations

* Inputs chosen more thoroughly and systematically.
* Checks that `isqrt` and `checked_isqrt` have equivalent results for signed types, either equivalent numerically or equivalent as a panic and a `None`.
* Checks that `isqrt` has numerically-equivalent results for unsigned types and their `NonZero` counterparts.

### Added benchmarks for `isqrt` implementations

### Greatly sped up `checked_isqrt` and `isqrt` methods

* Uses a lookup table for 8-bit integers and then the Karatsuba square root algorithm for larger integers.
* Includes optimization hints that give the compiler the exact numeric range of results.

### Feature tracking issue

`isqrt` is an unstable feature tracked at rust-lang#116226.

<details><summary>Benchmarked improvements</summary>

### Command used to benchmark

    ./x bench library/core -- int_sqrt

### Before

    benchmarks:
        num::int_sqrt::i128::isqrt           439591.65/iter  +/- 6652.70
        num::int_sqrt::i16::isqrt              5302.97/iter   +/- 160.93
        num::int_sqrt::i32::isqrt             62999.11/iter  +/- 2022.05
        num::int_sqrt::i64::isqrt            125248.81/iter  +/- 1674.43
        num::int_sqrt::i8::isqrt                123.56/iter     +/- 1.87
        num::int_sqrt::isize::isqrt          125356.56/iter  +/- 1017.03
        num::int_sqrt::non_zero_u128::isqrt  437443.75/iter  +/- 3535.43
        num::int_sqrt::non_zero_u16::isqrt     8604.58/iter    +/- 94.76
        num::int_sqrt::non_zero_u32::isqrt    62933.33/iter   +/- 517.30
        num::int_sqrt::non_zero_u64::isqrt   125076.38/iter +/- 11340.61
        num::int_sqrt::non_zero_u8::isqrt       221.51/iter     +/- 1.58
        num::int_sqrt::non_zero_usize::isqrt 136005.21/iter  +/- 2020.35
        num::int_sqrt::u128::isqrt           439014.55/iter  +/- 3920.45
        num::int_sqrt::u16::isqrt              8575.08/iter   +/- 148.06
        num::int_sqrt::u32::isqrt             63008.89/iter   +/- 803.67
        num::int_sqrt::u64::isqrt            125088.09/iter   +/- 879.29
        num::int_sqrt::u8::isqrt                230.18/iter     +/- 2.04
        num::int_sqrt::usize::isqrt          125237.51/iter  +/- 4747.83
### After

    benchmarks:
        num::int_sqrt::i128::isqrt           105184.89/iter +/- 1171.38
        num::int_sqrt::i16::isqrt              1910.26/iter   +/- 78.50
        num::int_sqrt::i32::isqrt             34260.34/iter  +/- 960.84
        num::int_sqrt::i64::isqrt             45939.19/iter +/- 2525.65
        num::int_sqrt::i8::isqrt                 22.87/iter    +/- 0.45
        num::int_sqrt::isize::isqrt           45884.17/iter  +/- 595.49
        num::int_sqrt::non_zero_u128::isqrt  106344.27/iter  +/- 780.99
        num::int_sqrt::non_zero_u16::isqrt     2790.19/iter   +/- 53.43
        num::int_sqrt::non_zero_u32::isqrt    33613.99/iter  +/- 362.96
        num::int_sqrt::non_zero_u64::isqrt    46235.42/iter  +/- 429.69
        num::int_sqrt::non_zero_u8::isqrt        31.78/iter    +/- 0.75
        num::int_sqrt::non_zero_usize::isqrt  46208.75/iter  +/- 375.27
        num::int_sqrt::u128::isqrt           106385.94/iter +/- 1649.95
        num::int_sqrt::u16::isqrt              2747.69/iter   +/- 28.72
        num::int_sqrt::u32::isqrt             33627.09/iter  +/- 475.68
        num::int_sqrt::u64::isqrt             46182.29/iter  +/- 311.16
        num::int_sqrt::u8::isqrt                 33.10/iter    +/- 0.30
        num::int_sqrt::usize::isqrt           46165.00/iter  +/- 388.41

</details>

Tracking Issue for {u8,i8,...}::isqrt rust-lang#116226

try-job: test-various
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 29, 2024
…kingjubilee

Rollup of 9 pull requests

Successful merges:

 - rust-lang#120221 (Don't make statement nonterminals match pattern nonterminals)
 - rust-lang#127912 (std: make `thread::current` available in all `thread_local!` destructors)
 - rust-lang#128166 (Improved `checked_isqrt` and `isqrt` methods)
 - rust-lang#129123 (rustdoc-json: Add test for `Self` type)
 - rust-lang#129366 (linker: Synchronize native library search in rustc and linker)
 - rust-lang#129527 (Don't use `TyKind` in a lint)
 - rust-lang#129534 (Deny `wasm_c_abi` lint to nudge the last 25%)
 - rust-lang#129640 (Re-enable android tests/benches in alloc/core)
 - rust-lang#129675 (allow BufReader::peek to be called on unsized types)

r? `@ghost`
`@rustbot` modify labels: rollup
Comment on lines +26 to +31
for n in (0..=127)
.chain(<$T>::MAX - 127..=<$T>::MAX)
.chain((0..<$T>::MAX.count_ones()).map(|exponent| (1 << exponent) - 1))
.chain((0..<$T>::MAX.count_ones()).map(|exponent| 1 << exponent))
{
isqrt_consistency_check(n);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the range.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I meant specifically a range including negative values

@ChaiTRex
Copy link
Contributor Author

ChaiTRex commented Aug 29, 2024

@tgross35 @workingjubilee There are several places where isqrt_consistency_check is called, such as six places in the isqrt_extended test. It basically checks i*::MIN and the negative version of every positive value that gets tested (there are thousands of positive values that get checked).

Do #[should_panic] tests run on platforms like Wasm where all panics abort? If they do, I could add a few #[should_panic] tests to also test on those platforms.

bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 29, 2024
…llaumeGomez

Rollup of 7 pull requests

Successful merges:

 - rust-lang#123940 (debug-fmt-detail option)
 - rust-lang#128166 (Improved `checked_isqrt` and `isqrt` methods)
 - rust-lang#128970 (Add `-Zlint-llvm-ir`)
 - rust-lang#129316 (riscv64imac: allow shadow call stack sanitizer)
 - rust-lang#129690 (Add `needs-unwind` compiletest directive to `libtest-thread-limit` and replace some `Path` with `path` in `run-make`)
 - rust-lang#129732 (Add `unreachable_pub`, round 3)
 - rust-lang#129743 (Fix rustdoc clippy lints)

r? `@ghost`
`@rustbot` modify labels: rollup
@ChaiTRex
Copy link
Contributor Author

ChaiTRex commented Aug 29, 2024

nemo157 mentioned on the community Discord server that -Zpanic-abort-test can handle should_panic tests on platforms that abort during panic. Since abort-on-panic eliminates my catch_unwind checks, I'll add a few should_panic tests for negative inputs in my next pull request so that panicking is tested on those platforms.

@tgross35
Copy link
Contributor

@tgross35 @workingjubilee There are several places where isqrt_consistency_check is called, such as six places in the isqrt_extended test. It basically checks i*::MIN and the negative version of every positive value that gets tested (there are thousands of positive values that get checked).

I missed that it checked .wrapping_neg() too within that function, all good then.

Do #[should_panic] tests run on platforms like Wasm where all panics abort? If they do, I could add a few #[should_panic] tests to also test on those platforms.

[...]

nemo157 mentioned on the community Discord server that -Zpanic-abort-test can handle should_panic tests on platforms that abort during panic. Since abort-on-panic eliminates my catch_unwind checks, I'll add a few should_panic tests for negative inputs in my next pull request so that panicking is tested on those platforms.

We can pretty reasonably expect that if it panics on one platform and all other tests pass, it will continue on panic on others. So no worries here, I don't think you need a follow up.

bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 29, 2024
…llaumeGomez

Rollup of 7 pull requests

Successful merges:

 - rust-lang#123940 (debug-fmt-detail option)
 - rust-lang#128166 (Improved `checked_isqrt` and `isqrt` methods)
 - rust-lang#128970 (Add `-Zlint-llvm-ir`)
 - rust-lang#129316 (riscv64imac: allow shadow call stack sanitizer)
 - rust-lang#129690 (Add `needs-unwind` compiletest directive to `libtest-thread-limit` and replace some `Path` with `path` in `run-make`)
 - rust-lang#129732 (Add `unreachable_pub`, round 3)
 - rust-lang#129743 (Fix rustdoc clippy lints)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 4b08b2e into rust-lang:master Aug 29, 2024
7 checks passed
@rustbot rustbot added this to the 1.82.0 milestone Aug 29, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Aug 29, 2024
Rollup merge of rust-lang#128166 - ChaiTRex:isqrt, r=tgross35

Improved `checked_isqrt` and `isqrt` methods

### Improved tests of `isqrt` and `checked_isqrt` implementations

* Inputs chosen more thoroughly and systematically.
* Checks that `isqrt` and `checked_isqrt` have equivalent results for signed types, either equivalent numerically or equivalent as a panic and a `None`.
* Checks that `isqrt` has numerically-equivalent results for unsigned types and their `NonZero` counterparts.

### Added benchmarks for `isqrt` implementations

### Greatly sped up `checked_isqrt` and `isqrt` methods

* Uses a lookup table for 8-bit integers and then the Karatsuba square root algorithm for larger integers.
* Includes optimization hints that give the compiler the exact numeric range of results.

### Feature tracking issue

`isqrt` is an unstable feature tracked at rust-lang#116226.

<details><summary>Benchmarked improvements</summary>

### Command used to benchmark

    ./x bench library/core -- int_sqrt

### Before

    benchmarks:
        num::int_sqrt::i128::isqrt           439591.65/iter  +/- 6652.70
        num::int_sqrt::i16::isqrt              5302.97/iter   +/- 160.93
        num::int_sqrt::i32::isqrt             62999.11/iter  +/- 2022.05
        num::int_sqrt::i64::isqrt            125248.81/iter  +/- 1674.43
        num::int_sqrt::i8::isqrt                123.56/iter     +/- 1.87
        num::int_sqrt::isize::isqrt          125356.56/iter  +/- 1017.03
        num::int_sqrt::non_zero_u128::isqrt  437443.75/iter  +/- 3535.43
        num::int_sqrt::non_zero_u16::isqrt     8604.58/iter    +/- 94.76
        num::int_sqrt::non_zero_u32::isqrt    62933.33/iter   +/- 517.30
        num::int_sqrt::non_zero_u64::isqrt   125076.38/iter +/- 11340.61
        num::int_sqrt::non_zero_u8::isqrt       221.51/iter     +/- 1.58
        num::int_sqrt::non_zero_usize::isqrt 136005.21/iter  +/- 2020.35
        num::int_sqrt::u128::isqrt           439014.55/iter  +/- 3920.45
        num::int_sqrt::u16::isqrt              8575.08/iter   +/- 148.06
        num::int_sqrt::u32::isqrt             63008.89/iter   +/- 803.67
        num::int_sqrt::u64::isqrt            125088.09/iter   +/- 879.29
        num::int_sqrt::u8::isqrt                230.18/iter     +/- 2.04
        num::int_sqrt::usize::isqrt          125237.51/iter  +/- 4747.83
### After

    benchmarks:
        num::int_sqrt::i128::isqrt           105184.89/iter +/- 1171.38
        num::int_sqrt::i16::isqrt              1910.26/iter   +/- 78.50
        num::int_sqrt::i32::isqrt             34260.34/iter  +/- 960.84
        num::int_sqrt::i64::isqrt             45939.19/iter +/- 2525.65
        num::int_sqrt::i8::isqrt                 22.87/iter    +/- 0.45
        num::int_sqrt::isize::isqrt           45884.17/iter  +/- 595.49
        num::int_sqrt::non_zero_u128::isqrt  106344.27/iter  +/- 780.99
        num::int_sqrt::non_zero_u16::isqrt     2790.19/iter   +/- 53.43
        num::int_sqrt::non_zero_u32::isqrt    33613.99/iter  +/- 362.96
        num::int_sqrt::non_zero_u64::isqrt    46235.42/iter  +/- 429.69
        num::int_sqrt::non_zero_u8::isqrt        31.78/iter    +/- 0.75
        num::int_sqrt::non_zero_usize::isqrt  46208.75/iter  +/- 375.27
        num::int_sqrt::u128::isqrt           106385.94/iter +/- 1649.95
        num::int_sqrt::u16::isqrt              2747.69/iter   +/- 28.72
        num::int_sqrt::u32::isqrt             33627.09/iter  +/- 475.68
        num::int_sqrt::u64::isqrt             46182.29/iter  +/- 311.16
        num::int_sqrt::u8::isqrt                 33.10/iter    +/- 0.30
        num::int_sqrt::usize::isqrt           46165.00/iter  +/- 388.41

</details>

Tracking Issue for {u8,i8,...}::isqrt rust-lang#116226

try-job: test-various
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants