Number of iterations does not scale #133
Comments
Hey, thanks for trying out Criterion.rs. I checked out your test code. When I run it locally, it behaves as expected and completes the benchmarks in the expected amount of time. I tried changing the constants to 10 and 20 as well as some other experiments, but Criterion.rs continued to behave as expected. I did my testing with the stable-x86_64-unknown-linux-gnu build of 1.24.1 as well. I'd like to help you, but I can't think of anything more I can do without being able to reproduce this issue myself. The only thing I can think of is that something might cause the CPU or the benchmark process to slow down drastically right after the warmup phase is completed. Power control settings, maybe, or interference from other processes? I'm just guessing at this point. Sorry I couldn't be more helpful. If you can find any more information, please share it.
Thank you for looking into the issue. I was both hoping and afraid this was a localized problem specific to me :) This happens on two independent machines, a desktop PC and a notebook, so I would almost rule out power saving issues. The only thing they have in common is that they run Arch Linux configured by me, which is something to look into. I will keep you updated when I find out anything useful.
Here comes the update... I ran my tests on a Raspberry Pi with the same results again. So it does not seem to be related to Arch Linux at least. I then downgraded to the 1.23.0 toolchain and got the expected results. I have only tried this on one machine so far, but this seems to point strongly at the toolchain. Searching for "time" in the release notes only revealed rust-lang/rust#46828. No idea if this is related. Update: Tested on Windows too, with the same results. Here is an overview of what I tried and the results I got:
Results: ✔️ = works; ❌ = does not work; ❔ = not tested |
Curiouser and curiouser. I've run this test on one of my own machines, which runs Arch, on 1.23, 1.24.1 and nightly, and still was not able to reproduce this issue. However, when I ran it on Windows with 1.24.1, I was able to reproduce it. It does not appear with 1.23, even on Windows. Very strange. I will continue digging into this as soon as I have some time to do so. I'm very reluctant to call this a compiler regression, but it does seem to be correlated with the compiler version. Edit: I have confirmed this on two different Windows machines, but running the Linux compiler (whether on Arch or through WSL) behaves as expected.
I should probably go to bed, but this is too strange. I've narrowed this down to the Stranger still, if I change it to this:
It suddenly starts working correctly, even though This is starting to look like a compiler regression. |
This does sound like a compiler regression. My unasked-for 2c is that it is time to open an issue on the Rust repo. They are smart, knowledgeable people with good tools to track this down, and they don't bite.
Could the compiler, under certain conditions, optimize away the benchmark closure in the warmup phase? But if so, why doesn't it optimize away the sampling run? Just wildly speculating...
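For readers following the speculation above, here is a minimal sketch of how a benchmark body is usually shielded from being optimized away. It is not the code from this issue and not Criterion.rs internals: the `fibonacci` function is a hypothetical stand-in workload, and `std::hint::black_box` is the standard-library helper from recent toolchains (it was not available in the 1.24.1 toolchain discussed here).

```rust
use std::hint::black_box;

/// Hypothetical stand-in workload: naive recursive Fibonacci
/// with fibonacci(0) == fibonacci(1) == 1.
fn fibonacci(n: u64) -> u64 {
    match n {
        0 | 1 => 1,
        n => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

fn main() {
    let mut checksum = 0u64;
    for _ in 0..1_000 {
        // black_box on the input discourages constant folding of the call;
        // black_box on the result discourages removing the call as dead code.
        checksum = checksum.wrapping_add(black_box(fibonacci(black_box(10))));
    }
    println!("checksum = {checksum}");
}
```

If a loop like this ran suspiciously fast without the `black_box` calls, that would hint that the optimizer had removed the work, which is the scenario the comment above is asking about.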
Works for me too. Thank you!
I tried to get started with Criterion by basically copying and pasting the Getting Started guide into a new cargo project. Then, running `cargo bench` never finished collecting samples.
By reducing the Fibonacci number and the measurement time I finally managed to get results.
This is the result for `fibonacci(5)`:
And this is the result for `fibonacci(10)`:
The total `cargo` run times (12.72s and 656.87s) include compilation and analysis time, but these should be mostly negligible compared to the time spent collecting samples, especially in the second case.
Interestingly, the number of iterations is almost the same in both runs (3776M and 3777M). Since the function is supposed to run slower in the second run, I assume there should be fewer iterations. Could there be a problem with estimating the required number of iterations from the warm-up phase?
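To make the expectation in the question concrete, here is a rough sketch of how an iteration count can be derived from a warm-up phase. This is an illustration under stated assumptions, not Criterion.rs's actual algorithm: `warm_up`, `estimate_iterations`, and the `fibonacci` workload are hypothetical names introduced here, and the 100 ms and 1 s budgets are arbitrary.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in workload (not the code from the linked repository).
fn fibonacci(n: u64) -> u64 {
    match n {
        0 | 1 => 1,
        n => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

/// Run `f` repeatedly for roughly `budget`; return (iterations, elapsed time).
fn warm_up<F: FnMut()>(mut f: F, budget: Duration) -> (u64, Duration) {
    let start = Instant::now();
    let mut iters = 0u64;
    while start.elapsed() < budget {
        f();
        iters += 1;
    }
    (iters, start.elapsed())
}

/// Scale the warm-up throughput to the measurement window:
/// iterations ~= target * warmup_iters / warmup_elapsed, and at least 1.
fn estimate_iterations(warmup_iters: u64, warmup_elapsed: Duration, target: Duration) -> u64 {
    let warmup_ns = warmup_elapsed.as_nanos().max(1); // avoid division by zero
    let estimate = target.as_nanos() * u128::from(warmup_iters) / warmup_ns;
    (estimate.min(u128::from(u64::MAX)) as u64).max(1)
}

fn main() {
    let budget = Duration::from_millis(100); // arbitrary warm-up budget
    let target = Duration::from_secs(1); // arbitrary measurement window
    for n in [5u64, 10] {
        let (iters, elapsed) = warm_up(|| { black_box(fibonacci(black_box(n))); }, budget);
        let planned = estimate_iterations(iters, elapsed, target);
        println!("fibonacci({n}): about {planned} iterations planned for {target:?}");
    }
}
```

Under a scheme like this, a slower function yields fewer warm-up iterations and therefore a smaller planned count, so near-identical counts for `fibonacci(5)` and `fibonacci(10)` would suggest the warm-up timing did not reflect the real cost of the call.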
Update
I was able to reproduce the behavior on my laptop. Both systems run the `stable-x86_64-unknown-linux-gnu` toolchain with `rustc 1.24.1`.
Here is the code I used: https://github.com/mbillingr/criterion-test.rs