-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Possible miscompilation in >= 1.24.0 #49336
Comments
Manual bisection log using rustup: good rustc 1.24.0-nightly (edbd7d2 2017-12-20) So the change is inside the commit range cddc4a6...1abeb43 .... I only see bugfixes and ICE fixes... Old state:
New state:
|
Usually I only see a performance impact from some changes. Is there |
There is unsafe code elsewhere in Criterion.rs, yes. Not in this part of the code though. |
So it seems like there are only two possible ways forward:
|
We're going to call this P-high for now and revisit next week. I think the most plausible way to get to the bottom of this bug though is to compare the generated code with and without the workaround and try to see what optimization is being inhibited. |
triage: P-high |
I looked at the |
I took another shot at minimizing this, taking a different approach. Rather than try to construct a minimal case from scratch, I instead made a branch of Criterion.rs and started deleting everything I could delete without masking the bug or breaking the test case. I can't guarantee that this is a minimal case, but it's much closer to it than I was able to provide before. It also compiles much faster than previously, which should make it easier to iterate. Some things I discovered while doing so:
Updated steps to reproduce:
I hope this helps. If you need any further help tracking this down, let me know. |
Had a look at this again. TLDR: This doesn't seem like a miscompilation. Looking at @mbillingr's example code: fn fibonacci(n: u64) -> u64 {
match n {
0 => 1,
1 => 1,
n => fibonacci(n-1) + fibonacci(n-2),
}
}
fn criterion_benchmark1(c: &mut Criterion) {
c.bench_function("fib 1", |b| b.iter(|| fibonacci(1)));
}
fn criterion_benchmark2(c: &mut Criterion) {
c.bench_function("fib 2", |b| b.iter(|| fibonacci(3)));
} We can see benchmarks being performed on fibonacci calculating functions. With this code I see the following behaviour:
here, "before" means Now let's change the two functions a bit by evaluating the fibonacci expression: fn criterion_benchmark1(c: &mut Criterion) {
c.bench_function("fib 1", |b| b.iter(|| 1));
}
fn criterion_benchmark2(c: &mut Criterion) {
c.bench_function("fib 2", |b| b.iter(|| 3));
} We get:
Let's change it again to use black_box: #![feature(test)]
extern crate test;
use test::black_box;
fn criterion_benchmark1(c: &mut Criterion) {
c.bench_function("fib 1", |b| b.iter(|| fibonacci(black_box(3))));
}
fn criterion_benchmark2(c: &mut Criterion) {
c.bench_function("fib 2", |b| b.iter(|| fibonacci(black_box(3))));
}
So for me it seems that @eddyb 's change enabled an optimization that allowed LLVM to eval the |
Forgive me if I've missed something, but I don't think that explains everything. If it is just LLVM correctly optimizing the code, then why does black-boxing an apparently-unrelated value in the warmup routine break the optimization? For that matter, it doesn't explain why moving the warmup function into the lib module without changing it also breaks the optimization. Edited to add: |
Optimizers often have weird behaviour. They rely on metrics on whether to optimize something or not.
Very much possible. LLVM could first inline and then optimize it to a constant. |
I'm starting to think you might be right. I had discounted the possibility that the benchmarked function might get inlined all the way into warmup and measurement because I thought Rust couldn't inline functions across crates without |
Depends, are there any type parameters? If so, you can't compile without instantiating cross-crate. |
Ah, of course. You're correct, I'd forgotten about that. I think you're right, which suggests that the solution to this is to add a few carefully-chosen |
Fixes #133.
A user reported this to Criterion.rs - bheisler/criterion.rs#133
The short version is this - when the user's test crate's benchmarks are run on 1.24.0 or newer, Criterion.rs calculates some values incorrectly even though the code looks correct to me. They claim to have reproduced this behavior on Arch Linux, Windows and Raspbian. I have only been able to confirm it on Windows. Each of us has confirmed this behavior on multiple machines.
I was initially reluctant to call this a miscompilation, but when I started investigating it, any change I made caused the bug to stop happening - inserting println's, or even pushing values into a vector and printing them later caused the bug to disappear. Eventually I tried simply adding a call to
test::black_box
, which should have no observable effect on the code except to inhibit certain compiler optimizations. That also caused the bug to stop occurring. It may still be a bug in my code, but if so I can't find it.I've tried to create a minimal test case, but was unsuccessful. This bug is quite fragile.
Steps to reproduce:
criterion
(this isn't necessary but will save some compilation time)1.23.0-x86_64-pc-windows-msvc
, runcargo bench --bench my_benchmark -- --verbose
.rustc_fix
branch of https://github.com/japaric/criterion.rs and modify the Cargo.toml file ofcriterion-test.rs
to use that instead. This patch merely enables thetest
feature/crate and inserts one call totest::black_box
inroutine.rs:warm_up
. Verify that the expected behavior is restored.The text was updated successfully, but these errors were encountered: