Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wasm32 miscompilation when using u128 with multivalue and optimizations #127318

Open
arriven opened this issue Jul 4, 2024 · 7 comments
Open
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-external-bug Category: issue that is caused by bugs in software beyond our control I-miscompile Issue: Correct Rust code lowers to incorrect machine code I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-wasm Target: WASM (WebAssembly), http://webassembly.org/ P-high High priority S-has-mcve Status: A Minimal Complete and Verifiable Example has been found for this issue T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@arriven
Copy link

arriven commented Jul 4, 2024

I am getting a miscompiled wasm that incorrectly performs arithmetic operations on u128 and Option<u128> values. Based on my (poor) analysis of MIR, LLVM IR, and resulting wasm the issue seems to be on llvm side but it would be best for someone more competent to double-check that;

I managed to get the reproduction down to this code:

#[inline(never)]
fn get_total_opt(a: u32, b: u32) -> Option<u128> {
    if a > 100 {
        Some(a as u128 + b as u128)
    } else {
        None
    }
}

#[inline(never)]
fn calculate(a: u32, b: u32, diff: u128) -> u128 {
    let total_opt = get_total_opt(a, b);
    // uncommenting any of these lines fixes the issue
    // println!("{}", total_opt.is_some());
    // assert!(total_opt.is_some());
    total_opt.unwrap_or_default() + diff
}

fn main() {
    let mut input_line = std::string::String::new();
    std::io::stdin()
        .read_line(&mut input_line)
        .expect("Failed to read line");
    let x: u32 = input_line.trim().parse().expect("Input not an integer");
    let mut input_line = std::string::String::new();
    std::io::stdin()
        .read_line(&mut input_line)
        .expect("Failed to read line");
    let diff: u128 = input_line.trim().parse().expect("Input not an integer");
    println!("{:?}", calculate(x, x, diff));
}

Compiled with

rustc wasmmain.rs --target wasm32-wasi -C "target-feature=+multivalue" -C opt-level=s

And launched with wasmtime

I expected to see this happen: when entering two values above 100 (i.e. 111 and 111) I expect to see correct result (i.e. 333)

Instead, this happened:

# wasmtime wasmmain.wasm 
111
111
111

The behavior seems to only be exhibited in the calculate function; replacing unwrap_or_default with unwrap or adding a side-effect between the call to get_total_opt and the calculation seems to fix the issue. All of the changes to produced wasm are residing in the calculate function (if we exclude offsets changes)

Also disabling the optimizations (opt-level=0) or turning off the multivalue feature both address the issue as well. The issue also happens on wasm32-unknown-unknown target but I don't have as clean of a setup to showcase that

Emitted MIR
fn calculate(_1: u32, _2: u32, _3: u128) -> u128 {
    debug a => _1;
    debug b => _2;
    debug diff => _3;
    let mut _0: u128;
    let _4: std::option::Option<u128>;
    let mut _5: u128;
    scope 1 {
        debug total_opt => _4;
    }

    bb0: {
        _4 = get_total_opt(move _1, move _2) -> [return: bb1, unwind unreachable];
    }

    bb1: {
        StorageLive(_5);
        _5 = Option::<u128>::unwrap_or_default(move _4) -> [return: bb2, unwind unreachable];
    }

    bb2: {
        _0 = Add(move _5, _3);
        StorageDead(_5);
        return;
    }
}
Emitted llvm-ir
; wasmmain::calculate
; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind optsize willreturn memory(none)
define internal fastcc noundef i128 @_ZN8wasmmain9calculate17hd962bed4287d8d71E(i32 noundef %a, i32 noundef %b, i128 noundef %diff) unnamed_addr #5 {
start:
; call wasmmain::get_total_opt
  %0 = tail call fastcc { i64, i128 } @_ZN8wasmmain13get_total_opt17h693d6b0c9dfa9b41E(i32 noundef %a, i32 noundef %b) #12
  %total_opt.0 = extractvalue { i64, i128 } %0, 0
  %total_opt.1 = extractvalue { i64, i128 } %0, 1
  %1 = and i64 %total_opt.0, 4294967295
  %switch.i = icmp eq i64 %1, 0
  %spec.select.i = select i1 %switch.i, i128 0, i128 %total_opt.1
  %_0 = add i128 %spec.select.i, %diff
  ret i128 %_0
}
Emitted wasm (converted to wat with wasm2wat)
(func $_ZN8wasmmain9calculate17hd962bed4287d8d71E (type 9) (param i32 i32) (result i64 i64)
    (local i64 i64 i64 i64)
    local.get 0
    local.get 1
    call $_ZN8wasmmain13get_total_opt17h693d6b0c9dfa9b41E
    local.set 4
    local.set 3
    local.set 2
    local.get 3
    i64.const 100
    i64.add
    local.tee 5
    i64.const 100
    local.get 2
    i32.wrap_i64
    local.tee 1
    select
    local.get 4
    local.get 5
    local.get 3
    i64.lt_u
    i64.extend_i32_u
    i64.add
    i64.const 0
    local.get 1
    select)

Both MIR and llvm-ir seem correct and with some wasm2c and simplification it seems like the resulting wasm is unconditionally returning the p2 and then conditionally returning either p3 or p3+corresponding u64 of the Option<u128> and I was able to confirm that with this modified program

Modified code

Code:

#[inline(never)]
fn get_total_opt(a: u32, b: u32) -> Option<u128> {
    if a > 100 {
        Some(a as u128 + b as u128 + u64::MAX as u128)
    } else {
        None
    }
}

#[inline(never)]
fn calculate(a: u32, b: u32, diff: u128) -> u128 {
    let total_opt = get_total_opt(a, b);
    // uncommenting any of these lines fixes the issue
    // println!("{}", total_opt.is_some());
    // assert!(total_opt.is_some());
    total_opt.unwrap_or_default() + diff
}

fn main() {
    let mut input_line = std::string::String::new();
    std::io::stdin()
        .read_line(&mut input_line)
        .expect("Failed to read line");
    let x: u32 = input_line.trim().parse().expect("Input not an integer");
    let mut input_line = std::string::String::new();
    std::io::stdin()
        .read_line(&mut input_line)
        .expect("Failed to read line");
    let diff: u128 = input_line.trim().parse().expect("Input not an integer");
    println!("{:?}", calculate(x, x, diff + u64::MAX as u128));
    println!("{:?}", x as u128 + x as u128 + diff + u64::MAX as u128 + u64::MAX as u128);
}

Output:

111
111
36893488147419103342
36893488147419103563

Meta

rustc --version --verbose:

rustc 1.78.0 (9b00956e5 2024-04-29)
binary: rustc
commit-hash: 9b00956e56009bab2aa15d7bff10916599e3d6d6
commit-date: 2024-04-29
host: x86_64-unknown-linux-gnu
release: 1.78.0
LLVM version: 18.1.2

rustc +nightly --version --verbose:

rustc 1.81.0-nightly (aa1d4f682 2024-07-03)
binary: rustc
commit-hash: aa1d4f6826de006b02fed31a718ce4f674203721
commit-date: 2024-07-03
host: x86_64-unknown-linux-gnu
release: 1.81.0-nightly
LLVM version: 18.1.7
@arriven arriven added the C-bug Category: This is a bug. label Jul 4, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jul 4, 2024
@bjorn3 bjorn3 added the O-wasm Target: WASM (WebAssembly), http://webassembly.org/ label Jul 4, 2024
@beetrees
Copy link
Contributor

beetrees commented Jul 4, 2024

Reduced:

#[inline(never)]
fn get_total_opt() -> (u64, u64) {
    if std::hint::black_box(true) {
        (1, 7)
    } else {
        (0, 5)
    }
}

#[inline(never)]
fn calculate() -> u64 {
    let (flag, val) = get_total_opt();
    if flag != 0 {
        val
    } else {
        42
    }
}

fn main() {
    dbg!(calculate());
}

This should print [code.rs:19:5] calculate() = 7 but when compiled with rustc --target wasm32-wasip1 -O -Ctarget-feature=+multivalue prints [src/main.rs:19:5] calculate() = 0.

Edit: further reduced (u128 and enums no longer required).

@beetrees
Copy link
Contributor

beetrees commented Jul 4, 2024

Looking at the WASM (compiler explorer), the LLVM miscompilation appears to be related to locals. The WASM for the reduced calculate function is

example::calculate::h0788ad06022fa2ed:
        i64.const       42
        local.get       0
        call    example::get_total_opt::hb7b536b220445017
        local.set       0
        i64.eqz
        i64.select
        end_function

The i64.select ends up choosing between the results of i64.const 42 and local.get 0. The correct result is stored in local 0 with local.set 0, however it is stored after local 0 is loaded onto the stack ready for the i64.select. As the local is loaded before any value is stored in it, the local.get 0 results in the default local value of 0, which ends up being returned.

@veera-sivarajan
Copy link
Contributor

@rustbot label -needs-triage +A-LLVM +T-compiler +I-unsound

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-prioritize Issue: Indicates that prioritization has been requested for this issue. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Jul 5, 2024
@apiraino
Copy link
Contributor

apiraino commented Jul 8, 2024

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-high

@rustbot rustbot added P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jul 8, 2024
@apiraino
Copy link
Contributor

apiraino commented Jul 8, 2024

@alexcrichton do you have any insights about the state of the target feature multivalue on WASM?

@alexcrichton
Copy link
Member

My impression is that multivalue in the LLVM backend "in theory works" but is not that heavily tested. I believe the maintainers are interested in cataloguing bugs but I'm not sure if they're interested in actively fixing issues. The best thing to do with issues like this is to remove Rust from the equation and get a reproduction with just LLVM IR. That could then be submitted as an upstream LLVM bug report.

@nikic
Copy link
Contributor

nikic commented Jul 10, 2024

Upstream issue: llvm/llvm-project#98323

@workingjubilee workingjubilee added the I-miscompile Issue: Correct Rust code lowers to incorrect machine code label Oct 2, 2024
@jieyouxu jieyouxu added S-has-mcve Status: A Minimal Complete and Verifiable Example has been found for this issue C-external-bug Category: issue that is caused by bugs in software beyond our control and removed C-bug Category: This is a bug. labels Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-external-bug Category: issue that is caused by bugs in software beyond our control I-miscompile Issue: Correct Rust code lowers to incorrect machine code I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-wasm Target: WASM (WebAssembly), http://webassembly.org/ P-high High priority S-has-mcve Status: A Minimal Complete and Verifiable Example has been found for this issue T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants