Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ICE][miscompile] segv with combined atomic operation and nonnull pointer arithmetic in incremental build #123695

Closed
SchrodingerZhu opened this issue Apr 9, 2024 · 10 comments · Fixed by #125288
Assignees
Labels
A-codegen Area: Code generation A-incr-comp Area: Incremental compilation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@SchrodingerZhu
Copy link

SchrodingerZhu commented Apr 9, 2024

This bug is spotted jointly by @oxalica @Enter-tainer and @QuarticCat.

We tried this code (minimized by @QuarticCat and @oxalica):

// compile and run with: rustc -Copt-level=2 '-Cincremental=./inc' bug.rs -o bug && ./bug 
extern crate core;
use core::marker::PhantomData;
use core::mem;
use core::ptr::{null_mut, NonNull};
use core::sync::atomic::{AtomicUsize, Ordering::*};

struct Header {
    refs: AtomicUsize,
}

struct EcoVec<T> {
    ptr: NonNull<T>,
    phantom: PhantomData<T>,
}

impl<T> EcoVec<T> {
    fn new() -> Self {
        Self {
            ptr: Self::dangling(),
            phantom: PhantomData,
        }
    }

    fn offset() -> usize {
        mem::size_of::<Header>().max(mem::align_of::<T>())
    }

    fn dangling() -> NonNull<T> {
        let ptr = null_mut::<T>().wrapping_byte_add(Self::offset());
        unsafe { NonNull::new_unchecked(ptr) }
    }

    fn header(&self) -> Option<&Header> {
        let ptr = self.ptr.as_ptr().wrapping_byte_sub(Self::offset());
        (self.ptr != Self::dangling()).then(|| unsafe { &*ptr.cast() })
    }
}

impl<T> Drop for EcoVec<T> {
    fn drop(&mut self) {
        self.header()
            .map(|header| header.refs.fetch_sub(1, Release) != 1);
    }
}

struct Foo {
    _foo: Option<EcoVec<Foo>>,
}

fn main() {
    drop(EcoVec::<Foo>::new());
}

I expected to see this happen: The program should exit without error.

Instead, this happened: segmentation fault.

  • Notice that the code will pass cargo miri run. (miri 0.1.0 (9d5cdf7 2024-04-07)).
  • The error is reproducible with opt-level larger than 0.
  • The error seems to be incremental related, as rustc -Copt-level=3 bug.rs -o bug does not reproduce the bug.
  • There seems to be no bug in LLVM IR.
rustc -Copt-level=3 '-Cincremental=./inc' --emit=llvm-ir bug.rs -o bug.ll
clang -O3 bug.ll /home/schrodingerzy/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/libstd-6e3078511c661ac3.so
./a.out # exit normally
  • @oxalica noticed that cargo asm produces different asm compared with the one in the ELF output.
  • Notice that we encountered an ICE while exploring the code (for now, we are not sure whether the ICE is related to SEGV, we can file a separate report for ICE if needed). Basically, invoking the rustc consecutively, one will get (cleaning up the inc directory can help):
~/Documents via C v13.2.1-gcc via 🐍 v3.11.8 via 🦀 v1.79.0-nightly via t v0.11.0 
❯ rustc -Copt-level=3 '-Cincremental=./inc' --emit=llvm-ir bug.rs -o bug.ll

~/Documents via C v13.2.1-gcc via 🐍 v3.11.8 via 🦀 v1.79.0-nightly via t v0.11.0 
❯ rustc -Copt-level=3 '-Cincremental=./inc' --emit=llvm-ir bug.rs -o bug.ll
thread 'cpy 455z5hp9q7fv9wsp' panicked at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/compiler/rustc_codegen_ssa/src/back/write.rs:910:5:
assertion failed: module_config.emit_obj != EmitObj::None
stack backtrace:
   0:     0x740cfe589885 - std::backtrace_rs::backtrace::libunwind::trace::hc79cced6f418596d
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/../../backtrace/src/backtrace/libunwind.rs:105:5
   1:     0x740cfe589885 - std::backtrace_rs::backtrace::trace_unsynchronized::h06f3eef6c8a22cf0
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x740cfe589885 - std::sys_common::backtrace::_print_fmt::hba273d0c77fc3421
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/sys_common/backtrace.rs:68:5
   3:     0x740cfe589885 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h409f1e3c1e32650e
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x740cfe5d8b4b - core::fmt::rt::Argument::fmt::h8811fe3c91cda7b3
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/core/src/fmt/rt.rs:142:9
   5:     0x740cfe5d8b4b - core::fmt::write::h7a8f70a9b146d9ee
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/core/src/fmt/mod.rs:1153:17
   6:     0x740cfe57e3ff - std::io::Write::write_fmt::h32fb818656611c58
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/io/mod.rs:1843:15
   7:     0x740cfe58965e - std::sys_common::backtrace::_print::h0dc0bbf9b429a58b
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x740cfe58965e - std::sys_common::backtrace::print::hf60182bd4aee207d
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x740cfe58c159 - std::panicking::default_hook::{{closure}}::hd90db44a41f772dc
  10:     0x740cfe58be75 - std::panicking::default_hook::hd86be16b87521210
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/panicking.rs:291:9
  11:     0x740cfae7f71e - std[2c420855e262d42e]::panicking::update_hook::<alloc[56e68475af91072d]::boxed::Box<rustc_driver_impl[10b0a2e726609308]::install_ice_hook::{closure#0}>>::{closure#0}
  12:     0x740cfe58c85c - <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::h2fe2a6e53d9884ad
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/alloc/src/boxed.rs:2032:9
  13:     0x740cfe58c85c - std::panicking::rust_panic_with_hook::ha4f8caa112a16574
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/panicking.rs:792:13
  14:     0x740cfe58c5cd - std::panicking::begin_panic_handler::{{closure}}::hc879855deab44ed0
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/panicking.rs:649:13
  15:     0x740cfe589d49 - std::sys_common::backtrace::__rust_end_short_backtrace::h85e59f289fdfff6c
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/sys_common/backtrace.rs:171:18
  16:     0x740cfe58c337 - rust_begin_unwind
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/panicking.rs:645:5
  17:     0x740cfe5d4fe6 - core::panicking::panic_fmt::h0baef2c59e253f8d
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/core/src/panicking.rs:72:14
  18:     0x740cfe5d508f - core::panicking::panic::h03522a15492dd08d
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/core/src/panicking.rs:141:5
  19:     0x740cfcf3dc87 - std[2c420855e262d42e]::sys_common::backtrace::__rust_begin_short_backtrace::<<rustc_codegen_llvm[6954383712d6b8a]::LlvmCodegenBackend as rustc_codegen_ssa[d592042e500eb30]::traits::backend::ExtraBackendMethods>::spawn_named_thread<rustc_codegen_ssa[d592042e500eb30]::back::write::spawn_work<rustc_codegen_llvm[6954383712d6b8a]::LlvmCodegenBackend>::{closure#0}, ()>::{closure#0}, ()>
  20:     0x740cfcf3cde7 - <<std[2c420855e262d42e]::thread::Builder>::spawn_unchecked_<<rustc_codegen_llvm[6954383712d6b8a]::LlvmCodegenBackend as rustc_codegen_ssa[d592042e500eb30]::traits::backend::ExtraBackendMethods>::spawn_named_thread<rustc_codegen_ssa[d592042e500eb30]::back::write::spawn_work<rustc_codegen_llvm[6954383712d6b8a]::LlvmCodegenBackend>::{closure#0}, ()>::{closure#0}, ()>::{closure#1} as core[a43033d72a2ad972]::ops::function::FnOnce<()>>::call_once::{shim:vtable#0}
  21:     0x740cfe59628b - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h5cf039e566d31df2
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/alloc/src/boxed.rs:2018:9
  22:     0x740cfe59628b - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h5b8a7e7667fbf80b
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/alloc/src/boxed.rs:2018:9
  23:     0x740cfe59628b - std::sys::pal::unix::thread::Thread::new::thread_start::h47ad6cb551091e6a
                               at /rustc/9d5cdf75aa42faaf0b58ba21a510117e8d0051a3/library/std/src/sys/pal/unix/thread.rs:108:17
  24:     0x740cf7c92b9d - start_thread
                               at /usr/src/debug/glibc/glibc/nptl/pthread_create.c:447:8
  25:     0x740cf7d1dcdc - clone3
  26:                0x0 - <unknown>

Meta

Both nightly and stable versions are affected
rustc --version --verbose:

rustc 1.77.1 (7cf61ebde 2024-03-27)
binary: rustc
commit-hash: 7cf61ebde7b22796c69757901dd346d0fe70bd97
commit-date: 2024-03-27
host: x86_64-unknown-linux-gnu
release: 1.77.1
LLVM version: 17.0.6
rustc 1.79.0-nightly (9d5cdf75a 2024-04-07)
binary: rustc
commit-hash: 9d5cdf75aa42faaf0b58ba21a510117e8d0051a3
commit-date: 2024-04-07
host: x86_64-unknown-linux-gnu
release: 1.79.0-nightly
LLVM version: 18.1.3
@SchrodingerZhu SchrodingerZhu added the C-bug Category: This is a bug. label Apr 9, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 9, 2024
@matthiaskrgr
Copy link
Member

may be running into #123234

@lqd
Copy link
Member

lqd commented Apr 10, 2024

This started on stable in 1.73, and bisects to nightly-2023-08-09:

A couple notes, when using cargo:

  • incremental compilation is not enabled on release builds
  • for people who opt into optimizing dependencies even in debug builds: incremental compilation is not enabled for dependencies either, only local crates

It would be interesting to see exactly what impacts codegen here when enabling incremental compilation of a single session (but at the same time, it seems probable that not many users should encounter this issue in the more common configurations?).

@jieyouxu jieyouxu added A-codegen Area: Code generation T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-incr-comp Area: Incremental compilation E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Apr 15, 2024
pacak added a commit to pacak/rust that referenced this issue Apr 16, 2024
This pull request partially reverts changes from e16c3b4

Original motivation for this assert was described with "A WorkProduct without a saved file is useless"
which was true at the time but now it is possible to have work products with other types of files
(llvm-ir, asm, etc) and there are bugreports for this failure:

For example: rust-lang#123695

Fixes rust-lang#123234

Now existing `assert` and `.unwrap_or_else` are unified into a single
check that emits slightly more user friendly error message if an object
files was meant to be produced but it's missing
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Apr 16, 2024
Allow workproducts without object files.

This pull request partially reverts changes from e16c3b4

Original motivation for this assert was described with "A WorkProduct without a saved file is useless"
which was true at the time but now it is possible to have work products with other types of files
(llvm-ir, asm, etc) and there are bugreports for this failure:

For example: rust-lang#123695

Fixes rust-lang#123234

Now existing `assert` and `.unwrap_or_else` are unified into a single
check that emits slightly more user friendly error message if an object
files was meant to be produced but it's missing
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Apr 16, 2024
Rollup merge of rust-lang#124023 - pacak:less-splody, r=jieyouxu

Allow workproducts without object files.

This pull request partially reverts changes from e16c3b4

Original motivation for this assert was described with "A WorkProduct without a saved file is useless"
which was true at the time but now it is possible to have work products with other types of files
(llvm-ir, asm, etc) and there are bugreports for this failure:

For example: rust-lang#123695

Fixes rust-lang#123234

Now existing `assert` and `.unwrap_or_else` are unified into a single
check that emits slightly more user friendly error message if an object
files was meant to be produced but it's missing
@SchrodingerZhu
Copy link
Author

Some additional context: the code is largely related to ecow.

In case people compile typst/ecow related code with incremental build and level 1 optimization enabled and run into segmentation fault, this should be the bug behind it.

@RalfJung
Copy link
Member

RalfJung commented May 6, 2024

So the error happens just from adding -Cincremental=./inc', without there actually being a previous incremental cache there? Wow.

Cc @rust-lang/wg-incr-comp

Seems likely that the LLVM 17 update is involved. Cc @nikic

@michaelwoerister
Copy link
Member

So the error happens just from adding -Cincremental=./inc', without there actually being a previous incremental cache there?

I can confirm that this reproduces with an empty cache -- which should be good news in terms of debugging the issue.

I'm pretty sure this is a ThinLTO-related bug and incremental somehow configures CGUs and LLVM settings in a way that triggers the bug. The segfault does not reproduce when compiling with -Copt-level=2 -Cincremental=./inc -Clto=off.

@michaelwoerister
Copy link
Member

Running the miscompiled binary in gdb (with -Cdebuginfo=2) gives:

Program received signal SIGSEGV, Segmentation fault.
main::main () at main.rs:52
52          drop(EcoVec::<Foo>::new());

@nikic
Copy link
Contributor

nikic commented May 6, 2024

$ llvm-dis < bug.307i3fz7peiqe4ag.rcgu.bc 

[...]
define internal void @_ZN3bug4main17h08d7a889ae812856E() unnamed_addr #0 personality ptr @rust_eh_personality {
  %1 = atomicrmw sub ptr null, i64 1 release, align 8, !noalias !5
  ret void
}

Yeah, that would crash...

@nikic
Copy link
Contributor

nikic commented May 6, 2024

I think this is a minimal reproducer: https://llvm.godbolt.org/z/49Kbq15xn

define ptr @test(ptr nonnull %arg) {
  %res = getelementptr i8, ptr %arg, i64 -8
  ret ptr %res
}

We infer that the result of this function is nonnull...

@nikic nikic self-assigned this May 6, 2024
@nikic nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness labels May 6, 2024
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label May 6, 2024
@nikic
Copy link
Contributor

nikic commented May 6, 2024

Upstream issue: llvm/llvm-project#91177

@apiraino
Copy link
Contributor

apiraino commented May 7, 2024

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-high

@rustbot rustbot added P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels May 7, 2024
@nikic nikic added the llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade label May 13, 2024
@bors bors closed this as completed in 20483b6 May 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-incr-comp Area: Incremental compilation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants