-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Safe Rust code miscompilation due to a bug in LLVM's Global Value Numbering #45839
Comments
Conservatively tagging this as I-unsound, feel free to downgrade to I-wrong if necessary. |
We discussed this in the @rust-lang/compiler team meeting and we are thinking of closing this bug. It doesn't make sense to have a bug for every LLVM bug, and this one hasn't been observed "in the wild" for Rust code, only for artificial examples, so it's probably not adding much value. @RalfJung thanks for bringing it to our attention. If LLVM decides not to fix, we may want to re-open, though it's unclear if there is much we can practically do to prevent this sort of optimization. |
Fine for me.
I'd say this depends on their reasoning. If they somehow introduce a rule that this ought to be UB, we have a problem -- I don't see any good way to rule out such safe code. I would also be surprised if it was not possible to expand this miscompilation to trigger actual UB in safe Rust (notwithstanding the fact that miscompilations giving the wrong results in safe code aren't really less bad than UB in safe code). |
FWIW, we could fix this by replacing the #[bench]
fn current(b: &mut Bencher) {
let mut v = Vec::with_capacity(512);
let mut r = 0u32;
v.clear();
for _ in 0..512 {
r = r.wrapping_mul(1664525).wrapping_add(1013904223);
v.push(r);
}
b.iter(|| {
let v = black_box(&v);
let mut p = v.as_ptr();
let end = v.as_ptr().wrapping_offset(v.len() as isize);
let mut sum = 0;
while p != end {
sum += unsafe { *p };
p = p.wrapping_offset(1);
}
black_box(sum)
});
} goes from 23ns/iter up to 187ns/iter . Investigating the asm (playground), it looks like it inhibits auto-vectorisation on: pub trait WrappingOffsetExt {
fn wrapping_offset_ext(self, offset: isize) -> Self;
}
impl<T> WrappingOffsetExt for *const T {
fn wrapping_offset_ext(self, offset: isize) -> Self {
((self as isize).wrapping_add(offset * (::std::mem::size_of::<T>() as isize)) as *const T)
}
}
#[no_mangle]
pub fn via_wrapping_offset(v: &[u32]) -> u32 {
let mut p = v.as_ptr();
let end = v.as_ptr().wrapping_offset(v.len() as isize);
let mut sum = 0;
while p != end {
sum += unsafe { *p };
p = p.wrapping_offset(1);
}
sum
}
#[no_mangle]
pub fn via_integer(v: &[u32]) -> u32 {
let mut p = v.as_ptr();
let end = v.as_ptr().wrapping_offset_ext(v.len() as isize);
let mut sum = 0;
while p != end {
sum += unsafe { *p };
p = p.wrapping_offset_ext(1);
}
sum
}
fn main() {} Edit: Fixed lack of |
That's what I mentioned in my previous commit; however, your patch is slightly wrong: You need to multiply |
Not sure which commit that, I must have missed it, sorry.
D'oh, sorry. Numbers and conclusions stay exactly the same. @eddyb |
@eddyb thanks for the pointer to #25434. It seems that |
Can I ask if you (Rust compiler) folks have looked over the emitted IR for this one super carefully to make sure it doesn't contain anything that gives LLVM an easy excuse to invoke UB? |
Obviously, the entire reason there is UB here is because LLVM's semantics are not clear enough. However, the 2 "known" exploits of this bug rely either on out-of-bounds |
@arielb1 I was trying to politely ask if y'all are completely sure the Rust compiler isn't doing something stupid. |
@regehr Yes; I've looked over the IR. Also, there is a C version of the same bug reported here which doesn't appear to invoke any undefined behavior either. The underlying problem appears to replacing if (p == q) {
*p = 42;
} In LLVM, In C terms, the equality comparison may succeed when "one [operand] is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space" (6.5.9p6), but the dereference of the one-past-the-end pointer isn't valid despite being equal to a pointer which would have a valid dereference, because "If the result [of pointer arithmetic] points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated." (6.5.6p8). |
Thanks @sunfishcode! This stuff is super difficult and I wanted some reassurance this is a hard bug and not an easy bug :) |
Another interesting question is if we can use this LVLM bug to trigger UB in safe Rust code. So far it seems we have a gadget to "replace" one value of a given type by another; if we could apply that selectively so that the during bounds checks, the index is replaced when checking but not when accessing, we would get UB. Not sure if that is possible though, because the Rust compiler is emitting both of these together. |
The LLVM bug seems to shows a program which has UB in C. In In Rust, I.e. why is wrapping_offset safe on pointers when it's potentially undefined in every other systems language? |
@ahmedcharles the result of (int)p has type int and it obeys integer rules, not pointer rules. adding 1 would only be UB if (int)p had evaluated to INT_MAX |
That's not correct. Whether the resulting pointer may be used to access memory is a different question, also see the updated documentation for this method. |
lib/src.rs
in thefoo
crate:src/main.rs
in the main crate:This prints
b = 1, x = 7777
. That is wrong; ifb
is1
, it must be the case that the then-branch in theq as *const _ == p
conditional was taken, sor
is&mut x
(sinceb2
istrue
), sox
cannot still be7777
.This is a bug in LLVM's GVN, see https://bugs.llvm.org/show_bug.cgi?id=35229 for the upstream report and some further analysis.
Original C test case by Gil Hur.
The text was updated successfully, but these errors were encountered: