Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using ManuallyDrop causes allocas and memcpys that LLVM cannot remove #79914

Open
scottmcm opened this issue Dec 10, 2020 · 5 comments
Open

Using ManuallyDrop causes allocas and memcpys that LLVM cannot remove #79914

scottmcm opened this issue Dec 10, 2020 · 5 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@scottmcm
Copy link
Member

scottmcm commented Dec 10, 2020

It's great that ManuallyDrop helps people do the right thing by pre-leaking things they plan to move elsewhere, helping to avoid double-free issues.

But unfortunately ManuallyDrop::new gets codegened as copying the whole thing into a local variable, and LLVM frequently cannot remove it. This is particularly frustrating for large types where the ABI passes them by pointer, as they get copied from that pointer to the stack (which, in addition to the copy, means more stack manipulation and __rust_probestack call than necessary). Demo: https://rust.godbolt.org/z/Y8o7MG

#![feature(maybe_uninit_extra)]
#![feature(min_const_generics)]
use std::mem::{ManuallyDrop, MaybeUninit};
fn array_map<T, U, const N: usize>(a: [T; N], f: impl Fn(T) -> U) -> [U;N]
{
    // For this demo, always just leak on panic.
    // Obviously a full implementation should do more,
    // but this is sufficient to show the problem.
    let a = ManuallyDrop::new(a);
    union Foo<V, const M: usize> {
        uninit: (),
        partial: ManuallyDrop<[MaybeUninit<V>; M]>,
        done: ManuallyDrop<[V; M]>,
    }
    let mut result = Foo::<U, N> { uninit: () };
    unsafe {
        for i in 0..N {
            (*result.partial)[i].write(f(std::ptr::read(&a[i])));
        }
        ManuallyDrop::into_inner(result.done)
    }
}

pub fn demo(a: [u32; 400]) -> [i64; 400] {
    array_map(a, From::from)
}

Note in particular the following (generated with nightly 2020-12-07 in the godbolt link):

define void @_ZN7example4demo17hf199b6e7768bfe59E([400 x i64]* noalias nocapture sret dereferenceable(3200) %0, [400 x i32]* noalias nocapture readonly dereferenceable(1600) %a) unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality !dbg !6 {
start:
  %result.i = alloca %"array_map::Foo<i64, 400_usize>", align 8
  %a1.i = alloca %"std::mem::ManuallyDrop<[u32; 400]>", align 4

There's no good reason for either of those locals, and this is comparatively easy mode -- panic=abort and the monomorphization is only using Copy types that definitely don't need dropping.

At the ABI level this is fn demo(*mut [i64; 400], *const [u32; 400]), so there ought to be a way to write this function in Rust such that it would produce the obvious no-stack-needed implementation of out[i] = static_cast<_>(in[i]);. But I can't find one.

Undeveloped musings on possible fixes:

  • Don't lower ManuallyDrop as its own type in codegen (it's already a lang item anyway)
  • Write a mir-opt to make the wrapper disappear (since it's transparent anyway and in MIR drops are explicit so the type difference is unimportant)
  • Create some sort of forget-in-place intrinsic that just suppresses the drop without moving anything or invalidating the binding
  • Figure out why LLVM can't optimize this away and fix that
  • ...

Edit: FWIW, -Z mir-opt-level=3 -Z unsound-mir-opts=yes doesn't fix this either (as of 2020-12-12)


I started looking into this as part of figuring out what was happening in #75571

Finally decided to open this after this CI failure (which I'd made sure worked locally), https://github.com/rust-lang/rust/runs/1532668348

/checkout/src/test/codegen/vec-extend_from_array.rs:12:16: error: CHECK-NOT: excluded string found in input
 // CHECK-NOT: alloca
               ^
/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/vec-extend_from_array/vec-extend_from_array.ll:128:14: note: found here
 %array1.i = alloca %"std::mem::ManuallyDrop<[std::string::String; 400]>", align 8
             ^~~~~~

MaybeUninit has similar issues, but that might be covered by #61011

@scottmcm scottmcm added the C-bug Category: This is a bug. label Dec 10, 2020
@scottmcm
Copy link
Member Author

I'm happy to put some work into fixing this, if people can suggest what their preferred approach would be.

@jyn514
Copy link
Member

jyn514 commented Dec 10, 2020

@scottmcm in this example, could you use drop_in_place?

@jyn514 jyn514 added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Dec 10, 2020
@scottmcm
Copy link
Member Author

@jyn514 But that would drop it -- what I want here is more like forget_in_place.

@cynecx
Copy link
Contributor

cynecx commented Dec 12, 2020

I may have opened a duplicate issue for the same issue? #79754

@scottmcm
Copy link
Member Author

scottmcm commented Apr 8, 2024

Triage: looks like this is still happening on nightly 2024-04-05.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

3 participants