Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] master from rust-lang:master #8

Merged
merged 80 commits into from
Sep 9, 2024
Merged

[pull] master from rust-lang:master #8

merged 80 commits into from
Sep 9, 2024

Conversation

pull[bot]
Copy link

@pull pull bot commented Sep 7, 2024

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

onur-ozkan and others added 30 commits August 9, 2024 12:22
Calling `Cargo::configure_linker` unconditionally slows down certain
commands (e.g., "check" command) without providing any benefit.

Signed-off-by: onur-ozkan <work@onurozkan.dev>
For commands like check/clean there is no need to check for target tools.
Avoiding this check can also speed up the process.

Signed-off-by: onur-ozkan <work@onurozkan.dev>
This example triggers an assertion failure:
```
fn f() -> u32 {
    #[cfg_eval] #[cfg(not(FALSE))] 0
}
```
The sequence of events:
- `configure_annotatable` calls `parse_expr_force_collect`, which calls
  `collect_tokens`.
- Within that, we end up in `parse_expr_dot_or_call`, which again calls
  `collect_tokens`.
  - The return value of the `f` call is the expression `0`.
  - This inner call collects tokens for `0` (parser range 10..11) and
    creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range
    0..11).
- We return to the outer `collect_tokens` call. The return value of the
  `f` call is *again* the expression `0`, again with the range 10..11,
  but the replacement from earlier covers the range 0..11. The code
  mistakenly assumes that any attributes from an inner `collect_tokens`
  call fit entirely within the body of the result of an outer
  `collect_tokens` call. So it adjusts the replacement parser range
  0..11 to a node range by subtracting 10, resulting in -10..1. This is
  an invalid range and triggers an assertion failure.

It's tricky to follow, but basically things get complicated when an AST
node is returned from an inner `collect_tokens` call and then returned
again from an outer `collect_token` node without being wrapped in any
kind of additional layer.

This commit changes `collect_tokens` to return early in some extra cases,
avoiding the construction of lazy tokens. In the example above, the
outer `collect_tokens` returns earlier because the `0` token already has
tokens and `self.capture_state.capturing` is `Capturing::No`. This early
return avoids the creation of the invalid range and the assertion
failure.

Fixes #129166. Note: these invalid ranges have been happening for a long
time. #128725 looks like it's at fault only because it introduced the
assertion that catches the invalid ranges.
In a case like this:
```
mod a {
    mod b {
        #[cfg_attr(unix, inline)]
        fn f() {
            #[cfg_attr(linux, inline)]
            fn g1() {}
            #[cfg_attr(linux, inline)]
            fn g2() {}
        }
    }
}
```
We currently end up with the following replacement ranges.
- The lazy tokens for `f` has replacement ranges for `g1` and `g2`.
- The lazy tokens for `a` has replacement ranges for `f`, `g1`, and
  `g2`.

I.e. the replacement ranges for `g1` and `g2` are duplicated. In
general, replacement ranges for inner AST nodes are duplicated up the
chain for each nested `collect_tokens` call. And the code that processes
the replacements is careful about the ordering in which the replacements
are applied, to ensure that inner replacements are applied before outer
replacements.

But all of this is unnecessary. If you apply an inner replacement and
then an outer replacement, the outer replacement completely overwrites
the inner replacement.

This commit avoids the duplication by removing replacements from
`self.capture_state.parser_replacements` when they are used. (The effect
on the example above is that the lazy tokesn for `a` no longer include
replacement ranges for `g1` and `g2`.) This eliminates the possibility
of nested replacements on individual AST nodes, which avoids the need
for careful ordering of replacements.
- Trim some unnecessary fat from the type declaration.
- Add another attribute, to make it a stronger test of `cfg_attr`
  processing. Note that the current output is incorrect, because it
  duplicates the added attribute. The next commit will fix this.
By keeping track of attributes that have been previously processed.

This fixes the `macro-rules-derive-cfg.stdout` test, and is necessary
for #124141 which removes nonterminals.

Also shrink the `SmallVec` inline size used in `IntervalSet`. 2 gives
slightly better perf than 4 now that there's an `IntervalSet` in
`Parser`, which is cloned reasonably often.
Use `Cow` to avoid cloning `ret.attrs()` unless necessary. This requires
moving some things around to satisfy the borrow checker.
also fixes a discrepency where the rust side doesn't use -L

must not be merged before #129134

docs are only on the rust side, since duplicated prose
has a tendancy to get out-of-sync, and also because there
are talks of removing the python script all together eventually.
…function merging

Resolves #129438

The `-Zmerge-functions=disabled` compile flag exists for this purpose.
... can not be correctly gated using #[cfg] macro
…x, FreeBSD)

The developer experience for panics is to provide the backtrace and
exit the program. When running under debugger, that might be improved
by breaking into the debugger once the code panics thus enabling
the developer to examine the program state at the exact time when
the code panicked.
Let the developer catch the panic in the debugger if it is attached.
If the debugger is not attached, nothing changes. Providing this feature
inside the standard library facilitates better debugging experience.

Validated under Windows, Linux, macOS 14.6, and FreeBSD 13.3..14.1.
compiler-errors and others added 22 commits September 7, 2024 14:21
…idtwco

Don't store region in `CapturedPlace`

It's not necessary anymore, since we erase all regions in writeback anyways.
bypass linker configuration and cross target check for specific commands

Avoids configuring the linker and checking cross-target-specific tools unless necessary.

Resolves #128180

cc `@ChrisDenton`
…rors

Rollup of 9 pull requests

Successful merges:

 - #128871 (bypass linker configuration and cross target check for specific commands)
 - #129468 ([testsuite][cleanup] Remove all usages of `dont_merge` hack to avoid function merging)
 - #129614 (Adjust doc comment of Condvar::wait_while)
 - #129840 (Implement suggestions for `elided_named_lifetimes`)
 - #129891 (Do not request sanitizers for naked functions)
 - #129899 (Add Suggestions for Misspelled Keywords)
 - #129940 (s390x: Fix a regression related to backchain feature)
 - #129987 (Don't store region in `CapturedPlace`)
 - #130054 (Add missing quotation marks)

r? `@ghost`
`@rustbot` modify labels: rollup
Delegation: support generics in associated delegation items

This is a continuation of #125929.

[design](https://github.com/Bryanskiy/posts/blob/master/delegation%20in%20generic%20contexts.md)

Generic parameters inheritance was implemented in all contexts. Generic arguments are not yet supported.

r? `@petrochenkov`
Bump boostrap compiler to new beta

Accidentally left some comments on the update cfgs commit directly xd
Implement raw lifetimes and labels (`'r#ident`)

This PR does two things:
1. Reserve lifetime prefixes, e.g. `'prefix#lt` in edition 2021.
2. Implements raw lifetimes, e.g. `'r#async` in edition 2021.

This PR additionally extends the `keyword_idents_2024` lint to also check lifetimes.

cc `@traviscross`
r? parser
stabilize const_float_bits_conv

This stabilizes `const_float_bits_conv`, and thus fixes #72447. With #128596 having landed, this is entirely a libs-only question now.

```rust
impl f32 {
    pub const fn to_bits(self) -> u32;
    pub const fn from_bits(v: u32) -> Self;
    pub const fn to_be_bytes(self) -> [u8; 4];
    pub const fn to_le_bytes(self) -> [u8; 4]
    pub const fn to_ne_bytes(self) -> [u8; 4];
    pub const fn from_be_bytes(bytes: [u8; 4]) -> Self;
    pub const fn from_le_bytes(bytes: [u8; 4]) -> Self;
    pub const fn from_ne_bytes(bytes: [u8; 4]) -> Self;
}

impl f64 {
    pub const fn to_bits(self) -> u64;
    pub const fn from_bits(v: u64) -> Self;
    pub const fn to_be_bytes(self) -> [u8; 8];
    pub const fn to_le_bytes(self) -> [u8; 8]
    pub const fn to_ne_bytes(self) -> [u8; 8];
    pub const fn from_be_bytes(bytes: [u8; 8]) -> Self;
    pub const fn from_le_bytes(bytes: [u8; 8]) -> Self;
    pub const fn from_ne_bytes(bytes: [u8; 8]) -> Self;
}
````

Cc `@rust-lang/wg-const-eval` `@rust-lang/libs-api`
…larsan68

explain the options bootstrap passes to curl

also fixes a discrepancy where the rust side doesn't use -L

docs are only on the rust side, since duplicated prose has a tendancy to get out-of-sync, and also because there are talks of removing the python script all together eventually.
Don't build by-move body when async closure is tainted

Fixes #129676

See explanation in the ui test.
Do not call query to compute coroutine layout for synthetic body of async closure

There is code in the MIR validator that attempts to prevent query cycles when inlining a coroutine into itself, and will use the coroutine layout directly from the body when it detects that's the same coroutine as the one that's being validated. After #128506, this logic didn't take into account the fact that the coroutine def id will differ if it's the "by-move body" of an async closure. This PR implements that.

Fixes #129811
add a few more crashtests

Added them for #123629, #127033 and #129372.
…narycat,GuillaumeGomez

rustdoc-search: allow trailing `Foo ->` arg search

Fixes #129710
str: make as_mut_ptr and as_bytes_mut unstably const

`@rust-lang/libs-api` the corresponding non-mutable methods are already const fn, so this seems pretty trivial. I hope this is small enough that it does not need an ACP? :)

I would like to get these stabilized ASAP because I want to avoid people doing `s.as_ptr().cast_mut()`, which is UB if they ever write to it, but is already const-stable.

TODO: create a tracking issue.
Win: Add dbghelp to the list of import libraries

This is used by the backtrace crate. But we use a submodule to include backtrace in std (rather than being a real crate) so we need to add the dependency here.
Remove the unused  `llvm-skip-rebuild` option from x.py

Fixes #130039
Rollup of 10 pull requests

Successful merges:

 - #126452 (Implement raw lifetimes and labels (`'r#ident`))
 - #129555 (stabilize const_float_bits_conv)
 - #129594 (explain the options bootstrap passes to curl)
 - #129677 (Don't build by-move body when async closure is tainted)
 - #129847 (Do not call query to compute coroutine layout for synthetic body of async closure)
 - #129869 (add a few more crashtests)
 - #130009 (rustdoc-search: allow trailing `Foo ->` arg search)
 - #130046 (str: make as_mut_ptr and as_bytes_mut unstably const)
 - #130047 (Win: Add dbghelp to the list of import libraries)
 - #130059 (Remove the unused  `llvm-skip-rebuild` option from x.py)

r? `@ghost`
`@rustbot` modify labels: rollup
@pull pull bot added the ⤵️ pull label Sep 8, 2024
bors added 5 commits September 8, 2024 03:11
Supress niches in coroutines to avoid aliasing violations

As mentioned [here](#63818 (comment)), using niches in fields of coroutines that are referenced by other fields is unsound: the discriminant accesses violate the aliasing requirements of the reference pointing to the relevant field. This issue causes [Miri errors in practice](rust-lang/miri#3780).

The "obvious" fix for this is to suppress niches in coroutines. That's what this PR does. However, we have several tests explicitly ensuring that we *do* use niches in coroutines. So I see two options:
- We guard this behavior behind a `-Z` flag (that Miri will set by default). There is no known case of these aliasing violations causing miscompilations. But absence of evidence is not evidence of absence...
- (What this PR does right now.) We temporarily adjust the coroutine layout logic and the associated tests until the proper fix lands. The "proper fix" here is to wrap fields that other fields can point to in [`UnsafePinned`](#125735) and make `UnsafePinned` suppress niches; that would then still permit using niches of *other* fields (those that never get borrowed). However, I know that coroutine sizes are already a problem, so I am not sure if this temporary size regression is acceptable.

`@compiler-errors` any opinion? Also who else should be Cc'd here?
…kens, r=petrochenkov

Fix double handling in `collect_tokens`

Double handling of AST nodes can occur in `collect_tokens`. This is when an inner call to `collect_tokens` produces an AST node, and then an outer call to `collect_tokens` produces the same AST node. This can happen in a few places, e.g. expression statements where the statement delegates `HasTokens` and `HasAttrs` to the expression. It will also happen more after #124141.

This PR fixes some double handling cases that cause problems, including #129166.

r? `@petrochenkov`
Split x86_64-msvc-ext into two jobs

This is an attempt to mitigate (but not resolve) the high failure rate of the x86_64-msvc-ext builder. The theory being that doing less makes it less likely to fail. But this may not work as having an extra job that may fail might be worse.

try-job: x86_64-msvc-ext
try-job: x86_64-msvc-ext2
Break into the debugger (if attached) on panics (Windows, Linux, macOS, FreeBSD)

The developer experience for panics is to provide the backtrace and
exit the program. When running under debugger, that might be improved
by breaking into the debugger once the code panics thus enabling
the developer to examine the program state at the exact time when
the code panicked.

Let the developer catch the panic in the debugger if it is attached.
If the debugger is not attached, nothing changes. Providing this feature
inside the standard library facilitates better debugging experience.

Validated under Windows, Linux, macOS 14.6, and FreeBSD 13.3..14.1.
better implementation of signed div_floor/ceil

Tracking issue for signed `div_floor`/`div_ceil`: #88581.

This PR improves the implementation of those two functions by adding a better branchless algorithm. Side-by-side comparison of `i32::div_floor` on x86-64:

```asm
div_floor_new:                               div_floor_old:
        push    rax                                  push    rax
        test    esi, esi                             test    esi, esi
        je      .LBB0_3                              je      .LBB1_6
        mov     eax, esi                             mov     eax, esi
        not     eax                                  not     eax
        lea     ecx, [rdi - 2147483648]              lea     ecx, [rdi - 2147483648]
        or      ecx, eax                             or      ecx, eax
        je      .LBB0_2                              je      .LBB1_7
        mov     eax, edi                             mov     eax, edi
        cdq                                          cdq
        idiv    esi                                  idiv    esi
        xor     esi, edi                             test    edx, edx
        sar     esi, 31                              setg    cl
        test    edx, edx                             test    esi, esi
        cmove   esi, edx                             sets    dil
        add     eax, esi                             test    dil, cl
        pop     rcx                                  jne     .LBB1_4
        ret                                          test    edx, edx
.LBB0_3:                                             setns   cl
        lea     rdi, [rip + .L__unnamed_1]           test    esi, esi
        call    qword ptr [rip + panic...]          setle   dl
.LBB0_2:                                             or      dl, cl
        lea     rdi, [rip + .L__unnamed_1]           jne     .LBB1_5
        call    qword ptr [rip + panic...]   .LBB1_4:
                                                     dec     eax
                                             .LBB1_5:
                                                     pop     rcx
                                                     ret
                                             .LBB1_6:
                                                     lea     rdi, [rip + .L__unnamed_2]
                                                     call    qword ptr [rip + panic...]
                                             .LBB1_7:
                                                     lea     rdi, [rip + .L__unnamed_2]
                                                     call    qword ptr [rip + panic...]
```

And on Aarch64:

```asm
_div_floor_new:                                   _div_floor_old:
        stp     x29, x30, [sp, #-16]!                     stp     x29, x30, [sp, #-16]!
        mov     x29, sp                                   mov     x29, sp
        cbz     w1, LBB0_4                                cbz     w1, LBB1_9
        mov     w8, #-2147483648                          mov     x8, x0
        cmp     w0, w8                                    mov     w9, #-2147483648
        b.ne    LBB0_3                                    cmp     w0, w9
        cmn     w1, #1                                    b.ne    LBB1_3
        b.eq    LBB0_5                                    cmn     w1, #1
LBB0_3:                                                   b.eq    LBB1_10
        sdiv    w8, w0, w1                        LBB1_3:
        msub    w9, w8, w1, w0                            sdiv    w0, w8, w1
        eor     w10, w1, w0                               msub    w8, w0, w1, w8
        asr     w10, w10, #31                             tbz     w1, #31, LBB1_5
        cmp     w9, #0                                    cmp     w8, #0
        csel    w9, wzr, w10, eq                          b.gt    LBB1_7
        add     w0, w9, w8                        LBB1_5:
        ldp     x29, x30, [sp], #16                       cmp     w1, #1
        ret                                               b.lt    LBB1_8
LBB0_4:                                                   tbz     w8, #31, LBB1_8
        adrp    x0, l___unnamed_1@PAGE            LBB1_7:
        add     x0, x0, l___unnamed_1@PAGEOFF             sub     w0, w0, #1
        bl      panic...                          LBB1_8:
LBB0_5:                                                   ldp     x29, x30, [sp], #16
        adrp    x0, l___unnamed_1@PAGE                    ret
        add     x0, x0, l___unnamed_1@PAGEOFF     LBB1_9:
        bl      panic...                                  adrp    x0, l___unnamed_2@PAGE
                                                          add     x0, x0, l___unnamed_2@PAGEOFF
                                                          bl      panic...
                                                  LBB1_10:
                                                          adrp    x0, l___unnamed_2@PAGE
                                                          add     x0, x0, l___unnamed_2@PAGEOFF
                                                          bl      panic...
```
@pull pull bot merged commit adf8d16 into AKJUS:master Sep 9, 2024
2 checks passed
pull bot pushed a commit that referenced this pull request Oct 17, 2024
…raheemdev

Optimize `Box::default` and `Arc::default` to construct more types in place

Both the `Arc` and `Box` `Default` impls currently call `T::default()` before allocating, and then moving the resulting `T` into the allocation.

Most `Default` impls are trivial, which should in theory allow
LLVM to construct `T: Default` directly in the `Box` allocation when calling
`<Box<T>>::default()`.

However, the allocation may fail, which necessitates calling `T`'s destructor if it has one.
If the destructor is non-trivial, then LLVM has a hard time proving that it's
sound to elide, which makes it construct `T` on the stack first, and then copy it into the allocation.

Change both of these impls to allocate first, and then call `T::default` into the uninitialized allocation, so that LLVM doesn't have to prove that it's sound to elide the destructor/initial stack copy.

For example, given the following Rust code:

```rust
#[derive(Default, Clone)]
struct Foo {
    x: Vec<u8>,
    z: String,
    y: Vec<u8>,
}

#[no_mangle]
pub fn src() -> Box<Foo> {
    Box::default()
}
```

<details open>
<summary>Before this PR:</summary>

```llvm
`@__rust_no_alloc_shim_is_unstable` = external global i8

; drop_in_place() generated in case the allocation fails

; core::ptr::drop_in_place<playground::Foo>
; Function Attrs: nounwind nonlazybind uwtable
define internal fastcc void `@"_ZN4core3ptr36drop_in_place$LT$playground..Foo$GT$17hff376aece491233bE"(ptr` noalias nocapture noundef readonly align 8 dereferenceable(72) %_1) unnamed_addr #0 personality ptr `@rust_eh_personality` {
start:
  %_1.val = load i64, ptr %_1, align 8
  %0 = icmp eq i64 %_1.val, 0
  br i1 %0, label %bb6, label %"_ZN63_$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$10deallocate17heaa87468709346b1E.exit.i.i.i4.i"

"_ZN63_$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$10deallocate17heaa87468709346b1E.exit.i.i.i4.i": ; preds = %start
  %1 = getelementptr inbounds i8, ptr %_1, i64 8
  %_1.val6 = load ptr, ptr %1, align 8, !nonnull !3, !noundef !3
  tail call void `@__rust_dealloc(ptr` noundef nonnull %_1.val6, i64 noundef %_1.val, i64 noundef 1) #8
  br label %bb6

bb6:                                              ; preds = %"_ZN63_$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$10deallocate17heaa87468709346b1E.exit.i.i.i4.i", %start
  %2 = getelementptr inbounds i8, ptr %_1, i64 24
  %.val9 = load i64, ptr %2, align 8
  %3 = icmp eq i64 %.val9, 0
  br i1 %3, label %bb5, label %"_ZN63_$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$10deallocate17heaa87468709346b1E.exit.i.i.i4.i.i11"

"_ZN63_$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$10deallocate17heaa87468709346b1E.exit.i.i.i4.i.i11": ; preds = %bb6
  %4 = getelementptr inbounds i8, ptr %_1, i64 32
  %.val10 = load ptr, ptr %4, align 8, !nonnull !3, !noundef !3
  tail call void `@__rust_dealloc(ptr` noundef nonnull %.val10, i64 noundef %.val9, i64 noundef 1) #8
  br label %bb5

bb5:                                              ; preds = %"_ZN63_$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$10deallocate17heaa87468709346b1E.exit.i.i.i4.i.i11", %bb6
  %5 = getelementptr inbounds i8, ptr %_1, i64 48
  %.val4 = load i64, ptr %5, align 8
  %6 = icmp eq i64 %.val4, 0
  br i1 %6, label %"_ZN4core3ptr46drop_in_place$LT$alloc..vec..Vec$LT$u8$GT$$GT$17hb5ca95423e113cf7E.exit16", label %"_ZN63_$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$10deallocate17heaa87468709346b1E.exit.i.i.i4.i15"

"_ZN63_$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$10deallocate17heaa87468709346b1E.exit.i.i.i4.i15": ; preds = %bb5
  %7 = getelementptr inbounds i8, ptr %_1, i64 56
  %.val5 = load ptr, ptr %7, align 8, !nonnull !3, !noundef !3
  tail call void `@__rust_dealloc(ptr` noundef nonnull %.val5, i64 noundef %.val4, i64 noundef 1) #8
  br label %"_ZN4core3ptr46drop_in_place$LT$alloc..vec..Vec$LT$u8$GT$$GT$17hb5ca95423e113cf7E.exit16"

"_ZN4core3ptr46drop_in_place$LT$alloc..vec..Vec$LT$u8$GT$$GT$17hb5ca95423e113cf7E.exit16": ; preds = %bb5, %"_ZN63_$LT$alloc..alloc..Global$u20$as$u20$core..alloc..Allocator$GT$10deallocate17heaa87468709346b1E.exit.i.i.i4.i15"
  ret void
}

; Function Attrs: nonlazybind uwtable
define noalias noundef nonnull align 8 ptr `@src()` unnamed_addr #1 personality ptr `@rust_eh_personality` {
start:

; alloca to place `Foo` in.
  %_1 = alloca [72 x i8], align 8
  call void `@llvm.lifetime.start.p0(i64` 72, ptr nonnull %_1)
  store i64 0, ptr %_1, align 8
  %_2.sroa.4.0._1.sroa_idx = getelementptr inbounds i8, ptr %_1, i64 8
  store ptr inttoptr (i64 1 to ptr), ptr %_2.sroa.4.0._1.sroa_idx, align 8
  %_2.sroa.5.0._1.sroa_idx = getelementptr inbounds i8, ptr %_1, i64 16
  %_3.sroa.4.0..sroa_idx = getelementptr inbounds i8, ptr %_1, i64 32
  call void `@llvm.memset.p0.i64(ptr` noundef nonnull align 8 dereferenceable(16) %_2.sroa.5.0._1.sroa_idx, i8 0, i64 16, i1 false)
  store ptr inttoptr (i64 1 to ptr), ptr %_3.sroa.4.0..sroa_idx, align 8
  %_3.sroa.5.0..sroa_idx = getelementptr inbounds i8, ptr %_1, i64 40
  %_4.sroa.4.0..sroa_idx = getelementptr inbounds i8, ptr %_1, i64 56
  call void `@llvm.memset.p0.i64(ptr` noundef nonnull align 8 dereferenceable(16) %_3.sroa.5.0..sroa_idx, i8 0, i64 16, i1 false)
  store ptr inttoptr (i64 1 to ptr), ptr %_4.sroa.4.0..sroa_idx, align 8
  %_4.sroa.5.0..sroa_idx = getelementptr inbounds i8, ptr %_1, i64 64
  store i64 0, ptr %_4.sroa.5.0..sroa_idx, align 8
  %0 = load volatile i8, ptr `@__rust_no_alloc_shim_is_unstable,` align 1, !noalias !4
  %_0.i.i.i = tail call noalias noundef align 8 dereferenceable_or_null(72) ptr `@__rust_alloc(i64` noundef 72, i64 noundef 8) #8, !noalias !4
  %1 = icmp eq ptr %_0.i.i.i, null
  br i1 %1, label %bb2.i, label %"_ZN5alloc5boxed12Box$LT$T$GT$3new17h0864de14f863a27aE.exit"

bb2.i:                                            ; preds = %start
; invoke alloc::alloc::handle_alloc_error
  invoke void `@_ZN5alloc5alloc18handle_alloc_error17h98142d0d8d74161bE(i64` noundef 8, i64 noundef 72) #9
          to label %.noexc unwind label %cleanup.i

.noexc:                                           ; preds = %bb2.i
  unreachable

cleanup.i:                                        ; preds = %bb2.i
  %2 = landingpad { ptr, i32 }
          cleanup
; call core::ptr::drop_in_place<playground::Foo>
  call fastcc void `@"_ZN4core3ptr36drop_in_place$LT$playground..Foo$GT$17hff376aece491233bE"(ptr` noalias noundef nonnull align 8 dereferenceable(72) %_1) #10
  resume { ptr, i32 } %2

"_ZN5alloc5boxed12Box$LT$T$GT$3new17h0864de14f863a27aE.exit": ; preds = %start

; Copy from stack to heap if allocation is successful
  call void `@llvm.memcpy.p0.p0.i64(ptr` noundef nonnull align 8 dereferenceable(72) %_0.i.i.i, ptr noundef nonnull align 8 dereferenceable(72) %_1, i64 72, i1 false)
  call void `@llvm.lifetime.end.p0(i64` 72, ptr nonnull %_1)
  ret ptr %_0.i.i.i
}

```
</details>

<details>
<summary>After this PR</summary>

```llvm
; Notice how there's no `drop_in_place()` generated as well

define noalias noundef nonnull align 8 ptr `@src()` unnamed_addr #0 personality ptr `@rust_eh_personality` {
start:
; no stack allocation

  %0 = load volatile i8, ptr `@__rust_no_alloc_shim_is_unstable,` align 1
  %_0.i.i.i.i.i = tail call noalias noundef align 8 dereferenceable_or_null(72) ptr `@__rust_alloc(i64` noundef 72, i64 noundef 8) #5
  %1 = icmp eq ptr %_0.i.i.i.i.i, null
  br i1 %1, label %bb3.i, label %"_ZN5alloc5boxed16Box$LT$T$C$A$GT$13new_uninit_in17h80d6355ef4b73ea3E.exit"

bb3.i:                                            ; preds = %start
; call alloc::alloc::handle_alloc_error
  tail call void `@_ZN5alloc5alloc18handle_alloc_error17h98142d0d8d74161bE(i64` noundef 8, i64 noundef 72) #6
  unreachable

"_ZN5alloc5boxed16Box$LT$T$C$A$GT$13new_uninit_in17h80d6355ef4b73ea3E.exit": ; preds = %start
; construct `Foo` directly into the allocation if successful

  store i64 0, ptr %_0.i.i.i.i.i, align 8
  %_8.sroa.4.0._1.sroa_idx = getelementptr inbounds i8, ptr %_0.i.i.i.i.i, i64 8
  store ptr inttoptr (i64 1 to ptr), ptr %_8.sroa.4.0._1.sroa_idx, align 8
  %_8.sroa.5.0._1.sroa_idx = getelementptr inbounds i8, ptr %_0.i.i.i.i.i, i64 16
  %_8.sroa.7.0._1.sroa_idx = getelementptr inbounds i8, ptr %_0.i.i.i.i.i, i64 32
  tail call void `@llvm.memset.p0.i64(ptr` noundef nonnull align 8 dereferenceable(16) %_8.sroa.5.0._1.sroa_idx, i8 0, i64 16, i1 false)
  store ptr inttoptr (i64 1 to ptr), ptr %_8.sroa.7.0._1.sroa_idx, align 8
  %_8.sroa.8.0._1.sroa_idx = getelementptr inbounds i8, ptr %_0.i.i.i.i.i, i64 40
  %_8.sroa.10.0._1.sroa_idx = getelementptr inbounds i8, ptr %_0.i.i.i.i.i, i64 56
  tail call void `@llvm.memset.p0.i64(ptr` noundef nonnull align 8 dereferenceable(16) %_8.sroa.8.0._1.sroa_idx, i8 0, i64 16, i1 false)
  store ptr inttoptr (i64 1 to ptr), ptr %_8.sroa.10.0._1.sroa_idx, align 8
  %_8.sroa.11.0._1.sroa_idx = getelementptr inbounds i8, ptr %_0.i.i.i.i.i, i64 64
  store i64 0, ptr %_8.sroa.11.0._1.sroa_idx, align 8
  ret ptr %_0.i.i.i.i.i
}
```

</details>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.