Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Returning MaybeUninit consts generates code to zero out the return value. #83657

Closed
rodrimati1992 opened this issue Mar 29, 2021 · 7 comments · Fixed by #83698
Closed

Returning MaybeUninit consts generates code to zero out the return value. #83657

rodrimati1992 opened this issue Mar 29, 2021 · 7 comments · Fixed by #83698
Labels
A-codegen Area: Code generation A-const-eval Area: Constant evaluation, covers all const contexts (static, const fn, ...) C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@rodrimati1992
Copy link
Contributor

I tried this code:

use std::mem::MaybeUninit;

pub const fn foo() -> MaybeUninit<[u8; 10]> {
    const M: MaybeUninit<[u8; 10]> = MaybeUninit::uninit();
    M
}

pub const fn bar() -> MaybeUninit<[u8; 10]> {
    MaybeUninit::uninit()
}

I expected to see this happen: Both functions to compile to just a return instruction

Instead, this happened: Only bar did, foo zeros the return value in release builds.

playground::foo: # @playground::foo
# %bb.0:
	xorl	%eax, %eax
	xorl	%edx, %edx
	retq
                                        # -- End function

playground::bar: # @playground::bar
# %bb.0:
	retq
                                        # -- End function

Meta

rustc --version --verbose:

1.53.0-nightly (2021-03-28 4a20eb6a9da36c88ee92)
@rodrimati1992 rodrimati1992 added the C-bug Category: This is a bug. label Mar 29, 2021
@rodrimati1992 rodrimati1992 changed the title Returning ManuallyDrop consts generates code to zero out the return value. Returning MaybeUninit consts generates code to zero out the return value. Mar 29, 2021
@rodrimati1992
Copy link
Contributor Author

This turns into a memset when the array is large, which is significantly slower than doing nothing

@jonas-schievink jonas-schievink added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Mar 29, 2021
@RalfJung
Copy link
Member

Cc @oli-obk

@RalfJung
Copy link
Member

Good catch! Yes we probably could be emitting proper undef constants to LLVM here (assuming LLVM has a way to let us do that).

@oli-obk
Copy link
Contributor

oli-obk commented Mar 30, 2021

Oh! yea we should totally fix that. Not sure how to do it for partially initialized aggregates, but we should at least cover the fully uninit case.

@oli-obk oli-obk added A-codegen Area: Code generation A-const-eval Area: Constant evaluation, covers all const contexts (static, const fn, ...) labels Mar 30, 2021
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 1, 2021
Use undef for uninitialized bytes in constants

Fixes rust-lang#83657

This generates good code when the const is fully uninit, e.g.

```rust
#[no_mangle]
pub const fn fully_uninit() -> MaybeUninit<[u8; 10]> {
    const M: MaybeUninit<[u8; 10]> = MaybeUninit::uninit();
    M
}
```
generates
```asm
fully_uninit:
	ret
```

as you would expect.

There is no improvement, however, when it's partially uninit, e.g.

```rust
pub struct PartiallyUninit {
    x: u64,
    y: MaybeUninit<[u8; 10]>
}

#[no_mangle]
pub const fn partially_uninit() -> PartiallyUninit {
    const X: PartiallyUninit = PartiallyUninit { x: 0xdeadbeefcafe, y: MaybeUninit::uninit() };
    X
}
```
generates
```asm
partially_uninit:
	mov	rax, rdi
	mov	rcx, qword ptr [rip + .L__unnamed_1+16]
	mov	qword ptr [rdi + 16], rcx
	movups	xmm0, xmmword ptr [rip + .L__unnamed_1]
	movups	xmmword ptr [rdi], xmm0
	ret

.L__unnamed_1:
	.asciz	"\376\312\357\276\255\336\000"
	.zero	16
	.size	.L__unnamed_1, 24
```
which copies a bunch of zeros in place of the undef bytes, the same as before this change.

The constant is undef where it should be, though (see the added test) so this seems to be a missed optimization in LLVM.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 2, 2021
Use undef for uninitialized bytes in constants

Fixes rust-lang#83657

This generates good code when the const is fully uninit, e.g.

```rust
#[no_mangle]
pub const fn fully_uninit() -> MaybeUninit<[u8; 10]> {
    const M: MaybeUninit<[u8; 10]> = MaybeUninit::uninit();
    M
}
```
generates
```asm
fully_uninit:
	ret
```

as you would expect.

There is no improvement, however, when it's partially uninit, e.g.

```rust
pub struct PartiallyUninit {
    x: u64,
    y: MaybeUninit<[u8; 10]>
}

#[no_mangle]
pub const fn partially_uninit() -> PartiallyUninit {
    const X: PartiallyUninit = PartiallyUninit { x: 0xdeadbeefcafe, y: MaybeUninit::uninit() };
    X
}
```
generates
```asm
partially_uninit:
	mov	rax, rdi
	mov	rcx, qword ptr [rip + .L__unnamed_1+16]
	mov	qword ptr [rdi + 16], rcx
	movups	xmm0, xmmword ptr [rip + .L__unnamed_1]
	movups	xmmword ptr [rdi], xmm0
	ret

.L__unnamed_1:
	.asciz	"\376\312\357\276\255\336\000"
	.zero	16
	.size	.L__unnamed_1, 24
```
which copies a bunch of zeros in place of the undef bytes, the same as before this change.

The constant is undef where it should be, though (see the added test) so this seems to be a missed optimization in LLVM.
@erikdesjardins
Copy link
Contributor

erikdesjardins commented Apr 6, 2021

NB: while #83698 fixes this issue as written, it likely does not fix the arrayvec regression, since ArrayVec { xs: MaybeUninit::uninit().assume_init(), len: 0 } is only partially uninit (see #83698 description).

@rodrimati1992
Copy link
Contributor Author

rodrimati1992 commented Apr 6, 2021

@erikdesjardins How so? Only the MaybeUninit is a constant. ArrayVec::new_const called in a non-const context should not have any problems with #83698

Which is to say that the two functions below should generate the same code:

fn foo() -> ArrayVec<u8, 1000>{
    ArrayVec::new() 
}
fn bar() -> ArrayVec<u8, 1000> {
    ArrayVec::new_const() 
}

Them not being equivalent was the motivation for the existence of new_const.

@erikdesjardins
Copy link
Contributor

Ah, you're right of course, I wasn't thinking

@bors bors closed this as completed in 20997f6 Aug 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-const-eval Area: Constant evaluation, covers all const contexts (static, const fn, ...) C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants