-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Small tweaks in ToOwned::clone_into #70201
Conversation
r? @dtolnay (rust_highfive has picked a reviewer for you, use r? to override) |
To justify the slice optimization, here's output from compiler explorer. They are nearly the same, but note the smaller prologue and epilogue in the new example::clone_into_orig:
push r15
push r14
push r12
push rbx
push rax
mov r14, rdx
mov r12, rsi
mov r15, rdi
mov rbx, qword ptr [rdx + 16]
cmp rbx, rsi
jb .LBB1_2
mov qword ptr [r14 + 16], r12
mov rbx, r12
.LBB1_2:
test rbx, rbx
je .LBB1_4
mov rdi, qword ptr [r14]
lea rdx, [4*rbx]
mov rsi, r15
call qword ptr [rip + memcpy@GOTPCREL]
.LBB1_4:
lea rsi, [r15 + 4*rbx]
sub r12, rbx
mov rdi, r14
mov rdx, r12
add rsp, 8
pop rbx
pop r12
pop r14
pop r15
jmp alloc::vec::Vec<T>::extend_from_slice
example::clone_into_split:
push r15
push r14
push rbx
mov r14, rdx
mov rbx, rsi
mov rsi, rdi
mov rdx, qword ptr [rdx + 16]
cmp rdx, rbx
jb .LBB2_2
mov qword ptr [r14 + 16], rbx
mov rdx, rbx
.LBB2_2:
lea r15, [rsi + 4*rdx]
sub rbx, rdx
test rdx, rdx
je .LBB2_4
mov rdi, qword ptr [r14]
shl rdx, 2
call qword ptr [rip + memcpy@GOTPCREL]
.LBB2_4:
mov rdi, r14
mov rsi, r15
mov rdx, rbx
pop rbx
pop r14
pop r15
jmp alloc::vec::Vec<T>::extend_from_slice |
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 28791f6 with merge c462c8500772945a0f68a265dc1f8decf6b42e99... |
☀️ Try build successful - checks-azure |
Queued c462c8500772945a0f68a265dc1f8decf6b42e99 with parent 1057dc9, future comparison URL. |
Finished benchmarking try commit c462c8500772945a0f68a265dc1f8decf6b42e99, comparison URL. |
Seems neutral to compiler performance. I think |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
@bors r+ |
📌 Commit 28791f6 has been approved by |
Small tweaks in ToOwned::clone_into - `<[T]>::clone_into` is slightly more optimized. - `CStr::clone_into` is new, letting it reuse its allocation. - `OsStr::clone_into` now forwards to the underlying slice/`Vec`.
Failed in rollup #70237 (comment) @bors r- |
The new test in c_str.rs fails on Windows x86_64-msvc-1: ---- ffi::c_str::tests::test_c_str_clone_into stdout ----
thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
left: `0x210c5e05800`,
right: `0x210c5e05830`', src\libstd\ffi\c_str.rs:1527:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
ffi::c_str::tests::test_c_str_clone_into
test result: FAILED. 743 passed; 1 failed; 2 ignored; 0 measured; 0 filtered out |
Hmm, maybe that test is assuming too much about the allocator. I could try it with equal lengths, but I'll confirm on Windows before I resubmit. |
It could just be there is #[cfg(windows)] codepath that is missing a clone_into function. |
It appears to codegen slightly more efficiently with `split_at` taking two slices at once, rather than slicing across different calls.
It can try to keep its allocation by converting the inner `Box` to `Vec`, using `clone_into` on the bytes, then convert back to `Box`.
Despite OS differences, they're all just `Vec<u8>` inside, so we can just forward `clone_into` calls to that optimized implementation.
OK, I was able to confirm that failure on Windows, and now it passes when the strings are equal length. |
@bors r+ |
📌 Commit f854070 has been approved by |
Small tweaks in ToOwned::clone_into - `<[T]>::clone_into` is slightly more optimized. - `CStr::clone_into` is new, letting it reuse its allocation. - `OsStr::clone_into` now forwards to the underlying slice/`Vec`.
⌛ Testing commit f854070 with merge 83177ff73c809ab897cbc450cf19aee95d48f94a... |
@bors retry (yield) |
…ievink Rollup of 5 pull requests Successful merges: - rust-lang#70201 (Small tweaks in ToOwned::clone_into) - rust-lang#70762 (Miri leak check: memory reachable through globals is not leaked) - rust-lang#70846 (Keep codegen units unmerged when building compiler builtins) - rust-lang#70854 (Use assoc int submodules) - rust-lang#70857 (Don't import integer and float modules, use assoc consts 2) Failed merges: r? @ghost
<[T]>::clone_into
is slightly more optimized.CStr::clone_into
is new, letting it reuse its allocation.OsStr::clone_into
now forwards to the underlying slice/Vec
.