`BytesMut::freeze` is not always zero-cost as documented #587

bk2204 · 2023-01-27T16:00:21Z

When converting a BytesMut into Bytes via freeze, the operation can cause a re-allocation to occur because we call into_boxed_slice. This is unacceptable in some hot codepaths.

The documentation says that “[t]he conversion is zero cost.” I don't believe that an allocation can be considered zero cost.

Example code:

use bytes::BytesMut;

fn main() {
    let mut b = BytesMut::with_capacity(10);
    b.extend(b"abcdefg");
    let b = b.freeze();
    println!("{:?}", b);
}

If you compile this program and set a breakpoint on realloc, you'll see this (after the startup code):

(gdb) bt
#0  __GI___libc_realloc (oldmem=0x5555555a5ba0, bytes=7) at ./malloc/malloc.c:3394
#1  0x000055555555da2b in alloc::vec::Vec<T,A>::into_boxed_slice ()
#2  0x000055555555cc00 in <bytes::bytes::Bytes as core::convert::From<alloc::vec::Vec<u8>>>::from ()
#3  0x000055555555c762 in bytes_test::main ()
#4  0x000055555555cb03 in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#5  0x000055555555cad9 in _ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hcadd4fe5b7b3f353E.llvm.6218250527406152195 ()
#6  0x00005555555703fb in core::ops::function::impls::{impl#2}::call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> ()
    at library/core/src/ops/function.rs:286
#7  std::panicking::try::do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> () at library/std/src/panicking.rs:483
#8  std::panicking::try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> () at library/std/src/panicking.rs:447
#9  std::panic::catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> () at library/std/src/panic.rs:137
#10 std::rt::lang_start_internal::{closure#2} () at library/std/src/rt.rs:148
#11 std::panicking::try::do_call<std::rt::lang_start_internal::{closure_env#2}, isize> () at library/std/src/panicking.rs:483
#12 std::panicking::try<isize, std::rt::lang_start_internal::{closure_env#2}> () at library/std/src/panicking.rs:447
#13 std::panic::catch_unwind<std::rt::lang_start_internal::{closure_env#2}, isize> () at library/std/src/panic.rs:137
#14 std::rt::lang_start_internal () at library/std/src/rt.rs:148
#15 0x000055555555c915 in main ()

I believe in this case that's because the variant of the BytesMut is KIND_VEC and we're then recreating the Vec and doing an into, which calls into_boxed_slice. This is also the case if one substitutes BytesMut in the code for Vec, but in that case, there's no guarantee of that being zero-cost.

I'm not sure what the best way forward here is, but it would be great if someone could propose an idea where this really is zero cost. I have a server where I need to read some data into a buffer and then transfer it using a Bytes via Hyper, and since this is a very hot codepath, I really want to avoid a reallocation, since the reallocation shows up in my performance graphs.

The text was updated successfully, but these errors were encountered:

Darksonn · 2023-01-27T20:53:52Z

I don't think the realloc is necessary in the Vec<u8> -> Bytes conversion, since a Bytes can be subsliced.

seanmonstar · 2023-01-27T21:52:08Z

The details aren't fresh in my mind, but I think the reason it does into_boxed_slice is because there was no room to store the length and capacity.

bk2204 · 2023-01-28T16:15:29Z

I think this can be pretty easily implemented by turning the Vec<u8> into a shared variant. If folks want a patch, I'm happy to send one, and I've even added a few tests.

seanmonstar · 2023-01-28T16:33:55Z

Making it shared does require a new allocation too. Would it be worth checking if the length and capacity are the same, and so into_boxed_slice won't need to shrink?

bk2204 · 2023-01-28T17:02:45Z

That's true, but it requires a much smaller allocation of a handful of bytes, which I think will be acceptable here. Reallocating my buffer of 64 KiB requires substantially more space and a much larger memcpy as well.

I can certainly see if the length and capacity are the same and use into_boxed_slice. However, in my case, it's definitely not.

When we freeze a BytesMut, we turn it into a Vec, and then convert that to a Bytes. Currently, this happen using Vec::into_boxed_slice, which reallocates to a slice of the same length as the Vev if the length and the capacity are not equal. This can pose a performance problem if the Vec is large or if this happens many times in a loop. Instead, let's compare the length and capacity, and if they're the same, continue to handle this using into_boxed_slice. Otherwise, since we have a type of vtable which can handle a separate capacity, the shared vtable, let's turn our Vec into that kind of Bytes. While this does not avoid allocation altogether, it performs a fixed size allocation and avoids any need to memcpy.

Darksonn mentioned this issue Jan 31, 2023

Avoid large reallocations when freezing BytesMut #592

Merged

Darksonn closed this as completed in #592 Jan 31, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`BytesMut::freeze` is not always zero-cost as documented #587

`BytesMut::freeze` is not always zero-cost as documented #587

bk2204 commented Jan 27, 2023 •

edited

Loading

Darksonn commented Jan 27, 2023

seanmonstar commented Jan 27, 2023

bk2204 commented Jan 28, 2023

seanmonstar commented Jan 28, 2023

bk2204 commented Jan 28, 2023

BytesMut::freeze is not always zero-cost as documented #587

BytesMut::freeze is not always zero-cost as documented #587

Comments

bk2204 commented Jan 27, 2023 • edited Loading

Darksonn commented Jan 27, 2023

seanmonstar commented Jan 27, 2023

bk2204 commented Jan 28, 2023

seanmonstar commented Jan 28, 2023

bk2204 commented Jan 28, 2023

`BytesMut::freeze` is not always zero-cost as documented #587

`BytesMut::freeze` is not always zero-cost as documented #587

bk2204 commented Jan 27, 2023 •

edited

Loading