Vec's reallocation strategy needs settling

# Background

Currently, Vec's documentation frequently contains the caveat that "more space may be reserved than requested". This is primarily in response to the fact that jemalloc (or any other allocator) can actually reserve more space than you requested because it relies on fixed size-classes to more effeciently dole out memory. (see the table [here](https://web.archive.org/web/20150529223415/http://www.canonware.com/download/jemalloc/jemalloc-latest/doc/jemalloc.html))

Jemalloc itself exposes a `malloc_usable_size` function which can be used to determine how much capacity was _actually_ allocated for a pointer, as well as `nallocx` which can be used to determine how much capacity _will_ be allocated for a given size and alignment. Vec can in principle query one of these methods and update its capacity field with the result.

The question at hand is: _is this ever worth it to check `usable_capacity`, and is it ever undesirable?_

This issue was kicked off by Firefox devs who have experience doing exactly this optimization, and claim it being profitable. Facebook has also claimed excellent dividends in making its allocation strategy more jemalloc friendly, but to my knowledge do not actually query usable_capacity.

Currently our alloction strategy is almost completely naive. One can see the source [here](http://doc.rust-lang.org/1.4.0/src/alloc/raw_vec.rs.html#191-202), which boils down to:

``` rust
let new_cap = if self.cap == 0 { 
    4
} else { 
    2 * self.cap 
}
```

To the best of my knowledge, [this](https://github.com/facebook/folly/blob/badc3ebe7ef9d370e0d4b9d110f5dad6f6284d7e/folly/FBVector.h#L1127-L1156) is all the capacity logic for FBVector, which boils down to:

``` rust
let new cap = if self.cap == 0 {
    max(64 / size_of_elem, 1)   // empty
} else if self.cap < 4096 / size_of_elem) {
    self.cap * 2                // small
} else if self.cap > 4096 * 32 / size_of_elem) {
    self.cap * 2                // huge
} else {
    (self.cap * 3 + 1) / 2      // moderate
}
```

The only major deviation from Rust today being a 1.5 growth factor for "moderate" sized allocations. Note that there _is_ a [path](https://github.com/facebook/folly/blob/26d9f3f3cc34d30f7e73a6f12a3bae5102c7512e/folly/small_vector.h#L636-L644) in their small_vector type that queries malloc_usable_size.

Unfortunately I've been unable to find good information on what Firefox does here. The best I could find is [this discussion](https://bugzilla.mozilla.org/show_bug.cgi?id=419131) which demonstrates big wins --  6.5% less calls to malloc and 11% less memory usage. However the effort seems to peter out when it is asserted that this is obsoleted by Gecko just using power-of-two capacities. mxr and dxr seem to suggest it's only used for statistics. 

Hopefully Firefox and Facebook devs can chime in on any experiences.

# Going Forward

I have several outstanding concerns before we push further on this matter.

Rust is not C++, so I am partially suspicious of any benchmarks that demonstrate value in C++. In particular, the proliferation of `size_hint` and `extend` could completely blast away any need to be clever with capacities for most programs. I would also expect a large Rust application to frob the allocator less in general just because we default to move semantics, and don't have implicit assignment/copy constructors. Also, Rust programs are much more "reckless" with just passing around pointers into buffers because the borrow checker will always catch misuse. Slices in particular make this incredibly ergonomic. We need Rust-specific benchmarks. I'm sure the Hyper, Servo, Glium, Serde, and Rust devs all have some interesting workloads to look at. 

Rust's Vec is basically the community's `malloc`/`calloc`, because actual malloc is unstable and unsafe. We explicitly support extracting the Vec's pointer and taking control of the allocation. In that regard I believe it is desirable for it to be maximally well-behaved and performant for "good" cases (knowing exactly the capacity you want, having no use for extra capacity). There's a certain value in requesting a certain capacity, and getting the capacity. Both `nallocx` and `malloc_usable_size` are virtual function calls with non-trivial logic, and may have unacceptable overhead for responsible users of Vec.

Note that anything we do must enable `Vec` and `Box<[T]>` to reliably roundtrip through each other without reallocating in certain cases. If I invoke `shrink_to_fit` or `with_capacity`, it ought to not reallocate when I try to convert to a `Box<[T]>`. As far as I can tell, this should be possible to uphold even when using `malloc_usable_size` because jemalloc is "fuzzy" and only requires that the given size is somewhere between the requested one and the usable one. 

Anything we do must also work with allocators that aren't jemalloc. This may be as simple as setting `usable_size = requested_size` for every allocator that isn't jemalloc.

CC @pnkfelix @erickt @reem @seanmonstar @tomaka @SimonSapin @larsberg @pcwalton @Swatinem @nnethercote

CC #29848 #29847 #27627

CC @rust-lang/libs 

EDIT(workingjubilee, 2023-05-06): Originally one of the links touched on the master branch of folly instead of a permalink to a specific commit. Linkrot claimed the one to jemalloc's docs. These have been updated with a permalink and a Wayback Machine link, respectively.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Vec's reallocation strategy needs settling #29931

Background

Going Forward

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Vec's reallocation strategy needs settling #29931

Description

Background

Going Forward

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions