Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

alloc: make RawVec round up allocation sizes to next power-of-two #29848

Closed
wants to merge 2 commits into from

Conversation

Swatinem
Copy link
Contributor

The system allocator may give more memory than requested, usually rounding up to the next power-of-two.
This PR uses a similar logic inside RawVec to be able to make use of this excess memory.

NB: this is not used for with_capacity and shrink_to_fit pending discussion in #29931

r? @gankro

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nikomatsakis (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@Swatinem
Copy link
Contributor Author

Looking at the run-pass/allocator-override.rs failure, I think the failure is due to this abort: https://github.com/rust-lang/rust/blob/master/src/test/auxiliary/allocator-dummy.rs#L54

Would it be sufficient to just return size there? (doing a test run with that right now)

@petrochenkov
Copy link
Contributor

Is it actually useful?
Can't allocator just reallocate in place on reallocation request from vector if it has extra space?
I'd prefer to keep vector's concerns and allocator's concerns separated if possible.
Vec::with_capacity(x).capacity() == x is a useful guarantee to have even unofficially (into_boxed_slice is a good example why), I wouldn't like to lose it. As an example, C++ didn't provide this guarantee for years, but now they are moving to providing it after all, because the freedom to have extra capacity is not actually used by implementations in practice.

@Swatinem
Copy link
Contributor Author

You are right. Vec::with_capacity() does not mention in the docs that the implementation might choose to over-allocate. However shrink_to_fit() does mention that in the docs explicitly, even though its not used by the current code: http://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit
That’s what #29847 is all about.

The other possibility is to make sure that with_capacity() and shrink_to_fit() really use exact capacity (removing the note from the docs) and just make the implicit growth and reserve() code use alloc_size. reserve_exact() docs also say that the allocator might reserve more than requested.

Depending on your usage pattern, this might actually be a win.
Like if you reserve space for 1025 u32 values, the allocator actually gives you enough memory for 2048 u32 values. But yes, that is a very contrived example.

@Gankra
Copy link
Contributor

Gankra commented Nov 15, 2015

I remember hearing that the "extra capacity" in jemalloc was always basically "round up to a multiple of the requested alignment", but since all non-repr(packed) types have a size that's a multiple of alignment, that means this is just a waste of time?

Note that e.g. Facebook's FBVector doesn't bother requesting usable_size, and it's specifically designed to work really well jemalloc. CC #27627

@Gankra
Copy link
Contributor

Gankra commented Nov 15, 2015

The "may have more cap than requested" stuff is IIRC a reservation for when we get custom local allocators, which strongly favour much more restricted allocation patterns.

@Gankra Gankra assigned Gankra and unassigned nikomatsakis Nov 15, 2015
@nnethercote
Copy link
Contributor

I remember hearing that the "extra capacity" in jemalloc was always basically "round up to a multiple of the requested alignment", but since all non-repr(packed) types have a size that's a multiple of alignment, that means this is just a waste of time?

It's worth confirming these sorts of things, because this isn't true at all. From the jemalloc docs:

Assuming 2 MiB chunks, 4 KiB pages, and a 16-byte quantum on a
64-bit system, the size classes in each category are

[8]
[16, 32, 48, 64, 80, 96, 112, 128]
[160, 192, 224, 256]
[320, 384, 448, 512]
[640, 768, 896, 1024]
[1280, 1536, 1792, 2048]
[2560, 3072, 3584, 4096]
[5 KiB, 6 KiB, 7 KiB, 8 KiB]
[10 KiB, 12 KiB, 14 KiB]
[16 KiB]
[20 KiB, 24 KiB, 28 KiB, 32 KiB]
[40 KiB, 48 KiB, 54 KiB, 64 KiB]
[80 KiB, 96 KiB, 112 KiB, 128 KiB]
[160 KiB, 192 KiB, 224 KiB, 256 KiB]
[320 KiB, 384 KiB, 448 KiB, 512 KiB]
[640 KiB, 768 KiB, 896 KiB, 1 MiB]
[1280 KiB, 1536 KiB, 1792 KiB]
[2 MiB]
[2560 KiB, 3 MiB, 3584 KiB, 4 MiB]
[5 MiB, 6 MiB, 7 MiB, 8 MiB]
[10 MiB, 12 MiB, 14 MiB, 16 MiB]
[20 MiB, 24 MiB, 28 MiB, 32 MiB]
[40 MiB, 48 MiB, 56 MiB, 64 MiB]

If you request a size that isn't one of these, jemalloc will round the request up to the next size class. The exact sizes depend on the jemalloc configuration, but this kind of steadily increasing spaces between size classes will always be present with jemalloc.

FWIW, we have found it worthwhile to modify multiple vector-like classes in Firefox to maximize capacity in the way suggested by this PR.

@Gankra
Copy link
Contributor

Gankra commented Nov 15, 2015

Awesome, thanks! (I did try to verify it but couldn't find anything)

@Gankra
Copy link
Contributor

Gankra commented Nov 15, 2015

@nnethercote are you familiar with the reallocation strategies used by FBVector? They make claims of explicitly optimizing for jemalloc. When I looked into their impl it didn't seem to check usable size (but was optimized to work with the size classes to some extent).

@bluss
Copy link
Member

bluss commented Nov 15, 2015

It sounds like we don't need exactly capacity == len, as long as jemalloc gets a capacity in the right size class, but we can't special case for only jemalloc in RawVec, we support several allocators.

@nnethercote
Copy link
Contributor

I don't know anything about FBVector beyond what https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md says. It's vague about how it integrates with jemalloc.

@alexcrichton
Copy link
Member

It's probably best to keep discussion about this topic centralized, so if we're talking about whether or not the change at hand should be made let's do that on the issue and if we're talking about the technical details then let's go here instead.

Just trying to reduce the number of threads we need to follow!

@nagisa
Copy link
Member

nagisa commented Dec 2, 2015

I remember there being a claim that this doesn’t (or shouldn’t) really matter if allocation is a power-of-two, however e.g. String seems to not be using power-of-two, but rather multiplies its current capacity by two. It might be a good idea to see what usable_size works out to be in case of String and possibly other data structures based on Vec/RawVec.

@Swatinem
Copy link
Contributor Author

Swatinem commented Dec 2, 2015

Before actually coming up with this PR (which lazy me should update already!), I made a little testcase here: https://gist.github.com/Swatinem/34ac667b170a069b326d

The results show that for strings the with smaller alignment, you get totally different usable sizes (npot) than for u32 for example.

* adds a note to String::shrink_to_fit to match the same note on Vec regarding
  additional capacity
* change asserts in docs to treat capacity as a lower bound
@Swatinem Swatinem changed the title RFC: alloc: make RawVec round up capacity to usable_size alloc: make RawVec round up allocation sizes to next power-of-two Dec 9, 2015
@Swatinem
Copy link
Contributor Author

Swatinem commented Dec 9, 2015

I just updated the PR as follows:
Instead of using heap::alloc_size which incurs some overhead, I just round up the allocation request to the nearest power-of-two and use that.

I also removed the usage from with_capacity pending the discussion in #29931.

The system allocator may give more memory than requested, usually rounding
up to the next power-of-two.
This PR uses a similar logic inside RawVec to be able to make use of this
excess memory.

NB: this is *not* used for `with_capacity` and `shrink_to_fit` pending
discussion in rust-lang#29931
@bluss
Copy link
Member

bluss commented Dec 10, 2015

Rounding up to power of two sounds like it defeats the purpose of the change: rust programs will use more memory. Using usable_size is a lot better.

@bors
Copy link
Contributor

bors commented Dec 11, 2015

☔ The latest upstream changes (presumably #30148) made this pull request unmergeable. Please resolve the merge conflicts.

@nagisa
Copy link
Member

nagisa commented Dec 29, 2015

This still seems to be relevant but has stalled.

@Swatinem
Copy link
Contributor Author

I would be glad to update the patch once again. I would just need someone with the authority to make the decision which approach (using usable_size or going with next pot) to go forward with.

As a first step, just for the automatic growing.
The next step would be also to take care of the initial sizing and shrinking but that would need other changes as well.

As I said, I would happily implement all that with a little guidance/mentoring. But to be honest, I don’t really have the patience for the bikeshedding involved, since this seems to be quite controversial.

@Gankra
Copy link
Contributor

Gankra commented Dec 30, 2015

I don't think we can move forward without some sample workloads and empirical results.

@alexcrichton
Copy link
Member

Closing due to inactivity, but feel free to resubmit with some numbers as @gankro requested!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants