Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix excessive memory allocation in RawVec::reserve #29454

Merged
merged 1 commit into from
Oct 31, 2015

Conversation

stepancheg
Copy link
Contributor

Before this patch reserve function allocated twice as requested
amount elements (not twice as capacity). It leaded to unnecessary
excessive memory usage in scenarios like this:

let mut v = Vec::new();
v.push(17);
v.extend(0..10);
println!("{}", v.capacity());

Vec allocated 22 elements, while it could allocate just 11.

reserve function must have a property of keeping push operation
cost (which calls reserve) O(1). To achieve this reserve must
exponentialy grow its capacity when it does reallocation.

There's better strategy to implement reserve:

let new_capacity = max(current_capacity * 2, requested_capacity);

This strategy still guarantees that capacity grows at O(1) with
reserve, and fixes the issue with extend.

Patch imlpements this strategy.

@rust-highfive
Copy link
Collaborator

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@bors
Copy link
Contributor

bors commented Oct 29, 2015

☔ The latest upstream changes (presumably #29452) made this pull request unmergeable. Please resolve the merge conflicts.

@stepancheg
Copy link
Contributor Author

Rebased.

@bluss
Copy link
Member

bluss commented Oct 30, 2015

I like the new logic and can't see any obvious problems with it.

@amaranth
Copy link
Contributor

If we're making changes here I think a growth factor of 1.5 would be better than 2 although I'm not sure if using jemalloc changes things here. This blog post explains the reasoning but basically with 2 you are never able to reuse an old allocation.

@stepancheg
Copy link
Contributor Author

@amaranth grow factor is a separate, completely independent issue (I also think 1.5 would be better).

with 2 you are never able to reuse an old allocation

Only if your application has only one Vec.

@Gankra
Copy link
Contributor

Gankra commented Oct 30, 2015

@amaranth any static growth factor is suboptimal to my knowledge.

2 works fine for small or large vecs. 1.5 is only good for moderate-sized vecs.

#27627

@Gankra
Copy link
Contributor

Gankra commented Oct 30, 2015

Should we run some benches to make sure there aren't any issues with this? This is a no-inline cold path, so I don't think it's really necessary, but I'm not sure.

@stepancheg
Copy link
Contributor Author

@gankro I don't think bench is necessary, because code is straightforward, and it's similar to standard reserve implementations, like in java.util.ArrayList. But if you do, what a bench should look like?

This is a ... cold path

Actually, it is not cold path, for example, when you have thousands of vectors of 2..5 elements. However, I think time spent in malloc it much more than a half of time spent in reserve.

@bluss
Copy link
Member

bluss commented Oct 30, 2015

Using Read::read_to_end() on an unbuffered input file is a vector reallocation benchmark too.

@Gankra
Copy link
Contributor

Gankra commented Oct 30, 2015

r? @bluss

You seem to have a good handle on the issues at play here. I skimmed the code but am a bit fuzzy on if this will have the same crash behaviour when inserting into ZST collections. In particular, they rely on RawVec to crash to elide their own overflow checks for len.

@rust-highfive rust-highfive assigned bluss and unassigned aturon Oct 30, 2015
@stepancheg
Copy link
Contributor Author

@bluss Read::read_to_end is not a benchmark of realloc, because it does very little reallocation. Anyway, I did io::tests::bench_read_to_end benhmark, there's no difference:

test io::tests::bench_read_to_end                ... bench:  32,509,476 ns/iter (+/- 1,662,809) // master
test io::tests::bench_read_to_end                ... bench:  32,834,994 ns/iter (+/- 2,052,005) // with patch applied

@bluss
Copy link
Member

bluss commented Oct 30, 2015

@gankro Ok, way to pass a hot potato! 😄

From my reading, the behavior for ZST is exactly the same. Either the capacity fits and it returns, or it panics for capacity overflow.

I'll leave comments inline.

// If we make it past the first branch then we are guaranteed to
// panic.
let required_cap = used_cap.checked_add(needed_extra_cap)
.expect("capacity overflow");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must move this overflow check and panic case to below the required_cap <= self.cap check. While preserving this panic if it's a ZST element type.

The reason is that the fast path is the section of code before we decide if we need to reallocate or not, and it should be free from a branch to panic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first if in the original code is fine to use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I overlooked it.

Before this patch `reserve` function allocated twice as requested
amount elements (not twice as capacity).  It leaded to unnecessary
excessive memory usage in scenarios like this:

```
let mut v = Vec::new();
v.push(17);
v.extend(0..10);
println!("{}", v.capacity());
```

`Vec` allocated 22 elements, while it could allocate just 11.

`reserve` function must have a property of keeping `push` operation
cost (which calls `reserve`) `O(1)`. To achieve this `reserve` must
exponentialy grow its capacity when it does reallocation.

There's better strategy to implement `reserve`:

```
let new_capacity = max(current_capacity * 2, requested_capacity);
```

This strategy still guarantees that capacity grows at `O(1)` with
`reserve`, and fixes the issue with `extend`.

Patch imlpements this strategy.
@stepancheg
Copy link
Contributor Author

Updated patch with fast-path branch made first as before.

@stepancheg stepancheg force-pushed the vec-reserve branch 2 times, most recently from 474c20c to 46068c9 Compare October 30, 2015 21:31
@bluss
Copy link
Member

bluss commented Oct 30, 2015

Thank you! This is beautiful.

@bors r+

@bors
Copy link
Contributor

bors commented Oct 30, 2015

📌 Commit 46068c9 has been approved by bluss

bors added a commit that referenced this pull request Oct 30, 2015
Before this patch `reserve` function allocated twice as requested
amount elements (not twice as capacity).  It leaded to unnecessary
excessive memory usage in scenarios like this:

```
let mut v = Vec::new();
v.push(17);
v.extend(0..10);
println!("{}", v.capacity());
```

`Vec` allocated 22 elements, while it could allocate just 11.

`reserve` function must have a property of keeping `push` operation
cost (which calls `reserve`) `O(1)`. To achieve this `reserve` must
exponentialy grow its capacity when it does reallocation.

There's better strategy to implement `reserve`:

```
let new_capacity = max(current_capacity * 2, requested_capacity);
```

This strategy still guarantees that capacity grows at `O(1)` with
`reserve`, and fixes the issue with `extend`.

Patch imlpements this strategy.
@bors
Copy link
Contributor

bors commented Oct 30, 2015

⌛ Testing commit 46068c9 with merge 64b0277...

@bors bors merged commit 46068c9 into rust-lang:master Oct 31, 2015
@bluss bluss added the relnotes Marks issues that should be documented in the release notes of the next release. label Oct 31, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants