Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use String::with_capacity in format! #39356

Merged
merged 4 commits into from
Feb 3, 2017
Merged

Conversation

krdln
Copy link
Contributor

@krdln krdln commented Jan 28, 2017

Add an Arguments::estimated_capacity to estimate the length of formatted text and use it in std::fmt::format as the initial capacity of the buffer.

The capacity is calculated based on the literal parts of format string, see the details in the implementation.

Some benches:

empty:       format!("{}", black_box(""))
literal:     format!("Literal")
long:        format!("Hello Hello Hello Hello, {}!", black_box("world"))
long_rev:    format!("{}, hello hello hello hello!", black_box("world"))
long_rev_2:  format!("{}{}, hello hello hello hello!", 1, black_box("world"))
short:       format!("Hello, {}!", black_box("world"))
short_rev:   format!("{}, hello!", black_box("world"))
short_rev_2: format!("{}{}, hello!", 1, black_box("world"))
surround:    format!("aaaaa{}ccccc{}eeeee", black_box("bbbbb"), black_box("eeeee"))
two_spaced:  format!("{} {}", black_box("bbbbb"), black_box("eeeee"))
worst_case:  format!("{} a long piece...", black_box("and even longer argument. not sure why it has to be so long"))
 empty        25            28                      3   12.00% 
 literal      35            29                     -6  -17.14% 
 long         80            46                    -34  -42.50% 
 long_rev     79            45                    -34  -43.04% 
 long_rev_2   111           66                    -45  -40.54% 
 short        73            46                    -27  -36.99% 
 short_rev    74            76                      2    2.70% 
 short_rev_2  107           108                     1    0.93% 
 surround     142           65                    -77  -54.23% 
 two_spaced   111           115                     4    3.60% 
 worst_case   89            101                    12   13.48% 

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @aturon (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@krdln
Copy link
Contributor Author

krdln commented Jan 28, 2017

I'm still not sure if I'm doing testing right and if the benches should go into the repo (I haven't seen any format!-related benches, so I didn't add any).

Also, the capacity estimation might be more ambitious – eg. we could iterate over all str-like arguments and add their length to the capacity, but that would require some virtual calls and more complexity, while I wanted to stay with simple heuristics that will most likely be compiled down to a constant.

Also, I'm not quite sure why the empty_format got slower. The assembly looked similar to me.

assert_eq!(format_args!("Hello").estimated_capacity(), 5);
assert_eq!(format_args!("Hello, {}!", "").estimated_capacity(), 16);
assert_eq!(format_args!("{}, hello!", "World").estimated_capacity(), 16);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test for

format_args!("{} {} {}", "12characters", "13_characters", "14__characters")

I am curious what estimated capacity will.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also measure time.

Copy link
Contributor Author

@krdln krdln Feb 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous algorithm would set the capacity to 8 (two spaces times two, rounded up to eight). I've updated the algorithm so it doesn't try to be smart in your case and also added some more benchmarks.

Also measure time.

You mean to add the benches to repo? I'm not sure where exactly they belong.

@aturon
Copy link
Member

aturon commented Jan 31, 2017

cc @rust-lang/libs, any thoughts on this micro-optimization? There are some benchmark results showing improvement, and the change is pretty light in terms of maintenance burden (given that it's just calculating a capacity).

Personally I'm 👍 on this change.

@BurntSushi
Copy link
Member

@aturon LGTM.

@krdln
Copy link
Contributor Author

krdln commented Jan 31, 2017

I will need to address the point made by @KalitaAlexey and adjust the heuristics slightly – current implementation is likely to reallocate when the format string starts with {}.

let multiplier = if self.args.is_empty() { W(1) } else { W(2) };

let capacity = multiplier * pieces_length;
if multiplier == W(2) && (W(1)..W(8)).contains(capacity) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is .contains here just serving as an alternative to <=? If so, could that be used instead?

pub fn estimated_capacity(&self) -> usize {
// Using wrapping arithmetics in this function, because
// wrong result is highly unlikely and doesn't cause unsafety.
use ::num::Wrapping as W;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is wrapping really necessary here? If we're wrapping around the max value of usize we surely want to just skip estimation entirely, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we'd really wrap, then either the program would fail from OOM anyway, or we'd just underestimate the capacity (in case of multiplication by 2). I've used Wrapping to avoid sprinkling the code with early returns, so it's easier to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was just struck by the oddity of reaching for a wrapping type. If we wanted to be "correct" I'd expect saturating arithmetic here rather than wrapping. In both cases though something has gone seriously wrong if we wrap at all.

In that sense I'd probably argue that normal arithmetic is what I'd expect here: panic in debug and don't worry about it in optimized mode.

// double in size. So we're pre-doubling it here.
let multiplier = if self.args.is_empty() { W(1) } else { W(2) };

let capacity = multiplier * pieces_length;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why's this multiplying by the number of pieces? Those all have a statically known size, and we're guessing to the size of args, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's multiplying overall length of all pieces by either 1 or 2. The idea is that if there are any arguments, the string buffer will grow on the next push. To avoid this reallocation, we're pre-doubling the capacity. I've changed the code so it's more clear. Should I change pieces_length to something else too?

@alexcrichton
Copy link
Member

@aturon I'm not a huge fan of repeated micro-optimization but I don't have any concrete points against this.

@alexcrichton
Copy link
Member

Thanks for the reorganization! To me at least it's definitely easier to read.

For the last clause here, we're assuming that if there's arguments the length of the format string is going to be twice the length of string literals. I would naively, however, expect something like:

pieces_length + heuristic(self.args.len())

Where we'd assume that all args are like 8 bytes or something like that and then we'd add on the literal length of the pieces. In practice I don't really know how these'd play out, just what I expected.

@krdln
Copy link
Contributor Author

krdln commented Feb 2, 2017

@alexcrichton
I was trying to be conservative here though and tried to think of the simplest solution that won't overallocate – suddenly making someone else's code more memory hungry is a bad thing. I agree that the "2×" heuristics would be usually too small for small pieces_length and too big for long pieces_length, but it never allocates more than two times the necessary memory. It works almost exactly like you'd reserve exactly pieces_length and let backing vector double in size on next push (modulo some edge cases). I don't really have a good idea to make it depend on args.len() and always fit in "2×" range. I'm happy to hear any ideas!

Anyway, here's a quite crazy idea: Always allocate at least about 100 bytes (a line), and if it exceeds the "2×", just trim it afterwards. Or maybe let the format use a small stack-buffer?

Good points regarding the Wrapping, it really seems unnecessary, I'll get rid of them and push.

@krdln
Copy link
Contributor Author

krdln commented Feb 2, 2017

I've still left a checked_add in the "×2" case, since I could image a string taking more than half a usize on architectures where usize is 16 bit. That may be totally premature, but doesn't hurt to do.

@alexcrichton
Copy link
Member

This seems reasonable to me implementation-wise, thanks! I'd also be ok w/ a blank "allocate this many bytes and shrink afterwards", but we'd want to benchmark that to ensure shrinking didn't cause problems.

@krdln
Copy link
Contributor Author

krdln commented Feb 2, 2017

@alexcrichton
I've made a test with "alloc 100 bytes and shrink if result smaller than 50" and it indeed fixes some worst cases, but on the short ones – when the shrink occurs – it slows down by about 65% (compared to this PR; compared to status quo, it's still a small win). So I think it's not worth it. We could definitely experiment with some stack-backed string, but that'd be a lot more complex change.

@aturon
Copy link
Member

aturon commented Feb 2, 2017

This PR looks much better with the recent changes; thank you!

@bors: r+

@bors
Copy link
Contributor

bors commented Feb 2, 2017

📌 Commit ecda7f3 has been approved by aturon

@bors
Copy link
Contributor

bors commented Feb 3, 2017

☔ The latest upstream changes (presumably #39287) made this pull request unmergeable. Please resolve the merge conflicts.

@aturon
Copy link
Member

aturon commented Feb 3, 2017

@bors r+

@bors
Copy link
Contributor

bors commented Feb 3, 2017

📌 Commit 0267529 has been approved by aturon

@bors
Copy link
Contributor

bors commented Feb 3, 2017

⌛ Testing commit 0267529 with merge 86d9ed6...

bors added a commit that referenced this pull request Feb 3, 2017
Use `String::with_capacity` in `format!`

Add an `Arguments::estimated_capacity` to estimate the length of formatted text and use it in `std::fmt::format` as the initial capacity of the buffer.

The capacity is calculated based on the literal parts of format string, see the details in the implementation.

Some benches:
```rust
empty:       format!("{}", black_box(""))
literal:     format!("Literal")
long:        format!("Hello Hello Hello Hello, {}!", black_box("world"))
long_rev:    format!("{}, hello hello hello hello!", black_box("world"))
long_rev_2:  format!("{}{}, hello hello hello hello!", 1, black_box("world"))
short:       format!("Hello, {}!", black_box("world"))
short_rev:   format!("{}, hello!", black_box("world"))
short_rev_2: format!("{}{}, hello!", 1, black_box("world"))
surround:    format!("aaaaa{}ccccc{}eeeee", black_box("bbbbb"), black_box("eeeee"))
two_spaced:  format!("{} {}", black_box("bbbbb"), black_box("eeeee"))
worst_case:  format!("{} a long piece...", black_box("and even longer argument. not sure why it has to be so long"))
```
```
 empty        25            28                      3   12.00%
 literal      35            29                     -6  -17.14%
 long         80            46                    -34  -42.50%
 long_rev     79            45                    -34  -43.04%
 long_rev_2   111           66                    -45  -40.54%
 short        73            46                    -27  -36.99%
 short_rev    74            76                      2    2.70%
 short_rev_2  107           108                     1    0.93%
 surround     142           65                    -77  -54.23%
 two_spaced   111           115                     4    3.60%
 worst_case   89            101                    12   13.48%
```
@bors
Copy link
Contributor

bors commented Feb 3, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: aturon
Pushing 86d9ed6 to master...

@bors bors merged commit 0267529 into rust-lang:master Feb 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants