BTreeMap/BTreeSet::from_iter: use bulk building to improve the performance #88448

xu-cheng · 2021-08-29T00:26:42Z

Bulk building is a common technique to increase the performance of building a fresh btree map. Instead of inserting items one-by-one, we sort all the items beforehand then create the BtreeMap in bulk.

Benchmark

./x.py bench library/alloc --test-args btree::map::from_iter

Before

test btree::map::from_iter_rand_100                      ... bench:       3,694 ns/iter (+/- 840)
test btree::map::from_iter_rand_10_000                   ... bench:   1,033,446 ns/iter (+/- 192,950)
test btree::map::from_iter_seq_100                       ... bench:       5,689 ns/iter (+/- 1,259)
test btree::map::from_iter_seq_10_000                    ... bench:     861,033 ns/iter (+/- 118,815)

After

test btree::map::from_iter_rand_100                      ... bench:       3,033 ns/iter (+/- 707)
test btree::map::from_iter_rand_10_000                   ... bench:     775,958 ns/iter (+/- 105,152)
test btree::map::from_iter_seq_100                       ... bench:       2,969 ns/iter (+/- 336)
test btree::map::from_iter_seq_10_000                    ... bench:     258,292 ns/iter (+/- 29,364)

Bulk building is a common technique to increase the performance of building a fresh btree map. Instead of inserting items one-by-one, we sort all the items beforehand then create the BtreeMap in bulk.

Apply the same optimization as BTreeMap::from_iter.

rust-highfive · 2021-08-29T00:26:45Z

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

library/alloc/src/collections/btree/map.rs

Mark-Simulacrum · 2021-09-04T16:26:54Z

@bors try @rust-timer queue

Generally this seems OK, though I am mildly wary of the added code complexity. I don't think it's anything too worrisome though and in practice sorting is not typically that complicated.

rust-timer · 2021-09-04T16:26:56Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-09-04T16:27:04Z

⌛ Trying commit a03287b with merge 86a2710516f1f19a00691b2c64e6567ecbadca63...

bors · 2021-09-04T17:32:39Z

☀️ Try build successful - checks-actions
Build commit: 86a2710516f1f19a00691b2c64e6567ecbadca63 (86a2710516f1f19a00691b2c64e6567ecbadca63)

rust-timer · 2021-09-04T17:32:41Z

Queued 86a2710516f1f19a00691b2c64e6567ecbadca63 with parent 226e181, future comparison URL.

rust-timer · 2021-09-04T20:41:59Z

Finished benchmarking try commit (86a2710516f1f19a00691b2c64e6567ecbadca63): comparison url.

Summary: This change led to large relevant mixed results 🤷 in compiler performance.

Moderate improvement in instruction counts (up to -0.6% on full builds of ctfe-stress-4)
Large regression in instruction counts (up to 4.6% on incr-patched: println builds of clap-rs)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Mark-Simulacrum · 2021-09-06T13:58:25Z

It looks like the added codegen for BTree is increasing the work LLVM needs to do by a good bit -- 4.6% is fairly sizeable. While this does increase the performance of long iterators collecting into BTree's, I'm not sure it is worth the compile time cost here. That said, ultimately, it seems like the code being added by this PR isn't doing anything too interesting or unexpected, so I am generally inclined to move forward. The runtime performance improvement is pretty nice.

So, I'm going to say that this regression is justified -- we are improving the runtime performance, and the regression appears to be entirely due to more codegen.

@rustbot label: +perf-regression-triaged

@bors r+

bors · 2021-09-06T13:58:27Z

📌 Commit a03287b has been approved by Mark-Simulacrum

bors · 2021-09-07T02:24:19Z

⌛ Testing commit a03287b with merge ffaf857...

bors · 2021-09-07T04:53:43Z

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing ffaf857 to master...

pnkfelix · 2021-09-08T15:09:17Z

As predicted, during weekly performance triage this was tagged as causing mixed changes to rustc instruction counts.

Its already been appropriately labelled; I'm just adding the link above for ease of reference in future.

tanriol · 2021-09-17T10:14:13Z

Sounds like from_iter's new memory behavior should at least be documented somewhere.

…8472 BTreeSet: avoid intermediate sorting when collecting sorted iterators As [pointed out by droundy](https://users.rust-lang.org/t/question-about-btreeset-implementation/76427), an obvious optimization is to skip the first step introduced by rust-lang#88448 (creation of a vector and sorting) and it's easy to do so for btree's own iterators. Also, exploit `from` in the examples.

xu-cheng added 3 commits August 28, 2021 17:18

add benchmark for BTreeMap::from_iter

6a6885c

BTreeMap::from_iter: use bulk building to improve the performance

cf814d6

Bulk building is a common technique to increase the performance of building a fresh btree map. Instead of inserting items one-by-one, we sort all the items beforehand then create the BtreeMap in bulk.

BTreeSet::from_iter: use bulk building to improve the performance

a03287b

Apply the same optimization as BTreeMap::from_iter.

rust-highfive assigned Mark-Simulacrum Aug 29, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 29, 2021

the8472 reviewed Aug 29, 2021

View reviewed changes

library/alloc/src/collections/btree/map.rs Show resolved Hide resolved

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 4, 2021

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Sep 4, 2021

rustbot added the perf-regression-triaged The performance regression has been triaged. label Sep 6, 2021

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 6, 2021

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 7, 2021

bors merged commit ffaf857 into rust-lang:master Sep 7, 2021

rustbot added this to the 1.57.0 milestone Sep 7, 2021

xu-cheng deleted the btree-blk-build branch September 7, 2021 04:54

ssomers mentioned this pull request Mar 6, 2022

BTreeMap::entry: Avoid allocating if no insertion #92962

Merged

ssomers mentioned this pull request Jun 8, 2022

BTreeSet: avoid intermediate sorting when collecting sorted iterators #97868

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BTreeMap/BTreeSet::from_iter: use bulk building to improve the performance #88448

BTreeMap/BTreeSet::from_iter: use bulk building to improve the performance #88448

xu-cheng commented Aug 29, 2021

rust-highfive commented Aug 29, 2021

Mark-Simulacrum commented Sep 4, 2021

rust-timer commented Sep 4, 2021

bors commented Sep 4, 2021

bors commented Sep 4, 2021

rust-timer commented Sep 4, 2021

rust-timer commented Sep 4, 2021

Mark-Simulacrum commented Sep 6, 2021

bors commented Sep 6, 2021

bors commented Sep 7, 2021

bors commented Sep 7, 2021

pnkfelix commented Sep 8, 2021

tanriol commented Sep 17, 2021

BTreeMap/BTreeSet::from_iter: use bulk building to improve the performance #88448

BTreeMap/BTreeSet::from_iter: use bulk building to improve the performance #88448

Conversation

xu-cheng commented Aug 29, 2021

rust-highfive commented Aug 29, 2021

Mark-Simulacrum commented Sep 4, 2021

rust-timer commented Sep 4, 2021

bors commented Sep 4, 2021

bors commented Sep 4, 2021

rust-timer commented Sep 4, 2021

rust-timer commented Sep 4, 2021

Mark-Simulacrum commented Sep 6, 2021

bors commented Sep 6, 2021

bors commented Sep 7, 2021

bors commented Sep 7, 2021

pnkfelix commented Sep 8, 2021

tanriol commented Sep 17, 2021