Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BTreeMap/BTreeSet::from_iter: use bulk building to improve the performance #88448

Merged
merged 3 commits into from
Sep 7, 2021

Conversation

xu-cheng
Copy link
Contributor

Bulk building is a common technique to increase the performance of building a fresh btree map. Instead of inserting items one-by-one, we sort all the items beforehand then create the BtreeMap in bulk.

Benchmark

./x.py bench library/alloc --test-args btree::map::from_iter
  • Before
test btree::map::from_iter_rand_100                      ... bench:       3,694 ns/iter (+/- 840)
test btree::map::from_iter_rand_10_000                   ... bench:   1,033,446 ns/iter (+/- 192,950)
test btree::map::from_iter_seq_100                       ... bench:       5,689 ns/iter (+/- 1,259)
test btree::map::from_iter_seq_10_000                    ... bench:     861,033 ns/iter (+/- 118,815)
  • After
test btree::map::from_iter_rand_100                      ... bench:       3,033 ns/iter (+/- 707)
test btree::map::from_iter_rand_10_000                   ... bench:     775,958 ns/iter (+/- 105,152)
test btree::map::from_iter_seq_100                       ... bench:       2,969 ns/iter (+/- 336)
test btree::map::from_iter_seq_10_000                    ... bench:     258,292 ns/iter (+/- 29,364)

Bulk building is a common technique to increase the performance of
building a fresh btree map. Instead of inserting items one-by-one,
we sort all the items beforehand then create the BtreeMap in bulk.
Apply the same optimization as BTreeMap::from_iter.
@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 29, 2021
@Mark-Simulacrum
Copy link
Member

@bors try @rust-timer queue

Generally this seems OK, though I am mildly wary of the added code complexity. I don't think it's anything too worrisome though and in practice sorting is not typically that complicated.

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 4, 2021
@bors
Copy link
Contributor

bors commented Sep 4, 2021

⌛ Trying commit a03287b with merge 86a2710516f1f19a00691b2c64e6567ecbadca63...

@bors
Copy link
Contributor

bors commented Sep 4, 2021

☀️ Try build successful - checks-actions
Build commit: 86a2710516f1f19a00691b2c64e6567ecbadca63 (86a2710516f1f19a00691b2c64e6567ecbadca63)

@rust-timer
Copy link
Collaborator

Queued 86a2710516f1f19a00691b2c64e6567ecbadca63 with parent 226e181, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (86a2710516f1f19a00691b2c64e6567ecbadca63): comparison url.

Summary: This change led to large relevant mixed results 🤷 in compiler performance.

  • Moderate improvement in instruction counts (up to -0.6% on full builds of ctfe-stress-4)
  • Large regression in instruction counts (up to 4.6% on incr-patched: println builds of clap-rs)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Sep 4, 2021
@Mark-Simulacrum
Copy link
Member

It looks like the added codegen for BTree is increasing the work LLVM needs to do by a good bit -- 4.6% is fairly sizeable. While this does increase the performance of long iterators collecting into BTree's, I'm not sure it is worth the compile time cost here. That said, ultimately, it seems like the code being added by this PR isn't doing anything too interesting or unexpected, so I am generally inclined to move forward. The runtime performance improvement is pretty nice.

So, I'm going to say that this regression is justified -- we are improving the runtime performance, and the regression appears to be entirely due to more codegen.

@rustbot label: +perf-regression-triaged

@bors r+

@bors
Copy link
Contributor

bors commented Sep 6, 2021

📌 Commit a03287b has been approved by Mark-Simulacrum

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Sep 6, 2021
@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 6, 2021
@bors
Copy link
Contributor

bors commented Sep 7, 2021

⌛ Testing commit a03287b with merge ffaf857...

@bors
Copy link
Contributor

bors commented Sep 7, 2021

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing ffaf857 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 7, 2021
@bors bors merged commit ffaf857 into rust-lang:master Sep 7, 2021
@rustbot rustbot added this to the 1.57.0 milestone Sep 7, 2021
@xu-cheng xu-cheng deleted the btree-blk-build branch September 7, 2021 04:54
@pnkfelix
Copy link
Member

pnkfelix commented Sep 8, 2021

As predicted, during weekly performance triage this was tagged as causing mixed changes to rustc instruction counts.

Its already been appropriately labelled; I'm just adding the link above for ease of reference in future.

@tanriol
Copy link
Contributor

tanriol commented Sep 17, 2021

Sounds like from_iter's new memory behavior should at least be documented somewhere.

bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 9, 2022
…8472

BTreeSet: avoid intermediate sorting when collecting sorted iterators

As [pointed out by droundy](https://users.rust-lang.org/t/question-about-btreeset-implementation/76427), an obvious optimization is to skip the first step introduced by rust-lang#88448 (creation of a vector and sorting) and it's easy to do so for btree's own iterators. Also, exploit `from` in the examples.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants