Tell LLVM that `partition_point` returns a valid fencepost #102535

scottmcm · 2022-10-01T06:42:16Z

This was already done for a successful `binary_search`, but this way `partition_point` can get similar optimizations.
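For context, here is a minimal sketch of the `binary_search` case mentioned above. The `find_value` helper is hypothetical and only for illustration. On `Ok(i)` the index is guaranteed to be in bounds, which is the kind of fact the optimizer can use to drop the bounds check on `v[i]`. Whether it actually does so depends on the compiler version.

```rust
/// Hypothetical helper: look up `needle` in a sorted slice.
pub fn find_value(v: &[u32], needle: u32) -> Option<u32> {
    match v.binary_search(&needle) {
        // On success `i` is a valid index into `v`, so `v[i]` cannot panic.
        Ok(i) => Some(v[i]),
        // On failure the value is an insertion point, which may equal `v.len()`.
        Err(_) => None,
    }
}

fn main() {
    let v = [1, 3, 5, 7];
    assert_eq!(find_value(&v, 5), Some(5));
    assert_eq!(find_value(&v, 4), None);
}
```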

Demonstration that nightly can't do this optimization today, and leaves in the panicking path: https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=e1074cd2faf5f68e49cffd728ded243a
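As a rough illustration of the pattern in question (this is not the playground code, which is not reproduced here), consider a hypothetical `split_sorted` helper that splits a slice at its partition point. `split_at` panics when the index exceeds `len()`, so unless the compiler knows the returned index is a valid fencepost, it has to keep that check and its panicking path.

```rust
/// Hypothetical helper: split a slice that is partitioned by `< pivot`
/// into the elements below the pivot and the rest.
pub fn split_sorted(v: &[i32], pivot: i32) -> (&[i32], &[i32]) {
    // The partition point is an index in `0..=v.len()`.
    let mid = v.partition_point(|&x| x < pivot);
    // `split_at` panics if `mid > v.len()`. That cannot happen here, but the
    // optimizer can only remove the check if it knows the bound on `mid`.
    v.split_at(mid)
}

fn main() {
    let v = [1, 2, 3, 5, 8, 13];
    let (low, high) = split_sorted(&v, 5);
    assert_eq!(low, [1, 2, 3]);
    assert_eq!(high, [5, 8, 13]);
}
```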

r? @thomcc

Kobzol · 2022-10-01T07:02:31Z

Using `partition_point` recently helped the performance of rustc itself; let's see if this has any additional effect.

@bors try @rust-timer queue

rust-timer · 2022-10-01T07:02:32Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-10-01T07:02:40Z

⌛ Trying commit c7af338 with merge 7c6425f66f7ebc2482f0812dfe07a02a67b36030...

scottmcm · 2022-10-01T07:48:06Z

I'd be extremely surprised if this showed up in perf: the branch is probably perfectly predicted, since LLVM knows the panic is cold, and this doesn't help the line lookup case.

The following use is the kind of thing that this PR should improve, but it's probably not hot enough to show up:

rust/compiler/rustc_data_structures/src/sorted_map/index_map.rs, lines 98 to 99 in 5a7e4c6:

    let lower_bound = self.idx_sorted_by_item_key.partition_point(|&i| self.items[i].0 < key);
    self.idx_sorted_by_item_key[lower_bound..].iter().map_while(move |&i| {
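The excerpt above is cut off mid-closure, so here is a self-contained sketch of the same lower-bound-then-slice pattern. It is a simplified stand-in, not the rustc code: the `IndexMap` type below, its `u32` keys, its `&'static str` values, and the body of the `map_while` closure are all assumptions made for illustration.

```rust
/// Hypothetical, simplified stand-in for a sorted index map: `items` holds
/// the entries in insertion order, and `idx_sorted_by_item_key` holds the
/// indices into `items`, sorted by key.
pub struct IndexMap {
    items: Vec<(u32, &'static str)>,
    idx_sorted_by_item_key: Vec<usize>,
}

impl IndexMap {
    pub fn new(items: Vec<(u32, &'static str)>) -> Self {
        let mut idx: Vec<usize> = (0..items.len()).collect();
        // A stable sort keeps insertion order among entries with equal keys.
        idx.sort_by_key(|&i| items[i].0);
        IndexMap { items, idx_sorted_by_item_key: idx }
    }

    /// All values stored under `key`, in insertion order.
    pub fn get_by_key(&self, key: u32) -> impl Iterator<Item = &'static str> + '_ {
        let lower_bound = self
            .idx_sorted_by_item_key
            .partition_point(|&i| self.items[i].0 < key);
        // `lower_bound` is at most the length of the index vector, so this
        // range slice is always in bounds.
        self.idx_sorted_by_item_key[lower_bound..]
            .iter()
            .map_while(move |&i| {
                let (k, v) = self.items[i];
                // Stop at the first entry whose key no longer matches.
                (k == key).then_some(v)
            })
    }
}

fn main() {
    let map = IndexMap::new(vec![(3, "c"), (1, "a"), (3, "d"), (2, "b")]);
    let hits: Vec<_> = map.get_by_key(3).collect();
    assert_eq!(hits, ["c", "d"]);
    assert!(map.get_by_key(4).next().is_none());
}
```

The slice expression `[lower_bound..]` is where a bounds check would otherwise be emitted, which is why knowing that the partition point is a valid fencepost matters for this kind of code.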

bors · 2022-10-01T08:53:02Z

☀️ Try build successful - checks-actions
Build commit: 7c6425f66f7ebc2482f0812dfe07a02a67b36030 (7c6425f66f7ebc2482f0812dfe07a02a67b36030)

rust-timer · 2022-10-01T08:53:04Z

Queued 7c6425f66f7ebc2482f0812dfe07a02a67b36030 with parent de341fe, future comparison URL.

rust-timer · 2022-10-01T10:08:22Z

Finished benchmarking commit (7c6425f66f7ebc2482f0812dfe07a02a67b36030): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean¹ | range | count² |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.4% | [1.2%, 1.6%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean¹ | range | count² |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 4.7% | [3.6%, 6.4%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

This benchmark run did not return any relevant results for this metric.

¹ the arithmetic mean of the percent change
² number of relevant changes

Kobzol · 2022-10-01T10:13:37Z

Ok, basically no changes.

thomcc · 2022-10-02T02:03:21Z

This is great, thanks!

@bors r+

bors · 2022-10-02T02:03:22Z

📌 Commit c7af338 has been approved by thomcc

It is now in the queue for this repository.

bors · 2022-10-02T07:11:19Z

⌛ Testing commit c7af338 with merge c2590e6...

bors · 2022-10-02T09:53:02Z

☀️ Test successful - checks-actions
Approved by: thomcc
Pushing c2590e6 to master...

rust-timer · 2022-10-02T13:21:14Z

Finished benchmarking commit (c2590e6): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean¹ | range | count² |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -3.2% | [-3.2%, -3.2%] | 1 |
| Improvements ✅ (secondary) | -4.3% | [-4.3%, -4.3%] | 1 |
| All ❌✅ (primary) | -3.2% | [-3.2%, -3.2%] | 1 |

Cycles

This benchmark run did not return any relevant results for this metric.

¹ the arithmetic mean of the percent change
² number of relevant changes

Tell LLVM that partition_point returns a valid fencepost

c7af338

rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Oct 1, 2022

rust-highfive assigned thomcc Oct 1, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 1, 2022

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 1, 2022

scottmcm mentioned this pull request Oct 1, 2022

Improve bounds check for function that always return in-bounds index #98258

Open

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 1, 2022

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 2, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 2, 2022

bors merged commit c2590e6 into rust-lang:master Oct 2, 2022

rustbot added this to the 1.66.0 milestone Oct 2, 2022

scottmcm deleted the optimize-split-at-partition-point branch October 18, 2022 03:46

the8472 mentioned this pull request May 16, 2025

Optimize: core slice binary_search_by #141097

Closed
