Remove some redundant checks from BufReader #98748

saethlin · 2022-07-01T02:05:07Z

The implementation of BufReader contains a lot of redundant checks. No single check is particularly expensive to execute, but taken together they dramatically inhibit LLVM's ability to make subsequent optimizations: they confuse data flow and increase the code size of anything that uses BufReader.

In particular, these changes give a ~2x improvement on the benchmark that this PR adds a black_box to. I'm adding that black_box here just in case LLVM gets clever enough to remove the reads entirely. Right now it can't, but these optimizations are really setting it up to do so.
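For readers unfamiliar with the pattern, here is a minimal sketch of how `black_box` keeps small buffered reads observable to the optimizer. It is not the benchmark from this PR: the function name `read_all_bytewise` is hypothetical, and it assumes a toolchain where `std::hint::black_box` is stable.

```rust
use std::hint::black_box;
use std::io::{BufReader, Read};

/// Hypothetical helper: reads `input` one byte at a time through a
/// `BufReader` and returns how many bytes were read.
pub fn read_all_bytewise(input: &[u8]) -> usize {
    // `&[u8]` implements `Read`, so it can stand in for a file or socket.
    let mut reader = BufReader::new(input);
    let mut byte = [0u8; 1];
    let mut count = 0;
    // `read` returns Ok(0) at end of input, which ends the loop.
    while reader.read(&mut byte).expect("reading from a slice cannot fail") == 1 {
        // Ask the optimizer to treat the byte as used, so the read
        // itself is not a candidate for removal as dead code.
        black_box(byte[0]);
        count += 1;
    }
    count
}
```

Calling `read_all_bytewise(b"hello")` walks the five bytes through the buffered reader; in a real benchmark the loop body would sit inside the timing harness.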

We get this optimization by factoring all the actual buffer management and bounds-checking logic into a new module inside bufreader with a new Buffer type. This makes it much easier to ensure that we have correctly encapsulated the management of the region of the buffer that we have read bytes into, and it lets us provide a new faster way to do small reads. Buffer::consume_with lets a caller do a read from the buffer with a single bounds check, instead of the double-check that's required to use buffer + consume.
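As a rough illustration of the single-check idea, the sketch below models a simplified `Buffer` backed by a `Vec<u8>`. The field names, `from_bytes`, and `read_u32_le` are assumptions made for this example; the real type in `library/std` manages its storage differently, so treat this as a sketch of the technique rather than the actual implementation.

```rust
/// Simplified stand-in for the PR's `Buffer` (an assumption for this sketch):
/// the bytes in `buf[pos..filled]` are buffered and not yet consumed.
pub struct Buffer {
    buf: Vec<u8>,
    pos: usize,
    filled: usize,
}

impl Buffer {
    /// Hypothetical constructor: start with `bytes` already buffered.
    pub fn from_bytes(bytes: &[u8]) -> Self {
        Buffer { buf: bytes.to_vec(), pos: 0, filled: bytes.len() }
    }

    /// The unread part of the buffer. This sketch uses safe slicing here,
    /// which carries its own checks that a tuned implementation could avoid.
    pub fn buffer(&self) -> &[u8] {
        &self.buf[self.pos..self.filled]
    }

    /// Classic consume: clamps `amt`, so a caller that already checked the
    /// length of `buffer()` ends up paying for a second comparison.
    pub fn consume(&mut self, amt: usize) {
        self.pos = std::cmp::min(self.pos + amt, self.filled);
    }

    /// Hands exactly `amt` unread bytes to `visitor` and consumes them,
    /// or returns false and consumes nothing if fewer are buffered.
    pub fn consume_with<V: FnMut(&[u8])>(&mut self, amt: usize, mut visitor: V) -> bool {
        // The single check: `get` fails if fewer than `amt` bytes are unread.
        if let Some(claimed) = self.buffer().get(..amt) {
            visitor(claimed);
            // `get` succeeded, so `pos + amt` cannot exceed `filled`.
            self.pos += amt;
            true
        } else {
            false
        }
    }
}

/// Hypothetical small read built on `consume_with`.
pub fn read_u32_le(buffer: &mut Buffer) -> Option<u32> {
    let mut out = [0u8; 4];
    if buffer.consume_with(4, |bytes| out.copy_from_slice(bytes)) {
        Some(u32::from_le_bytes(out))
    } else {
        None
    }
}
```

With six bytes buffered, a first `read_u32_le` succeeds and a second returns `None` while leaving the two remaining bytes in place, which is where a real reader would fall back to refilling the buffer.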

Unfortunately I'm not aware of a lot of open-source usage of BufReader in perf-critical environments. Some time ago I tweaked this code because I saw BufReader in a profile at work, and I contributed some benchmarks to the bincode crate which exercise BufReader::buffer. These changes appear to help those benchmarks a little, but all these sorts of benchmarks are kind of fragile so I'm wary of quoting anything specific.

rustbot · 2022-07-01T02:05:11Z

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

Stabilizing library features
Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
Changing public documentation in ways that create new stability guarantees
Changing observable runtime behavior of library APIs

rust-highfive · 2022-07-01T02:05:11Z

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

library/std/src/io/buffered/bufreader.rs

Mark-Simulacrum · 2022-07-14T00:40:09Z

@rustbot author

saethlin · 2022-07-18T01:17:49Z

@rustbot ready

library/std/src/io/buffered/bufreader/buffer.rs

library/std/src/io/buffered/bufreader.rs

The implementation of BufReader contains a lot of redundant checks. While any one of these checks is not particularly expensive to execute, especially when taken together they dramatically inhibit LLVM's ability to make subsequent optimizations.

Mark-Simulacrum · 2022-07-27T00:47:11Z

@bors try @rust-timer queue

r=me, though would be great to update PR description as well.

rust-timer · 2022-07-27T00:47:13Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-07-27T00:47:21Z

⌛ Trying commit 5fa1926 with merge 6973e88d0ee38239c9a7dd0268fc3a0e2e2f4bf8...

saethlin · 2022-07-27T00:52:58Z

Updated. Let me know if you think there's any other adjustment that should be made.

Mark-Simulacrum · 2022-07-27T02:05:12Z

Looks great! I want to wait for the perf run to come back -- though I expect it to be mostly neutral, the compiler isn't really I/O heavy for most things, hopefully :)

bors · 2022-07-27T02:38:46Z

☀️ Try build successful - checks-actions
Build commit: 6973e88d0ee38239c9a7dd0268fc3a0e2e2f4bf8 (6973e88d0ee38239c9a7dd0268fc3a0e2e2f4bf8)

rust-timer · 2022-07-27T02:38:47Z

Queued 6973e88d0ee38239c9a7dd0268fc3a0e2e2f4bf8 with parent 4d6d601, future comparison URL.

rust-timer · 2022-07-27T04:55:48Z

Finished benchmarking commit (6973e88d0ee38239c9a7dd0268fc3a0e2e2f4bf8): comparison url.

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

Primary benchmarks: 😿 relevant regressions found
Secondary benchmarks: mixed results

	mean¹	max	count²
Regressions 😿 (primary)	2.0%	2.3%	8
Regressions 😿 (secondary)	3.0%	3.0%	1
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	-5.1%	-5.2%	2
All 😿🎉 (primary)	2.0%	2.3%	8

Cycles

Results

Primary benchmarks: 🎉 relevant improvement found
Secondary benchmarks: 😿 relevant regression found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	2.3%	2.3%	1
Improvements 🎉 (primary)	-6.5%	-6.5%	1
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	-6.5%	-6.5%	1

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes

Mark-Simulacrum · 2022-07-27T09:35:21Z

@bors r+

Looks like mostly noise.

bors · 2022-07-27T09:35:22Z

📌 Commit 5fa1926 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

bors · 2022-07-27T09:49:09Z

⌛ Testing commit 5fa1926 with merge 50166d5...

bors · 2022-07-27T12:32:48Z

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 50166d5 to master...

rust-timer · 2022-07-27T14:35:27Z

Finished benchmarking commit (50166d5): comparison url.

Instruction count

Primary benchmarks: no relevant changes found
Secondary benchmarks: 🎉 relevant improvement found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	N/A	N/A	0
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	-0.3%	-0.3%	1
All 😿🎉 (primary)	N/A	N/A	0

Max RSS (memory usage)

Results

Primary benchmarks: 😿 relevant regressions found
Secondary benchmarks: mixed results

	mean¹	max	count²
Regressions 😿 (primary)	1.9%	3.3%	2
Regressions 😿 (secondary)	2.6%	2.6%	2
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	-5.0%	-5.0%	2
All 😿🎉 (primary)	1.9%	3.3%	2

Cycles

This benchmark run did not return any relevant results for this metric.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

¹ the arithmetic mean of the percent change
² number of relevant changes


Avoid repeated re-initialization of the BufReader buffer

Fixes rust-lang/rust#102727

We accidentally removed this in rust-lang/rust#98748. It looks so redundant. But it isn't.

The default `Read::read_buf` will defensively initialize the whole buffer, if any of it is indicated to be uninitialized. In uses where reads from the wrapped `Read` impl completely fill the `BufReader`, `initialized` and `filled` are the same, and this extra member isn't required. But in the reported issue, the `BufReader` wraps a `Read` impl which will _never_ fill the whole buffer. So the default `Read::read_buf` implementation repeatedly re-initializes the extra space in the buffer.

This adds back the extra `initialized` member, which ensures that the default `Read::read_buf` only zero-initializes the buffer once, and I've tried to add a comment which explains this whole situation.
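The cost described in this follow-up can be illustrated with a toy model that only counts how many bytes get zeroed across refills. Nothing here touches the real `Read::read_buf` API: `ToyBuffer`, its fields, and `refill` are hypothetical names, and the model assumes, as the message above describes, that every byte not known to be initialized is zeroed on each refill.

```rust
/// Toy model (hypothetical, not the std type) of a buffer whose refills
/// zero any bytes that are not known to be initialized.
pub struct ToyBuffer {
    capacity: usize,
    /// Leading bytes known to be initialized; only consulted when tracking.
    initialized: usize,
    /// Whether the `initialized` count is remembered between refills.
    track_initialized: bool,
    /// Running total of bytes zeroed, the cost this model measures.
    pub bytes_zeroed: usize,
}

impl ToyBuffer {
    pub fn new(capacity: usize, track_initialized: bool) -> Self {
        ToyBuffer { capacity, initialized: 0, track_initialized, bytes_zeroed: 0 }
    }

    /// Models one refill where the inner reader supplies `bytes_from_reader`
    /// bytes. Returns how many bytes end up buffered.
    pub fn refill(&mut self, bytes_from_reader: usize) -> usize {
        // What the refill can tell the reader about initialized space.
        let known_init = if self.track_initialized { self.initialized } else { 0 };
        // Assumed default behavior: zero everything not known to be initialized.
        self.bytes_zeroed += self.capacity - known_init;
        // The whole buffer is now initialized; only the tracking variant
        // makes use of this on the next refill.
        self.initialized = self.capacity;
        bytes_from_reader.min(self.capacity)
    }
}
```

With an 8192-byte buffer and a reader that yields 16 bytes per call, 100 refills zero 819200 bytes in the untracked variant but only 8192 bytes when `initialized` is remembered, which matches the shape of the regression reported in #102727.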

rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Jul 1, 2022

rust-highfive assigned Mark-Simulacrum Jul 1, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 1, 2022

saethlin commented Jul 1, 2022


rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 14, 2022

saethlin force-pushed the optimize-bufreader branch from 45e3cac to 0cff23d Compare July 17, 2022 21:23

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 18, 2022

Mark-Simulacrum reviewed Jul 23, 2022


Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 23, 2022

saethlin added 3 commits July 24, 2022 12:50

Allow Buffer methods to inline

b9497be

Rename and document the new BufReader internals

5e5ce43

saethlin force-pushed the optimize-bufreader branch from ee0e126 to 5e5ce43 Compare July 24, 2022 17:51

Add Buffer::consume_with to enable direct buffer access with one check

5fa1926

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 27, 2022

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jul 27, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 27, 2022

bors merged commit 50166d5 into rust-lang:master Jul 27, 2022

rustbot added this to the 1.64.0 milestone Jul 27, 2022

bors mentioned this pull request Jul 27, 2022

std::io: migrate ReadBuf to BorrowedBuf/BorrowedCursor #97015

Merged

saethlin deleted the optimize-bufreader branch September 3, 2022 23:26

apiraino mentioned this pull request Oct 6, 2022

Performance regression in 1.64+ when BufReader inner reader doesn't fill the buffer #102727

Closed

saethlin mentioned this pull request Oct 7, 2022

Avoid repeated re-initialization of the BufReader buffer #102760

Merged
