Remove some redundant checks from BufReader #98748

Merged
merged 4 commits into from
Jul 27, 2022

Conversation

saethlin
Member

@saethlin saethlin commented Jul 1, 2022

The implementation of BufReader contains a lot of redundant checks. While no single check is particularly expensive to execute, taken together they dramatically inhibit LLVM's ability to make subsequent optimizations by confusing data flow and increasing the code size of anything that uses BufReader.

In particular, these changes give roughly a 2x improvement on the benchmark that this PR adds a black_box to. I'm adding that black_box here just in case LLVM gets clever enough to remove the reads entirely. Right now it can't, but these optimizations are really setting it up to do so.
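As a rough illustration of why a black_box matters in this kind of benchmark, here is a minimal sketch of a small-read loop over a BufReader. The function name `sum_small_reads` is hypothetical and is not the benchmark touched by this PR; the sketch only assumes `std::hint::black_box` (stable since Rust 1.66) and the standard `Read` trait.

```rust
use std::hint::black_box;
use std::io::{BufReader, Read};

/// Hypothetical benchmark body: reads `data` one byte at a time through a
/// `BufReader` and sums the bytes. Each value is passed through `black_box`
/// so the optimizer has to treat every read as observed.
fn sum_small_reads(data: &[u8]) -> u64 {
    let mut reader = BufReader::new(data);
    let mut byte = [0u8; 1];
    let mut total = 0u64;
    // `read_exact` on a slice-backed reader only fails at end of input.
    while reader.read_exact(&mut byte).is_ok() {
        total += u64::from(black_box(byte)[0]);
    }
    total
}
```

Without the black_box, a sufficiently clever optimizer could in principle see through the whole loop when the input is known, which would make the measurement meaningless.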

We get this optimization by factoring all the actual buffer management and bounds-checking logic into a new module inside bufreader with a new Buffer type. This makes it much easier to ensure that we have correctly encapsulated the management of the region of the buffer that we have read bytes into, and it lets us provide a new faster way to do small reads. Buffer::consume_with lets a caller do a read from the buffer with a single bounds check, instead of the double-check that's required to use buffer + consume.
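The difference between `buffer` + `consume` and `consume_with` can be sketched roughly as follows. This is a simplified stand-in, not the code in this PR: it assumes a fully initialized `Vec<u8>` as backing storage, and the field names `pos` and `filled` as well as the `from_bytes` constructor are illustrative assumptions.

```rust
/// Simplified, hypothetical stand-in for a `Buffer` type (not the std code).
pub struct Buffer {
    buf: Vec<u8>,
    pos: usize,    // start of the unread region
    filled: usize, // end of the bytes that have been read into `buf`
}

impl Buffer {
    /// Illustrative constructor: treats `bytes` as already read into the buffer.
    pub fn from_bytes(bytes: &[u8]) -> Self {
        Buffer { buf: bytes.to_vec(), pos: 0, filled: bytes.len() }
    }

    /// The unread region. Invariant: `pos <= filled <= buf.len()`.
    /// (In this safe sketch the slicing itself is still checked.)
    pub fn buffer(&self) -> &[u8] {
        &self.buf[self.pos..self.filled]
    }

    /// Marks `amt` bytes as consumed, clamped to the unread region.
    /// A caller using `buffer` + `consume` checks the length once when
    /// slicing the result of `buffer` and again, implicitly, here.
    pub fn consume(&mut self, amt: usize) {
        self.pos = (self.pos + amt).min(self.filled);
    }

    /// If at least `amt` unread bytes are available, passes exactly those
    /// bytes to `visitor`, consumes them, and returns `true`. Otherwise
    /// leaves the buffer untouched and returns `false`. The caller-visible
    /// length check happens once, in `get(..amt)`.
    pub fn consume_with<V: FnOnce(&[u8])>(&mut self, amt: usize, visitor: V) -> bool {
        match self.buffer().get(..amt) {
            Some(claimed) => {
                visitor(claimed);
                self.pos += amt;
                true
            }
            None => false,
        }
    }
}
```

A small read then becomes a single call such as `buffer.consume_with(out.len(), |bytes| out.copy_from_slice(bytes))`, with the slow path (refilling the buffer) taken only when it returns `false`.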

Unfortunately, I'm not aware of a lot of open-source usage of BufReader in perf-critical environments. Some time ago I tweaked this code because I saw BufReader in a profile at work, and I contributed some benchmarks to the bincode crate which exercise BufReader::buffer. These changes appear to help those benchmarks a little, but all these sorts of benchmarks are kind of fragile, so I'm wary of quoting anything specific.

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Jul 1, 2022
@rustbot
Collaborator

rustbot commented Jul 1, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive
Collaborator

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 1, 2022
@Mark-Simulacrum
Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 14, 2022
@saethlin saethlin force-pushed the optimize-bufreader branch from 45e3cac to 0cff23d Compare July 17, 2022 21:23
@saethlin
Member Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 18, 2022
@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 23, 2022
saethlin added 3 commits July 24, 2022 12:50
The implementation of BufReader contains a lot of redundant checks.
While any one of these checks is not particularly expensive to execute,
especially when taken together they dramatically inhibit LLVM's ability
to make subsequent optimizations.
@saethlin saethlin force-pushed the optimize-bufreader branch from ee0e126 to 5e5ce43 Compare July 24, 2022 17:51
@Mark-Simulacrum
Member

@bors try @rust-timer queue

r=me, though would be great to update PR description as well.

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 27, 2022
@bors
Contributor

bors commented Jul 27, 2022

⌛ Trying commit 5fa1926 with merge 6973e88d0ee38239c9a7dd0268fc3a0e2e2f4bf8...

@saethlin
Member Author

Updated. Let me know if you think there's any other adjustment that should be made.

@Mark-Simulacrum
Member

Looks great! I want to wait for the perf run to come back -- though I expect it to be mostly neutral, the compiler isn't really I/O heavy for most things, hopefully :)

@bors
Contributor

bors commented Jul 27, 2022

☀️ Try build successful - checks-actions
Build commit: 6973e88d0ee38239c9a7dd0268fc3a0e2e2f4bf8

@rust-timer
Collaborator

Queued 6973e88d0ee38239c9a7dd0268fc3a0e2e2f4bf8 with parent 4d6d601, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (6973e88d0ee38239c9a7dd0268fc3a0e2e2f4bf8): comparison url.

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results
  • Primary benchmarks: 😿 relevant regressions found
  • Secondary benchmarks: mixed results
| | mean [1] | max | count [2] |
|---|---|---|---|
| Regressions 😿 (primary) | 2.0% | 2.3% | 8 |
| Regressions 😿 (secondary) | 3.0% | 3.0% | 1 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | -5.1% | -5.2% | 2 |
| All 😿🎉 (primary) | 2.0% | 2.3% | 8 |

Cycles

Results
  • Primary benchmarks: 🎉 relevant improvement found
  • Secondary benchmarks: 😿 relevant regression found
| | mean [1] | max | count [2] |
|---|---|---|---|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 2.3% | 2.3% | 1 |
| Improvements 🎉 (primary) | -6.5% | -6.5% | 1 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | -6.5% | -6.5% | 1 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jul 27, 2022
@Mark-Simulacrum
Member

@bors r+

Looks like mostly noise.

@bors
Contributor

bors commented Jul 27, 2022

📌 Commit 5fa1926 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 27, 2022
@bors
Contributor

bors commented Jul 27, 2022

⌛ Testing commit 5fa1926 with merge 50166d5...

@bors
Contributor

bors commented Jul 27, 2022

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 50166d5 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 27, 2022
@bors bors merged commit 50166d5 into rust-lang:master Jul 27, 2022
@rustbot rustbot added this to the 1.64.0 milestone Jul 27, 2022
@rust-timer
Collaborator

Finished benchmarking commit (50166d5): comparison url.

Instruction count

  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 🎉 relevant improvement found
| | mean [1] | max | count [2] |
|---|---|---|---|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | -0.3% | -0.3% | 1 |
| All 😿🎉 (primary) | N/A | N/A | 0 |

Max RSS (memory usage)

Results
  • Primary benchmarks: 😿 relevant regressions found
  • Secondary benchmarks: mixed results
| | mean [1] | max | count [2] |
|---|---|---|---|
| Regressions 😿 (primary) | 1.9% | 3.3% | 2 |
| Regressions 😿 (secondary) | 2.6% | 2.6% | 2 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | -5.0% | -5.0% | 2 |
| All 😿🎉 (primary) | 1.9% | 3.3% | 2 |

Cycles

This benchmark run did not return any relevant results for this metric.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@saethlin saethlin deleted the optimize-bufreader branch September 3, 2022 23:26
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Oct 7, 2022
…k-Simulacrum

Avoid repeated re-initialization of the BufReader buffer

Fixes rust-lang#102727

We accidentally removed this in rust-lang#98748. It looks so redundant. But it isn't.

The default `Read::read_buf` will defensively initialize the whole buffer, if any of it is indicated to be uninitialized. In uses where reads from the wrapped `Read` impl completely fill the `BufReader`, `initialized` and `filled` are the same, and this extra member isn't required. But in the reported issue, the `BufReader` wraps a `Read` impl which will _never_ fill the whole buffer. So the default `Read::read_buf` implementation repeatedly re-initializes the extra space in the buffer.

This adds back the extra `initialized` member, which ensures that the default `Read::read_buf` only zero-initializes the buffer once, and I've tried to add a comment which explains this whole situation.
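The cost described above can be illustrated with a small model. This is a back-of-the-envelope sketch, not the standard library's code: the function `zeroed_bytes` and its parameters are hypothetical, and it assumes that without a separate `initialized` member only the previously `filled` prefix is known to be initialized at the next refill.

```rust
/// Hypothetical model: counts how many bytes a defensive `read_buf`
/// would zero across `refills` refills of a buffer of `capacity` bytes,
/// when the wrapped reader returns at most `bytes_per_read` bytes per call.
fn zeroed_bytes(
    capacity: usize,
    bytes_per_read: usize,
    refills: usize,
    track_initialized: bool,
) -> usize {
    let mut filled = 0usize;      // bytes produced by the previous refill
    let mut initialized = 0usize; // bytes known to be initialized (if tracked)
    let mut total_zeroed = 0usize;
    for _ in 0..refills {
        // Assumption: without the extra member, only the previously
        // filled prefix is known to be initialized.
        let known_init = if track_initialized { initialized } else { filled };
        // The defensive default zeroes everything not known to be initialized.
        total_zeroed += capacity - known_init;
        // After zeroing, the whole buffer really is initialized.
        initialized = capacity;
        filled = bytes_per_read.min(capacity);
    }
    total_zeroed
}
```

Under this model, a reader that always fills the buffer costs the same either way, while a reader that returns only a few bytes per call pays for nearly a full buffer of zeroing on every refill unless `initialized` is tracked separately.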
thomcc pushed a commit to tcdi/postgrestd that referenced this pull request Feb 10, 2023
Avoid repeated re-initialization of the BufReader buffer

Fixes rust-lang/rust#102727

We accidentally removed this in rust-lang/rust#98748. It looks so redundant. But it isn't.

The default `Read::read_buf` will defensively initialize the whole buffer, if any of it is indicated to be uninitialized. In uses where reads from the wrapped `Read` impl completely fill the `BufReader`, `initialized` and `filled` are the same, and this extra member isn't required. But in the reported issue, the `BufReader` wraps a `Read` impl which will _never_ fill the whole buffer. So the default `Read::read_buf` implementation repeatedly re-initializes the extra space in the buffer.

This adds back the extra `initialized` member, which ensures that the default `Read::read_buf` only zero-initializes the buffer once, and I've tried to add a comment which explains this whole situation.
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
6 participants