Fix `inflate_fast_help` loop bound #260

folkertdev · 2024-12-09T18:14:14Z

When you finally spot the thing... 🤦 We are now faster than zlib-ng for all chunk sizes.

It's only a marginal improvement for chunk size 2**4, but quite significant for what are probably the most common chunk sizes (very small chunk sizes likely indicate an I/O-bound operation).

After fixing the bug of exiting the loop too early (first commit), I made the bound a bit more accurate, and finally refactored the return type of `len_and_friends`. The final commit has no performance impact.
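
As a rough illustration of the last refactor, here is a minimal sketch of a helper that returns `std::ops::ControlFlow` rather than `Option`, so the "stop" case carries an explicit reason instead of a bare `None`. The names `Stop`, `decode_step`, and `sum_until_stop` are hypothetical and the symbol handling is heavily simplified; the real signature of `len_and_friends` is not shown in this PR, so none of this should be read as the actual zlib-rs code.

```rust
use std::ops::ControlFlow;

/// Hypothetical reasons for leaving a decode loop (not zlib-rs's real type).
#[derive(Debug, PartialEq)]
enum Stop {
    EndOfBlock,
    InvalidCode,
}

/// Hypothetical stand-in for a helper in the spirit of `len_and_friends`:
/// `Continue(len)` carries a value for the caller to use,
/// `Break(reason)` tells the caller to leave its loop.
fn decode_step(symbol: u16) -> ControlFlow<Stop, u16> {
    match symbol {
        256 => ControlFlow::Break(Stop::EndOfBlock),
        s if s > 285 => ControlFlow::Break(Stop::InvalidCode),
        s => ControlFlow::Continue(s),
    }
}

/// Sums the values produced by `decode_step` until a step asks to stop.
/// Returns the running total and the stop reason, if there was one.
fn sum_until_stop(symbols: &[u16]) -> (u32, Option<Stop>) {
    let mut total = 0u32;
    for &symbol in symbols {
        match decode_step(symbol) {
            ControlFlow::Continue(len) => total += u32::from(len),
            ControlFlow::Break(reason) => return (total, Some(reason)),
        }
    }
    (total, None)
}
```

With `Option`, the caller would only learn that the helper produced nothing; with `ControlFlow`, both arms are named and the `Break` payload says why the loop ended.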

codecov · 2024-12-09T18:16:11Z

Codecov Report

Attention: Patch coverage is 76.66667% with 7 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| zlib-rs/src/inflate.rs | 74.07% | 7 Missing ⚠️ |

| Files with missing lines | Coverage Δ | |
|---|---|---|
| zlib-rs/src/inflate/bitreader.rs | `91.04% <100.00%> (+0.20%)` | ⬆️ |
| zlib-rs/src/inflate.rs | `91.05% <74.07%> (-0.19%)` | ⬇️ |

... and 3 files with indirect coverage changes

bjorn3 · 2024-12-10T08:37:05Z

zlib-rs/src/inflate/bitreader.rs

@@ -77,6 +77,11 @@ impl<'a> BitReader<'a> {
        self.end as usize - self.ptr as usize
    }

+    #[inline(always)]
+    pub fn bytes_remaining_including_buffer(&self) -> usize {
+        (self.end as usize - self.ptr as usize) + (self.bits_used as usize >> 3)


Using `/ 8` instead of `>> 3` may be a bit clearer and will optimize to the same assembly.

folkertdev · 2024-12-10

We can make a pass over the repo at some point. I think zlib-ng never updated this, and the original zlib it was forked from was written at a time when that optimization was apparently not (effectively) guaranteed.
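
For anyone who wants to check the two spellings discussed in this thread, here is a minimal sketch. It assumes a slice-based reader instead of the pointer-based `BitReader` in the diff: `SketchReader`, its `rest` field, and the `_shift` method are hypothetical names used only for comparison. For unsigned integers, dividing by 8 and shifting right by 3 always give the same value.

```rust
/// Hypothetical, slice-based stand-in for the pointer-based `BitReader` in the diff.
struct SketchReader<'a> {
    /// Input bytes that have not yet been loaded into the bit buffer.
    rest: &'a [u8],
    /// Number of valid bits currently held in the bit buffer.
    bits_used: u8,
}

impl SketchReader<'_> {
    /// Bytes left in the input, ignoring whatever sits in the bit buffer.
    fn bytes_remaining(&self) -> usize {
        self.rest.len()
    }

    /// Input bytes plus the whole bytes still held in the bit buffer,
    /// written with `/ 8` as suggested in the review.
    fn bytes_remaining_including_buffer(&self) -> usize {
        self.rest.len() + usize::from(self.bits_used) / 8
    }

    /// The `>> 3` spelling from the diff, kept only to compare results.
    fn bytes_remaining_including_buffer_shift(&self) -> usize {
        self.rest.len() + (usize::from(self.bits_used) >> 3)
    }
}
```

A reader with 3 unread input bytes and 20 buffered bits reports 3 + 2 = 5 bytes either way; the 4 leftover bits do not make up a whole byte and are not counted.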

bjorn3

Nice!

fix the loop bound in inflate_fast_help

0ea8362

folkertdev requested a review from bjorn3 December 9, 2024 18:14

folkertdev added 2 commits December 9, 2024 19:42

make loop bound in inflate_fast_help slightly more accurate

69dfcd5

fn len_and_friends: return ControlFlow instead of Option

8ed2c63

folkertdev force-pushed the fix-fast-loop-bound branch from ad248e5 to 8ed2c63 Compare December 9, 2024 18:44

bjorn3 reviewed Dec 10, 2024

bjorn3 approved these changes Dec 10, 2024

folkertdev merged commit afcf420 into main Dec 10, 2024
20 checks passed

folkertdev deleted the fix-fast-loop-bound branch December 10, 2024 09:04
