-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix compression ratio inefficiency #1709
Conversation
…-size inputs at levels 1-3
to detect gross CR variations after a patch. Tests normal and dictionary compression.
invalid on void ptr
Do you want to fix the rest of the strategies in this diff as well? The repro that exposes the bad behavior for every level is:
|
Yes, I will |
restore dictionary compression ratio
to reduce chances of differences between 32 and 64-bit fuzzer tests
All strategies fixed, able to access the entire dictionary as long as at least one byte is within window size. Still need to understand why the new size-control test fails on gcc-6 + 32-bit. Also worthwhile to note : |
to produce same content on both 32 and 64-bit platforms by removing floating from literal table determination. also : added checksum trace in compression control test, so that it's easier to determine if test fails as a consequence of compressing a different sample.
The failing test due to lack of reproducibility between 32 and 64-bit builds is now fixed. And now, another completely unrelated test fails later in the serie. Great outcome, such a joy .... |
just for clarity, for the currently failing unit test
I'd love to switch these tests to gtest, given the time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment, but I'll go ahead and accept so you can merge when its fixed.
lib/compress/zstd_lazy.c
Outdated
U32 const lowestValid = ms->window.lowLimit; | ||
U32 const withinWindow = (current - lowestValid > maxDistance) ? current - maxDistance : lowestValid; | ||
U32 const isDictionary = (ms->loadedDictEnd != 0); | ||
U32 const windowLow = isDictionary ? lowestValid : withinWindow; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only variable you need for the rest of the function is windowLow
right?
Can you extract this out into a function since it is duplicated in every match finder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's a good point
it's an unsupported scenario.
as suggested by @terrelln
Last item completed, with |
Fix compression ratio inefficiency
introduced in #1624
under following scenario :
attachDict
threshold)The patch restore compression ratio to previous level.
A new compression efficiency test is added into
fuzzer/unitTests
in order to control gross negative variations in compression ratio, playing all levels with normal compression and flat dictionary compression.