Break when best possible filter result found #648
Thanks for the PR, it looks great now! Feel free to merge after rebasing and once CI passes 🤞
Sorry if this seems obvious, but how is the best possible result determined?
I think it's determined from Shannon's source coding theorem, assuming that the entropy is as ideal as it can get, which is the case for rows with a single constant value. See also this random Reddit topic with a less mathy discussion about it. Edit: to be clearer, not all heuristic filter selection policies implemented in Oxipng are based on entropy. In any case, the best result is taken to be the optimal value that the given heuristic can produce.
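To make the entropy argument concrete, here is a minimal sketch of plain Shannon entropy over the byte histogram of a row. The function name `shannon_entropy_bits` is hypothetical, and this is not claimed to be the exact scoring that Oxipng's Entropy strategy uses; it only illustrates why a row with a single constant value hits the lower bound of zero.

```rust
/// Shannon entropy, in bits per byte, of the byte histogram of `line`.
/// Hypothetical helper for illustration; not Oxipng's actual scoring code.
fn shannon_entropy_bits(line: &[u8]) -> f64 {
    if line.is_empty() {
        return 0.0;
    }
    // Count how often each of the 256 possible byte values occurs.
    let mut counts = [0usize; 256];
    for &byte in line {
        counts[byte as usize] += 1;
    }
    let total = line.len() as f64;
    // H = -sum(p * log2(p)) over the byte values that actually occur.
    counts
        .iter()
        .filter(|&&count| count > 0)
        .map(|&count| {
            let p = count as f64 / total;
            -p * p.log2()
        })
        .sum()
}
```

A row made of one repeated value has a single symbol with probability 1, so its entropy is 0 and no other row can score lower under this measure.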
I love the concept (& sheer nerdliness) of this. Thanks, @andrews05 & @AlexTMjugador! But is there a way to cheaply determine when this shouldn't be used and disable it automatically (e.g., for random/static or photographic images)? Or is the overhead truly so negligible that it's not relevant?
For each delta filter, the heuristic strategies apply the filter to the current line and then run an algorithm to produce a single value. They then choose the filter based on which one produced the "best" value. MinSum chooses the one with the smallest value, and for this algorithm the smallest possible value for any line is 0.

To be clear, the best possible value is always deterministic and it's not possible for this change to have any impact on the output file size. It's purely a performance gain that will occur if all bytes in the line are zero. (In retrospect, I could have just checked for this up front and picked the None filter automatically. I might play around with that...)
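The selection loop described above can be sketched roughly as follows. This is an illustrative reconstruction, not Oxipng's actual code: the names `Filter`, `apply_filter`, `min_sum_score` and `choose_filter_min_sum` are hypothetical, and the MinSum score is assumed here to be the sum of the filtered bytes read as signed magnitudes. The second return value counts how many filters were evaluated, so the effect of the early break is visible.

```rust
/// The five PNG delta filters, in filter-type order (0 through 4).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Filter { None, Sub, Up, Average, Paeth }

const FILTERS: [Filter; 5] =
    [Filter::None, Filter::Sub, Filter::Up, Filter::Average, Filter::Paeth];

/// Paeth predictor as defined by the PNG specification.
fn paeth(a: u8, b: u8, c: u8) -> u8 {
    let (ai, bi, ci) = (a as i16, b as i16, c as i16);
    let p = ai + bi - ci;
    let (pa, pb, pc) = ((p - ai).abs(), (p - bi).abs(), (p - ci).abs());
    if pa <= pb && pa <= pc { a } else if pb <= pc { b } else { c }
}

/// Apply one filter to `line`. `prev` is the previous row (all zeros for the
/// first row) and must be at least as long as `line`; `bpp` is bytes per pixel.
fn apply_filter(filter: Filter, line: &[u8], prev: &[u8], bpp: usize) -> Vec<u8> {
    (0..line.len())
        .map(|i| {
            let a = if i >= bpp { line[i - bpp] } else { 0 }; // left
            let b = prev[i];                                  // above
            let c = if i >= bpp { prev[i - bpp] } else { 0 }; // above-left
            let predicted = match filter {
                Filter::None => 0,
                Filter::Sub => a,
                Filter::Up => b,
                Filter::Average => ((a as u16 + b as u16) / 2) as u8,
                Filter::Paeth => paeth(a, b, c),
            };
            line[i].wrapping_sub(predicted)
        })
        .collect()
}

/// Assumed MinSum score: sum of filtered bytes taken as signed magnitudes.
fn min_sum_score(filtered: &[u8]) -> u64 {
    filtered.iter().map(|&byte| (byte as i8).unsigned_abs() as u64).sum()
}

/// Pick the filter with the smallest score, stopping as soon as a score of 0
/// is found, since no later filter can beat it.
fn choose_filter_min_sum(line: &[u8], prev: &[u8], bpp: usize) -> (Filter, usize) {
    let mut best = (Filter::None, u64::MAX);
    let mut tried = 0;
    for &filter in FILTERS.iter() {
        tried += 1;
        let score = min_sum_score(&apply_filter(filter, line, prev, bpp));
        if score < best.0 as u64 * 0 + best.1 {
            best = (filter, score);
            if score == 0 {
                break; // best possible result: skip the remaining filters
            }
        }
    }
    (best.0, tried)
}
```

Because only a strictly smaller score replaces the current best, breaking at 0 returns the same filter the full loop would have chosen, which is why the output size cannot change.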
#648 may have been a bit hasty. I realised afterward that there's a simpler way to achieve the same thing that also covers the Brute strategy. This reverts #648 and instead just picks None up front if the line is all zeros. None is guaranteed to be the chosen filter in that case for MinSum, Entropy, Bigrams and BigEnt. It's almost certainly true for Brute as well, but this is harder to prove. I've tested this across hundreds of images and found no change in output.
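The up-front check could look something like the sketch below. The names `is_all_zeros` and `choose_filter` are hypothetical, and the `heuristic` closure stands in for whichever strategy (MinSum, Entropy, Bigrams, BigEnt or Brute) would otherwise run; the only fixed fact assumed is that PNG filter type 0 means None.

```rust
/// PNG filter type byte for the None filter.
const FILTER_NONE: u8 = 0;

/// True if every byte in the line is zero (also true for an empty line).
fn is_all_zeros(line: &[u8]) -> bool {
    line.iter().all(|&byte| byte == 0)
}

/// Return the filter type for `line`. If the line is all zeros, pick None
/// immediately and never call the (potentially expensive) heuristic.
/// `heuristic` is a stand-in for the real strategy; it is an assumption here.
fn choose_filter<F>(line: &[u8], heuristic: F) -> u8
where
    F: FnOnce(&[u8]) -> u8,
{
    if is_all_zeros(line) {
        return FILTER_NONE;
    }
    heuristic(line)
}
```

Compared with breaking out of the loop after the first filter, this avoids running the heuristic at all for zero lines, and the same shortcut applies to every strategy.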
I had this sudden idea for a tiny optimisation: For heuristic filter strategies, skip checking remaining filters if we find we have the best possible result. We can do this for MinSum, Entropy, Bigrams, and BigEnt, but not Brute.
For most images this will have little-to-no impact, but for some such as the test file "palette_1_should_be_palette_1.png" (where most lines are all zeros) we can see a 20% performance improvement at o2 and 10% at o5.