Comprehensive benchmark of image presets #63

Open
rgaudin opened this issue Sep 15, 2020 · 7 comments · Fixed by #67

Comments

@rgaudin
Member

rgaudin commented Sep 15, 2020

Testing WebP support on youtube showed that WebpHigh doesn't produce high-quality thumbnails.
As these image presets are going to be used everywhere, it's important that we benchmark all presets now that the rest works and before we actually roll them out.

What I envision is a table built from a small list (10?) of images used in our scrapers for each format, showing side by side the original and the three preset versions.

That should help us validate or revise the presets variable. It would also serve as a reference in the future, when we have to choose what preset to use for a scraper.

http://tmp.kiwix.org/youtube/report.html can serve as an inspiration
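
A minimal sketch of such a comparison table, assuming Pillow is available. The `PRESETS` values, the `benchmark` and `to_html` helpers, and the JPEG-only scope are hypothetical and only illustrate the shape of the report; they are not the project's actual presets or code.

```python
import html
import io

from PIL import Image

# Hypothetical presets for illustration only; not the project's values.
PRESETS = {
    "low": {"format": "JPEG", "quality": 45},
    "medium": {"format": "JPEG", "quality": 65},
    "high": {"format": "JPEG", "quality": 80},
}


def encode(img, options):
    """Encode img in memory with the given Pillow save options."""
    buf = io.BytesIO()
    img.save(buf, **options)
    return buf.getvalue()


def benchmark(sources, presets=PRESETS):
    """Return one row per source image: original size and size per preset.

    `sources` maps a file name to the raw bytes of the original file.
    """
    rows = []
    for name, data in sources.items():
        img = Image.open(io.BytesIO(data)).convert("RGB")
        row = {"image": name, "original": len(data)}
        for preset, options in presets.items():
            row[preset] = len(encode(img, options))
        rows.append(row)
    return rows


def to_html(rows):
    """Render the rows as a plain HTML table of byte sizes."""
    if not rows:
        return "<table></table>"
    columns = list(rows[0])
    head = "".join(f"<th>{html.escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(r[c]))}</td>" for c in columns) + "</tr>"
        for r in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"
```

A real report would also embed the encoded images next to the sizes so that quality can be judged visually, as in the linked youtube report.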

@rgaudin rgaudin added the question Further information is requested label Sep 15, 2020
@rgaudin
Member Author

rgaudin commented Sep 15, 2020

Note to self: {"lossless": False, "quality": 100, "method": 4} works well for a good preset (taken from ImageMagick's defaults). Should we also add a lossless preset? As already discussed, lossless increases file size in many scenarios (especially on youtube's WebP images, which are probably well optimized already).
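
For reference, a hedged sketch of how these options could be passed to Pillow's WebP writer, which accepts `lossless`, `quality` and `method` as save parameters. The `to_webp` helper and the generated test image are assumptions for illustration, and this is not necessarily how the presets are applied in the library.

```python
import io

from PIL import Image

# Options quoted in the comment above.
WEBP_OPTIONS = {"lossless": False, "quality": 100, "method": 4}


def to_webp(img, **options):
    """Encode img as WebP in memory and return the resulting bytes."""
    buf = io.BytesIO()
    img.save(buf, format="WEBP", **options)
    return buf.getvalue()


# Synthetic image used only so the example runs without external files.
img = Image.radial_gradient("L").convert("RGB")
lossy = to_webp(img, **WEBP_OPTIONS)
lossless = to_webp(img, lossless=True)

# Compare both sizes; which one is smaller depends on the image content.
print(f"lossy: {len(lossy)} bytes, lossless: {len(lossless)} bytes")
```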

@satyamtg
Contributor

satyamtg commented Sep 15, 2020

I wrote a small script to check, and prepared a set of images of different sizes and types, with 10 images for each format. Here are all the images that I used - https://drive.google.com/file/d/1KJF1wJvsWMxWK0PNcLPsMzuhvuGrkzyB/view?usp=sharing

Also, here's the output (it contains all images and a report.html file which can be viewed for the table representation) - https://drive.google.com/file/d/1O0LV_mUl6MF8Eenmkbap92XDhpSuRlpQ/view?usp=sharing

Here's the python script that I used - https://gist.github.com/satyamtg/31975ae1400e61633f0fbadd6f042c0c

The main thing I see is that, for the JPEG and PNG presets, medium and low give images of the same size. For PNG, this may be because both produce a 256-color image, probably due to the default values required by optimize_images. For JPEG, we need to investigate.

The results can be viewed here directly - http://tmp.kiwix.org/imgbench/report.html or http://tmp.kiwix.org/imgbench/report-small.html
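
One way to catch this kind of problem automatically is to flag presets whose outputs are byte-for-byte identical. This is a minimal sketch using only the standard library; the `identical_presets` helper and the sample data are hypothetical and are not part of the benchmark script linked above.

```python
import hashlib
from collections import defaultdict


def identical_presets(outputs):
    """Group preset names whose encoded bytes are exactly the same.

    `outputs` maps a preset name to the encoded image bytes it produced.
    """
    groups = defaultdict(list)
    for preset, data in outputs.items():
        groups[hashlib.sha256(data).hexdigest()].append(preset)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Placeholder bytes standing in for encoded images.
sample = {"low": b"same-bytes", "medium": b"same-bytes", "high": b"other-bytes"}
print(identical_presets(sample))
```

Identical sizes alone do not prove identical content, which is why the sketch compares hashes rather than lengths.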

@rgaudin
Member Author

rgaudin commented Sep 21, 2020

Thank you for this.

As you wrote, on JPEG, low and medium are exactly the same. That's a problem.
PNG low and medium are also similar. I noticed the limited colors on medium as well (on the molecular thing), which I think we don't want on medium. We could change that. There's a chance that we'd then get results similar to High, though.

So I checked in detail and figured out that for the quality param to be used, you need to enable fast_mode.

Please fix that for JPEG and don't reduce colors on PNG medium; also change WebpHigh to what I proposed above.
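
To verify that a PNG preset no longer reduces colors, one could count the distinct colors before and after optimization. The sketch below assumes Pillow; `count_colors` is a hypothetical helper, and `Image.quantize` only stands in for whatever color reduction the optimizer applies.

```python
from PIL import Image


def count_colors(img):
    """Return the number of distinct RGB colors in img."""
    rgb = img.convert("RGB")
    # maxcolors is set to the pixel count so getcolors never returns None.
    return len(rgb.getcolors(maxcolors=rgb.width * rgb.height))


# Synthetic 64x64 image in which every pixel has a unique color.
original = Image.new("RGB", (64, 64))
original.putdata(
    [(x * 4, y * 4, (x + y) * 2) for y in range(64) for x in range(64)]
)

# Stand-in for a preset that limits the palette to 256 colors.
reduced = original.quantize(colors=256)

print(count_colors(original), count_colors(reduced))
```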

@satyamtg
Contributor

I changed the presets as you asked, in #67. However, the WebP preset is a bit different, as the one you proposed didn't yield a smaller file size on any of the images I tested with. So I've used {"quality": 90, "method": 6, "lossless": False}. Method 6 uses stronger compression.
Here's the result with what I used - https://drive.google.com/file/d/1jXNulMPfD1P3SMdd0VdVyll3XDeBAB_Z/view?usp=sharing
Here's the result with what you proposed earlier - https://drive.google.com/file/d/1qXTSM6HqhiAUyBFyvPqeapYOPM9hB-c2/view?usp=sharing
I think there's a decrease in file size without significant loss in quality. What are your thoughts, @rgaudin?

@rgaudin
Member Author

rgaudin commented Sep 22, 2020

OK, I tested it on what had caught my attention and the difference is not noticeable. Nice work!

@rgaudin
Member Author

rgaudin commented Jun 1, 2022

Reopening after discussions with BSF, as we may use their knowledge to redo a benchmark and choose a better quality/compression ratio.

@rgaudin rgaudin reopened this Jun 1, 2022
@stale

stale bot commented Aug 13, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

@stale stale bot added the stale label Aug 13, 2022