Optimize ImageStat.Stat.extrema #7593

florath · 2023-12-01T19:07:03Z

The optimized function improves the performance. The original function always runs through the complete histogram of length 256 even if it is possible to exit the loop early (break).

Running performance tests improvements of factor >10 were measured depending on the images under test.

Changes proposed in this pull request:

Performance optimized ImageStat _getextrema function

The optimzed function improves the performance. The original function always runs through the complete historgram of length 256 even if it is possible to exit the loop early (break). Running some tests I found performance improvements of factor >10 depending on the image. Signed-off-by: Andreas Florath <andreas@florath.net>

Signed-off-by: Andreas Florath <andreas@florath.net>

florath · 2023-12-01T19:24:03Z

Hmmm.
The failing test case looks completely unrelated to the patch. Any idea?

hugovk · 2023-12-01T21:22:44Z

I restarted the AppVeyor job, let's see how it goes. Also approved the GitHub Actions to run.

Please could you share the performance tests you ran and their results?

radarhere · 2023-12-02T00:17:45Z

Yes, the test failures are also happening in main. I've created #7594 to fix them.

florath · 2023-12-02T07:25:24Z

Please could you share the performance tests you ran and their results?

I created a virtualenv with pillow and added the new function side by side with the original, like:

    def _getextrema_orig(self):
        """Get min/max values for each band in the image"""

        def minmax(histogram):
            n = 255
            x = 0
            for i in range(256):
                if histogram[i]:
                    n = min(n, i)
                    x = max(x, i)
            return n, x  # returns (255, 0) if there's no data in the histogram

        v = []
        for i in range(0, len(self.h), 256):
            v.append(minmax(self.h[i:]))
        return v

    def _getextrema(self):
        """Get min/max values for each band in the image"""

        def minmax(histogram):
            res_min, res_max = 255, 0
            for i in range(256):
                if histogram[i]:
                    res_min = i
                    break
            for i in range(255, -1, -1):
                if histogram[i]:
                    res_max = i
                    break
            if res_max >= res_min:
                return res_min, res_max
            else:
                return (255, 0)

        v = []
        for i in range(0, len(self.h), 256):
            v.append(minmax(self.h[i:i+256]))
        return v

Then I used a collection of images (ImageNet) and run a script to compare the original and the optimized function. I adapted the script slightly so that you can run the test also on the Pillow test images (have to add the try/except block here because there are some broken images in the Pillow test image directory). BTW: this also checks if the original and optimized versions return the same results.

import pathlib
import timeit
from PIL import Image, ImageStat

IMAGEDIR="../Pillow/Tests/images"

testdir = pathlib.Path(IMAGEDIR)

NUMBER=1000
REPEAT=10

for image_file_name in testdir.rglob("*"):
    # Skip broken images
    try:
        img = Image.open(image_file_name)
        stat = ImageStat.Stat(img)
    except Exception as ex:
        continue

    # Check for correctness
    res_orig = stat._getextrema_orig()
    res_opt = stat._getextrema()
    assert res_orig == res_opt

    # Measure improvement factor
    exec_times_orig = timeit.repeat(
        stmt=stat._getextrema_orig, repeat=REPEAT, number=NUMBER)
    exec_times_opt = timeit.repeat(
        stmt=stat._getextrema, repeat=REPEAT, number=NUMBER)

    print("%10.4f - %s" % (
        min(exec_times_orig) / min(exec_times_opt), image_file_name))

For the ImageNet about 80-90% of the images had speedup factor of more than 10. A lot even more than factor 40.
For the Pillow test images I get the following output (partial).

    3.3938 - ../Pillow/Tests/images/itxt_chunks.png
   32.8581 - ../Pillow/Tests/images/tiff_wrong_bits_per_sample_3.tiff
   49.2568 - ../Pillow/Tests/images/hopper.iccprofile.tif
   24.9033 - ../Pillow/Tests/images/mmap_error.bmp
   47.1822 - ../Pillow/Tests/images/hopper.dds
   12.4054 - ../Pillow/Tests/images/uncompressed_rgb.png
    3.3469 - ../Pillow/Tests/images/imagedraw_polygon_kite_L.png
   12.8065 - ../Pillow/Tests/images/colr_bungee.png
    1.8857 - ../Pillow/Tests/images/imagedraw_rounded_rectangle_corners_yyny.png
   33.7577 - ../Pillow/Tests/images/clipboard_target.png
   11.1593 - ../Pillow/Tests/images/bc5s.png
   47.2684 - ../Pillow/Tests/images/hopper.sgi
   29.2110 - ../Pillow/Tests/images/pil123rgba.qoi
    5.5825 - ../Pillow/Tests/images/imagedraw_polygon_kite_RGB.png
   22.7400 - ../Pillow/Tests/images/dispose_bgnd_transparency.gif
    9.0826 - ../Pillow/Tests/images/palette_sepia.png
    1.8943 - ../Pillow/Tests/images/imagedraw_rounded_rectangle_corners_ynyn.png
   45.2440 - ../Pillow/Tests/images/balloon.jpf
    1.0019 - ../Pillow/Tests/images/no_palette_with_transparency.gif
   27.4538 - ../Pillow/Tests/images/imagedraw2_text.png
   47.9036 - ../Pillow/Tests/images/test_anchor_multiline_mm_right.png
   47.4315 - ../Pillow/Tests/images/hopper_orientation_2.webp

Please note that the first number is the speedup-factor: the factor how much faster the proposed function is measured against the original.

src/PIL/ImageStat.py

Simplification of return statement Co-authored-by: Andrew Murray <3112309+radarhere@users.noreply.github.com>

florath · 2023-12-04T09:30:51Z

Thanks a lot for the hint!
In deed, the additional check is not needed. (I tested some versions of the function during the optimization process and an earlier needed the check - but this one can just return the values.)

homm · 2023-12-04T10:00:30Z

src/PIL/ImageStat.py


        v = []
        for i in range(0, len(self.h), 256):
-            v.append(minmax(self.h[i:]))
+            v.append(minmax(self.h[i : i + 256]))
        return v


you can also use list comprehension

My idea was to have a minimal patch set.

If you want, we could go with

return [minmax(self.h[i:i+256]) for i in range(0, len(self.h), 256)]

My tests showed that from the performance point of view the two solutions are mostly the same. But if my other patch gets accepted #7599 the list comprehension might be the way to go to have the same style.

What do you think?

#7597 has actually just been merged to use list comprehension here. Would you like to resolve the conflict?

hugovk

Thank you!

hugovk · 2023-12-05T16:10:14Z

Let's also mention this and the other one in the release notes (there'll be a template once #6486 is merged). Can be in a followup PR.

radarhere · 2023-12-06T23:14:32Z

I've created #7605 to include this in the release notes.

florath added 2 commits December 1, 2023 18:53

Fixed spacing in _getextrema method

7762dd3

Signed-off-by: Andreas Florath <andreas@florath.net>

radarhere added the Performance label Dec 2, 2023

hugovk mentioned this pull request Dec 2, 2023

Enable updating PRs from main #7595

Closed

Merge branch 'main' into ImageStat_getextrema_opt

96fe0a1

radarhere reviewed Dec 4, 2023

View reviewed changes

src/PIL/ImageStat.py Outdated Show resolved Hide resolved

Update src/PIL/ImageStat.py

ac47b75

Simplification of return statement Co-authored-by: Andrew Murray <3112309+radarhere@users.noreply.github.com>

florath mentioned this pull request Dec 4, 2023

Optimized ImageStat.Stat.count #7599

Merged

homm approved these changes Dec 4, 2023

View reviewed changes

radarhere added the Needs Rebase label Dec 5, 2023

Merge branch 'main' into ImageStat_getextrema_opt

ed03954

hugovk approved these changes Dec 5, 2023

View reviewed changes

radarhere removed the Needs Rebase label Dec 5, 2023

radarhere merged commit e9afaee into python-pillow:main Dec 6, 2023
56 checks passed

radarhere changed the title ~~Optimize ImageStat.Stat._getextrema function~~ Optimize ImageStat.Stat.extrema Dec 6, 2023

radarhere added a commit to radarhere/Pillow that referenced this pull request Dec 6, 2023

Added release notes for python-pillow#7599 and python-pillow#7593

df13c61

radarhere added a commit to radarhere/Pillow that referenced this pull request Dec 7, 2023

Added release notes for python-pillow#7599 and python-pillow#7593

a7f339d

radarhere added a commit to radarhere/Pillow that referenced this pull request Dec 7, 2023

Added release notes for python-pillow#7599 and python-pillow#7593

afae568

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize ImageStat.Stat.extrema #7593

Optimize ImageStat.Stat.extrema #7593

florath commented Dec 1, 2023

florath commented Dec 1, 2023

hugovk commented Dec 1, 2023

radarhere commented Dec 2, 2023

florath commented Dec 2, 2023 •

edited by radarhere

Loading

florath commented Dec 4, 2023

homm Dec 4, 2023

florath Dec 4, 2023

radarhere Dec 5, 2023

florath Dec 5, 2023

hugovk left a comment

hugovk commented Dec 5, 2023

radarhere commented Dec 6, 2023

Optimize ImageStat.Stat.extrema #7593

Optimize ImageStat.Stat.extrema #7593

Conversation

florath commented Dec 1, 2023

florath commented Dec 1, 2023

hugovk commented Dec 1, 2023

radarhere commented Dec 2, 2023

florath commented Dec 2, 2023 • edited by radarhere Loading

florath commented Dec 4, 2023

homm Dec 4, 2023

Choose a reason for hiding this comment

florath Dec 4, 2023

Choose a reason for hiding this comment

radarhere Dec 5, 2023

Choose a reason for hiding this comment

florath Dec 5, 2023

Choose a reason for hiding this comment

hugovk left a comment

Choose a reason for hiding this comment

hugovk commented Dec 5, 2023

radarhere commented Dec 6, 2023

florath commented Dec 2, 2023 •

edited by radarhere

Loading