Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java2dProcessor performance on tiled pyramidal (jpeg) TIFFs #296

Closed
jbarth-ubhd opened this issue Jul 5, 2019 · 25 comments
Closed

Java2dProcessor performance on tiled pyramidal (jpeg) TIFFs #296

jbarth-ubhd opened this issue Jul 5, 2019 · 25 comments

Comments

@jbarth-ubhd
Copy link

jbarth-ubhd commented Jul 5, 2019

Dear reader, we have a tiled, pyramidal internally jpeg compressed tif of 9557×36619 px size.

When requesting region of x=2671 y=15179 w=6170 h=3471 at pct:31, cantaloupe is correctly getting the 50%-size image within the tiff, getting a 3085×1735 size image which has now to be scaled (bicubic) to 1913×1076px to match the »pct:31« above.

This takes 2.5 seconds according to log on an Intel Core i7-4790 CPU @ 3.60GHz.
Imagemagick requires about 0.1 s for this (Q16, no hdri), and this snippet in java does take 0.21 s for downsampling:

    BufferedImage before=null;
    try { before=ImageIO.read(new File("test.bmp")); // GIF, PNG, JPEG, BMP, and WBMP
    } catch(IOException e) {
      System.out.println(e);
    }
    double fx=1913./3085;
    double fy=1076./1735;

    AffineTransform at = new AffineTransform();
    at.scale(fx, fy);
    AffineTransformOp scaleOp = new AffineTransformOp(at, AffineTransformOp.TYPE_BICUBIC);
    long start=System.nanoTime();
    BufferedImage after = scaleOp.filter(before, null);
    long stop=System.nanoTime();
    System.out.println((stop-start)*1e-9 + " s");
    try {
      System.out.println(after.getWidth());
      File outfile=new File("saved.png"); // JPEG, PNG, GIF, BMP and WBMP
      ImageIO.write(after, "png", outfile);
    } catch(IOException e) {
      System.out.println(e);
    }

What could be done to speed up Java2dProcessor?

ImageMagickProcessor takes about 97 s, because identify reads in the whole image,
GraphicsMagickProcessor takes about 4 s...
and JaiProcessor does use NearestNeighbour because of a Jai bug...

Kind regards,
Jochen

@adolski
Copy link
Contributor

adolski commented Jul 22, 2019

Hi @jbarth-ubhd,

Can you share the TIFF file you are using?

@jbarth-ubhd
Copy link
Author

jbarth-ubhd commented Jul 23, 2019

vips-64.tif

generated with vips im_vips2tiff ~/Tafel_24.tif vips-64.tif:jpeg:80,tile:64x64,pyramid using vips-8.4.5-Wed Jan 18 12:36:44 UTC 2017

Link to our presentation: Piranesi1770 Trofeo... Tafel_24

I've chosen 64×64px wide tiles because

We recommend choosing TileWidth and TileLength such that the resulting tiles are about 4K to 32K bytes before compression. This seems to be a reasonable value for most applications and compression schemes

@adolski
Copy link
Contributor

adolski commented Jul 23, 2019

Thanks, I've got it and will try to take a look soon.

@kaij
Copy link
Contributor

kaij commented Mar 3, 2020

We see the same problem with pyramidal tiffs. Further analysis indicates, that this seems not directly related to the performance of scaling.

Using source JPG with TurboJpegProcessor gives the following output (bicubic scaling takes 273ms, overall processing <500ms):

DEBUG e.i.l.c.r.ImageRepresentation - Derivative cache not available; writing directly to the response
DEBUG e.i.l.c.p.Java2DUtil - scale(): scaled 2390x3466 image to 2000x2900 using a BiCubic filter in 273 msec
DEBUG e.i.l.c.r.ImageRepresentation - TurboJpegProcessor processed in 369 msec: 002_2000_000_0005.jpg_1:1_scale:2000,,bicubic_encode:jpg_JPEG_90_#FFFFFF_8

Using a pyramidal TIFF (converted from the jpeg) results in this (bicubic scaling takes >500ms, overall processing >1s):

DEBUG e.i.l.c.r.ImageRepresentation - Derivative cache not available; writing directly to the response
DEBUG e.i.l.c.p.c.TIFFImageReader - Acquiring region 0,0/2390x3466 from 2390x3466 image (256x256 tile size)
DEBUG e.i.l.c.p.Java2DUtil - scale(): scaled 2390x3466 image to 2000x2900 using a BiCubic filter in 552 msec
DEBUG e.i.l.c.p.c.JPEGImageWriter - ImageIO plugin preferences: com.sun.imageio.plugins.jpeg.JPEGImageWriter
DEBUG e.i.l.c.p.c.JPEGImageWriter - Using com.sun.imageio.plugins.jpeg.JPEGImageWriter
DEBUG e.i.l.c.p.c.JPEGImageWriter - Quality: 90; progressive: false
DEBUG e.i.l.c.r.ImageRepresentation - Java2dProcessor processed in 1013 msec: 002_2000_000_0005.tif_1:1_scale:2000,,bicubic_encode:jpg_JPEG_90_#FFFFFF_8

I then compared performance of the same pyramidal tiff to IIPImage Server (using bilinear "interpolated" scaling). Which results in overall processing of <500ms (Factor 2!)

It seems that without JAI, cantaloupe can't handle pyramidal tiffs anymore in an efficient way. I'm using a dockerized image based on OpenJDK-11 and Hotspot.

Any ideas on this?

@adolski
Copy link
Contributor

adolski commented Mar 3, 2020

I looked into this briefly a long time ago and my recollection is that it had something to do with the way the GeoSolutions TIFF Image I/O plugin creates BufferedImages, which ties into the idiosyncrasies of java.awt.image.BufferedImage, in particular its "types" and their suitability for Graphics2D processing. Basically the plugin tries to produce a BufferedImage with a ColorModel and SampleModel that are faithful to the source TIFF, but in many/most cases, this results in an image of "custom" type which Graphics2D is not optimized or able to process, requiring costly internal conversion.

JAI doesn't suffer from this problem because it uses a different processing engine.

A solution might involve either forking the plugin or moving to a different TIFF reading technique, but both of these will require time that I don't have currently.

@kaij
Copy link
Contributor

kaij commented Mar 4, 2020

Thanks @adolski for this very helpful insight! Now it makes sense to me that the scaling takes longer... I will have a look at it, but don't know if I have time for it either.

I hope pyramidal tiffs can soon be replaced with HTJ2K now that Grok supports it fully (see https://github.com/GrokImageCompression/grok/releases - I wonder whether I can maybe just use Cantaloupe's OpenJPEG integration with grk_decompress?)

@kaij
Copy link
Contributor

kaij commented Mar 4, 2020

Actually went ahead and tried the just released version 4.1 of Grok (custom compile, symlinked opj_compress, commented out the "quiet" flag in the OpenJpegProcessor.java in Cantaloupe).

Looks very promising on first impression - however I only compared on JPG sources for the moment (lossy HT "grk_compress -q 40 -M 32 -i 001_1974_000_0831.jpg -o 001_1974_000_0831.jp2"). Similar performance to TurboJpeg when using region display, a bit slower when displaying the full image. Decoding speed seems to depend heavily on compression parameters.

Even though it's writing out the intermediate bmp, it's still much faster than using pyramidal tiff (with scaling the problem described above). So maybe it's a longer-term option to integrate grok :-)

DEBUG e.i.l.c.p.OpenJpegProcessor - Invoking /usr/bin/opj_decompress -i /var/lib/cantaloupe/images/001_1974_000_0831.jp2 -o /tmp/OpenJpegProcessor-f6cd0d7b-96a1-4b57-be64-375addc4c133.bmp
DEBUG e.i.l.c.p.c.b.BMPImageReader - Using com.sun.imageio.plugins.bmp.BMPImageReader
DEBUG e.i.l.c.p.Java2DUtil - scale(): scaled 2735x3845 image to 1400x1968 using a BiCubic filter in 120 msec
DEBUG e.i.l.c.p.c.ImageWriterFacade - Writing with edu.illinois.library.cantaloupe.processor.codec.jpeg.TurboJPEGImageWriter
DEBUG e.i.l.c.r.ImageRepresentation - OpenJpegProcessor processed in 517 msec: 001_1974_000_0831.jp2_1:1_scalebypixels:1400,,bicubic_encode:jpg_JPEG_90_#FFFFFF_8_e95198c48292fa4c8fc0617e850c1f6d

@adolski
Copy link
Contributor

adolski commented Mar 4, 2020

@kaij I'm not very familiar with Grok, but if it works, then great.

Longer-term I would like to call directly into the OpenJPEG library rather than relying on opj_decompress, but if Grok is maintaining API compatibility with OpenJPEG, then there is a good chance that will work too.

@jbarth-ubhd
Copy link
Author

jbarth-ubhd commented Apr 9, 2021

Still the same problem (cantaloupe 4.1.5): We'll have an image, size 26315×19908 px, pyramidal tiff:

root@serv7:/srv/assets/assets/orig2/0/605000/605909# tiffinfo 1392039ec03494ade4b39060c02d64ecd1761951|grep -i width
  Image Width: 26315 Image Length: 19908
  Tile Width: 96 Tile Length: 96
  Image Width: 13157 Image Length: 9954
  Tile Width: 96 Tile Length: 96
  Image Width: 6578 Image Length: 4977
  Tile Width: 96 Tile Length: 96
  Image Width: 3289 Image Length: 2488
  Tile Width: 96 Tile Length: 96
  Image Width: 1644 Image Length: 1244
  Tile Width: 96 Tile Length: 96
  Image Width: 822 Image Length: 622
  Tile Width: 96 Tile Length: 96
  Image Width: 411 Image Length: 311
  Tile Width: 96 Tile Length: 96
  Image Width: 205 Image Length: 155
  Tile Width: 96 Tile Length: 96
  Image Width: 102 Image Length: 77
  Tile Width: 96 Tile Length: 96
  Image Width: 51 Image Length: 38
  Tile Width: 96 Tile Length: 96

Contains 1/1 … 1/512 image size. Now the requests (abbreviated + reformatted) from Firefox (Mirador viewer) with timings and image size fraction:

…/    0,    0,26315,19908/412,/… 2.0s 1/63.871359
…/    0,    0,24576,19908/768,/… 2.0s 1/32.000000
…/24576,    0, 1739,19908/ 55,/… 2.4s 1/31.618182
…/12288,    0,12288,12288/768,/… 2.7s 1/16.000000
…/    0,    0,12288,12288/768,/… 2.7s 1/16.000000
…/12288,12288,12288, 7620/768,/… 2.5s 1/16.000000
…/    0,12288,12288, 7620/768,/… 2.2s 1/16.000000
…/24576,    0, 1739,12288/109,/… 1.9s 1/15.954128
…/24576,12288, 1739, 7620/109,/… 1.7s 1/15.954128
…/12288, 6144, 6144, 6144/768,/… 2.2s 1/8.000000
…/ 6144, 6144, 6144, 6144/768,/… 2.3s 1/8.000000
…/12288,12288, 6144, 6144/768,/… 2.7s 1/8.000000
…/ 6144,12288, 6144, 6144/768,/… 2.5s 1/8.000000
…/12288,    0, 6144, 6144/768,/… 2.6s 1/8.000000
…/ 6144,    0, 6144, 6144/768,/… 2.5s 1/8.000000
… … …

Multiple seconds, even at 1/1.000000 zoom.

PS: changed to "wait" timings.

@DiegoPino
Copy link
Contributor

@jbarth-ubhd can you share how much memory you have allocated for Cantaloupe? What caching is in place? Running this via jetty or in Tomcat? Which Java? Also version 4.1.5 is bit behind so I would compare at least with 4.1.7 if you are not ready for 5.0 (new configuration options/settings needed).

There can be many factors in place too, from a slow Filesystem to many other concurrent connections. I would love to test your source image with other processors in one our system to compare.

I'm not sure why (really not) but we had some TIFF speed issues (but not pyramidal ones) with Mirador v/s Open Sea dragon. It may be related to the tile sizes requested but since I have not explanation yet this is more like a comment without background.

@jbarth-ubhd
Copy link
Author

jbarth-ubhd commented Apr 13, 2021

# Ubuntu 18.04
/usr/lib/jvm/java-1.11.0-openjdk-amd64/bin/java -Dcantaloupe.config=/usr/local/diglit/iiif/cantaloupe/config/cantaloupe⸗
.properties -Xmx2g -jar /usr/local/cantaloupe/cantaloupe-4.1.8.war

Caching should not be the problem because requested zoom level exists in pyramidal tiff. Jetty. 4.1.8: the same timings.

Did enable "debug" logging:

… e.i.l.c.p.Java2DUtil [Java2DUtil.java:1004] scale(): scaled 1536x1536 image to 768x768 ⸗
using a BiCubic filter in 1228 msec
… e.i.l.c.p.Java2DUtil [Java2DUtil.java:1004] scale(): scaled 1536x1536 image to 768x768 ⸗
using a BiCubic filter in 1234 msec
… …

But 1536 / 768 = 2 — should be in pyramidal tiff.

@jbarth-ubhd
Copy link
Author

jbarth-ubhd commented Apr 13, 2021

What means this: … e.i.l.c.s.DelegateProxy [DelegateProxy.java:367] invokeUncached(): httpsource_resource_info() returned {"uri"=>"https://heidicon.ub.uni-heidelberg.de/eas/partitions/3/0/605000/605909/1392039ec03494ade4b39060c02d64ecd1761951/image/tiff/Rubens-tile94-deflate.tif"} for args: (none) in 2844 msec ?

PS: does this mean that returning the uri from DelegateProxy took 2844 msec?

@jbarth-ubhd
Copy link
Author

Pyramidal tiff (for test purposes) is here: https://digi.ub.uni-heidelberg.de/diglitData/v/Rubens-tile94-deflate.tif

@jbarth-ubhd
Copy link
Author

jbarth-ubhd commented Apr 13, 2021

Oops... our delegate needs indeed multiple seconds.

@jbarth-ubhd
Copy link
Author

besides the very slow response times of my delegate.rb, delegate_script.cache.enabled = true does not seem to work, the id is resolved multiple times. Is there something to consider in delegate.rb to make sure caching works?

PS: checked caching setting with endpoint.api.enabled = true.

PS2: why is caching removed from configuration in cantaloupe 5?

@adolski
Copy link
Contributor

adolski commented Apr 13, 2021

@jbarth-ubhd The delegate method invocation cache was added some time ago when the delegate methods relied on arguments instead of the context. Adding the context broke it and it has probably been broken throughout the 4.1 series. This was just something that fell off my radar.

In 5.0, the added richness of the context makes computing reliable cache keys difficult. I decided that I didn't want to support this feature anymore so I pulled it. If you need to employ caching, I recommend implementing your own cache.

@jbarth-ubhd
Copy link
Author

Did implement some primitive caching, now timings are much better:

…/1139791:605909/    0,    0,26315,19908/412,/… 0.2s 1/63.871359 [200]
…/1139791:605909/    0,    0,24576,19908/768,/… 0.3s 1/32.000000 [200]
…/1139791:605909/24576,    0, 1739,19908/ 55,/… 0.2s 1/31.618182 [200]
…/1139791:605909/12288,    0,12288,12288/768,/… 0.6s 1/16.000000 [200]
…/1139791:605909/    0,    0,12288,12288/768,/… 0.7s 1/16.000000 [200]
…/1139791:605909/12288,12288,12288, 7620/768,/… 0.7s 1/16.000000 [200]
…/1139791:605909/    0,12288,12288, 7620/768,/… 0.5s 1/16.000000 [200]
…/1139791:605909/24576,    0, 1739,12288/109,/… 0.6s 1/15.954128 [200]
…/1139791:605909/24576,12288, 1739, 7620/109,/… 0.3s 1/15.954128 [200]
…/1139791:605909/ 6144, 6144, 6144, 6144/768,/… 0.7s 1/8.000000 [200]
…/1139791:605909/12288, 6144, 6144, 6144/768,/… 0.6s 1/8.000000 [200]
…/1139791:605909/ 6144,12288, 6144, 6144/768,/… 0.6s 1/8.000000 [200]
…/1139791:605909/12288,12288, 6144, 6144/768,/… 0.8s 1/8.000000 [200]
…/1139791:605909/    0, 6144, 6144, 6144/768,/… 0.6s 1/8.000000 [200]
…/1139791:605909/18432, 6144, 6144, 6144/768,/… 0.5s 1/8.000000 [200]
…/1139791:605909/    0,12288, 6144, 6144/768,/… 0.5s 1/8.000000 [200]
… … …

@jbarth-ubhd
Copy link
Author

Cantaloupe still scales from 1536 to 768 (÷2), which is in pyramidal tiff(?)

But the requested size was 3072×3072... what has 1536 to do with this?

2021-04-14 09:21:46,713 INFO [qtp586617651-27] e.i.l.c.r.i.v.ImageResource [⸗
  AbstractResource.java:117] Handling GET /iiif/2/1139791:605909/0,3072,3072,3072⸗
  /768,/0/default.jpg
…
2021-04-14 09:21:47,939 DEBUG [qtp586617651-27] e.i.l.c.p.Java2DUtil [⸗
  Java2DUtil.java:1004] scale(): scaled 1536x1536 image to 768x768 using a ⸗
  BiCubic filter in 82 msec
…
2021-04-14 09:21:47,967 DEBUG [qtp586617651-27] e.i.l.c.r.ImageRepresentation [⸗
  ImageRepresentation.java:192] Java2dProcessor processed in 1197 msec: 1139791:⸗
  605909_1:1_cropbypixels:0,3072,3072,3072_scale:768,,bicubic_encode:⸗
  jpg_JPEG_80_interlace_#FFFFFF_8
2021-04-14 09:21:47,967 DEBUG [qtp586617651-27] e.i.l.c.r.HandlerServlet [⸗
  HandlerServlet.java:124] Responded to GET /iiif/2/1139791:605909/0,3072,3072,⸗
  3072/768,/0/default.jpg with HTTP 200 in 1255 msec

@adolski
Copy link
Contributor

adolski commented Apr 14, 2021

@jbarth-ubhd The dimensions of the individual pyramid levels should be logged in there somewhere, if you can set log.application.level = trace. If that doesn't show anything interesting, could you post the file so i can view it?

@jbarth-ubhd
Copy link
Author

jbarth-ubhd commented Apr 14, 2021

@adolski: #296 (comment)

PS: instead of own cache implementation I'll use this tomorrow:

require 'lru_redux'
$cache=LruRedux::TTL::ThreadSafeCache.new(maxItems, maxTTL)
print $cache.getset('kkkey'){
  sleep 1 # something complicated
  1 # return value
}

@jbarth-ubhd
Copy link
Author

@adolski: we'll change tile size from 96 to 256. Let's see …

@jbarth-ubhd
Copy link
Author

jbarth-ubhd commented Apr 15, 2021

How do I add additional ruby libraries?

Installed lru_redux with Gemfile + bundle install --path vendor/bundle.

Tried java -cp /usr/local/.../iiif/cantaloupe/config/vendor/bundle/ruby/2.5.0/gems/lru_redux-1.1.0/lib and java -Dorg.jruby.embed.class.path=/usr/local/.../iiif/cantaloupe/config/vendor/bundle/ruby/2.5.0/gems/lru_redux-1.1.0/lib - neither worked (not even searched for lru_redux.rb in this path, checked with strace -f open,openat,stat ...)

@jbarth-ubhd
Copy link
Author

jbarth-ubhd commented Apr 28, 2021

It seems that the slower requests come from from requests, where the "tile" of the requested region is not a integer number (!!!). Median 90 msec for "integer" requests, median 271 msec for non-"integer":

 62 msec; tile 128.63, 25.00;     …-1832] …cropbypixels:26112,12800,203,512_scale:203,,bicubic…
 64 msec; tile  12.00, 20.00;     …-1939] …cropbypixels:6144,10240,512,512_scale:512,,bicubic…
 64 msec; tile  12.00, 21.00;     …-1965] …cropbypixels:6144,10752,512,512_scale:512,,bicubic…
 64 msec; tile  44.00, 20.00;     …-1880] …cropbypixels:22528,10240,512,512_scale:512,,bicubic…
...
500 msec; tile  12.00,  5.00;     …-1880] …cropbypixels:24576,10240,1024,1024_scale:512,,bicubic…
501 msec; tile   0.00,  0.50; !!! …-1969] …cropbypixels:0,1024,1024,1024_scale:512,,bicubic…
502 msec; tile  10.50,  8.00; !!! …-1832] …cropbypixels:21504,16384,1024,1024_scale:512,,bicubic…
507 msec; tile   3.53,  1.25; !!! …-1831] …cropbypixels:24576,10240,1739,2048_scale:435,,bicubic…
516 msec; tile   1.00,  1.55; !!! …-1831] …cropbypixels:2048,2048,1024,662_scale:512,,bicubic…
518 msec; tile   0.50,  0.75; !!! …-1965] …cropbypixels:4096,6144,2048,2048_scale:512,,bicubic…
524 msec; tile   1.00,  0.50; !!! …-1936] …cropbypixels:2048,1024,1024,1024_scale:512,,bicubic…
527 msec; tile   0.33,  0.00; !!! …-1937] …cropbypixels:2048,0,1552,2048_scale:388,,bicubic…
529 msec; tile   1.00,  0.00;     …-1935] …cropbypixels:2048,0,1024,1024_scale:512,,bicubic…
537 msec; tile  10.50,  7.50; !!! …-1811] …cropbypixels:21504,15360,1024,1024_scale:512,,bicubic…
538 msec; tile   2.50,  4.00; !!! …-1811] …cropbypixels:5120,8192,1024,1024_scale:512,,bicubic…
541 msec; tile   0.50,  1.55; !!! …-1965] …cropbypixels:1024,2048,1024,662_scale:512,,bicubic…
546 msec; tile   0.75,  0.75; !!! …-1969] …cropbypixels:6144,6144,2048,2048_scale:512,,bicubic…
562 msec; tile  11.00,  8.00;     …-1935] …cropbypixels:22528,16384,1024,1024_scale:512,,bicubic…
565 msec; tile  12.00,  5.50; !!! …-1969] …cropbypixels:24576,11264,1024,1024_scale:512,,bicubic…

@jbarth-ubhd
Copy link
Author

jbarth-ubhd commented Apr 29, 2021

Let's take the last request on the list above:

2021-04-16 08:23:21,562 INFO [qtp586617651-1969] e.i.l.c.r.i.v.ImageResource [AbstractResource.java:117] Handling GET /⸗
iiif/2/1139791:708192/24576,11264,1024,1024/512,/0/default.jpg
...
2021-04-16 08:23:21,581 DEBUG [qtp586617651-1969] e.i.l.c.s.HttpSource [HTTPStreamFactory.java:84] newSeekableStream():⸗
 using 524288-byte chunks
2021-04-16 08:23:21,582 DEBUG [qtp586617651-1969] e.i.l.c.p.c.TIFFImageReader [AbstractIIOImageReader.java:93] Using it⸗
.geosolutions.imageioimpl.plugins.tiff.TIFFImageReader
2021-04-16 08:23:21,582 DEBUG [qtp586617651-1969] e.i.l.c.p.c.TIFFImageReader [AbstractIIOImageReader.java:106] Ignorin⸗
g metadata? true
2021-04-16 08:23:21,634 DEBUG [qtp586617651-1969] e.i.l.c.p.c.TIFFImageReader [AbstractIIOImageReader.java:506] Acquiri⸗
ng region 24576,11264/1024x1024 from 26315x19908 image (256x256 tile size)
2021-04-16 08:23:22,130 DEBUG [qtp586617651-1969] e.i.l.c.p.Java2DUtil [Java2DUtil.java:1004] scale(): scaled 1024x1024⸗
 image to 512x512 using a BiCubic filter in 233 msec

Why does cantaloupe scale? The image in the second tiff directory has the wanted scale, and the request perfectly matches the 256×256 tiles in this image.

@jbarth-ubhd
Copy link
Author

I'll open the last question in a new case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants