Very slow rendering of two-layered high-resolution PDF #17427

Open
mmatela opened this issue Dec 15, 2023 · 7 comments

Comments

@mmatela

mmatela commented Dec 15, 2023

Attach (recommended) or Link to PDF file here: https://sbc.org.pl/Content/403918/PDF/539_2671.pdf

Configuration:

  • Web browser and its version: Recent Chrome
  • Operating system and its version: Windows 11, up to date
  • PDF.js version: 4.0.269 [f4b396f]
  • Is a browser extension: no

Steps to reproduce the problem:

  1. Just open the file

What is the expected behavior?
The document should be displayed in a timely manner.

What went wrong?
It takes around 20 seconds on my high-end laptop to display the document. It feels around 50 times slower than Chrome's built-in PDF viewer. Changing the zoom level also takes a long time.

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):
https://sbc.org.pl/formats/pdf/web/viewer.html?file=https%3A%2F%2Fsbc.org.pl%2FContent%2F403918%2FPDF%2F539_2671.pdf%3Fhandler%3Dpdf#locale=pl

@forensmatt

The image is slow because it uses JPEG2000 and JBIG2 compression. If you recompress to JPEG you will see a dramatic improvement in loading speed.

@slaFFik

slaFFik commented Apr 21, 2024

@forensmatt May I ask how you determined that the PDF contains images in those formats? I think I sometimes run into this issue as well, and I'm currently trying to understand how to mitigate it, or at least how to instruct users.

@forensmatt

I use the preflight tools in Acrobat Pro (2020). I'm pretty sure you can get similar info through Foxit Editor, which has a free trial: https://www.foxit.com/pdf-editor/

@slaFFik

slaFFik commented Apr 27, 2024

Thank you, @forensmatt.

I have also found that this tool, https://www.metadata2go.com/view-metadata, can show how the images inside a PDF are encoded.

Upload the file (obviously not a confidential one), scroll down to the pdf_images section of the results page, and check the value of the encoding attribute.

If it's jpx or jbig2, the PDF may benefit from optimization.
Ideally, I guess, the value should be jpeg with interpolation=false.
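
For files that should not be uploaded anywhere, a rough local check is also possible. The following is only a minimal sketch, not a real PDF parser: it scans the raw bytes for the standard stream filter names (`JPXDecode` for JPEG2000, `JBIG2Decode` for JBIG2, `DCTDecode` for JPEG, `CCITTFaxDecode`). The function name `count_image_filters` is made up for illustration, and the approach assumes the image dictionaries are stored uncompressed in the file; anything kept inside compressed object streams will be missed.

```python
import re
from collections import Counter

# Standard PDF stream filter names mapped to the codec they indicate.
IMAGE_FILTERS = {
    b"JPXDecode": "JPEG2000",
    b"JBIG2Decode": "JBIG2",
    b"DCTDecode": "JPEG",
    b"CCITTFaxDecode": "CCITT fax",
}

# Matches a PDF name token such as /JPXDecode and captures the part after the slash.
_NAME = re.compile(rb"/([A-Za-z0-9]+)")


def count_image_filters(pdf_bytes: bytes) -> Counter:
    """Count image-codec filter names found in raw PDF bytes.

    Heuristic only (hypothetical helper): dictionaries inside compressed
    object streams are not seen, and a filter name that happens to occur
    inside binary stream data would be counted as a false positive.
    """
    counts = Counter()
    for match in _NAME.finditer(pdf_bytes):
        codec = IMAGE_FILTERS.get(match.group(1))
        if codec is not None:
            counts[codec] += 1
    return counts
```

To try it, read the file in binary mode, for example `count_image_filters(open("539_2671.pdf", "rb").read())`, and look at which codecs show up. A non-zero count for JPEG2000 or JBIG2 suggests the file is a candidate for the recompression discussed above.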

@forensmatt

Thank you, @slaFFik, for the link to that tool. Very helpful. I am looking forward to seeing how much the integration of the OpenJPEG decoder will improve things and reduce or even eliminate the need for optimization, as JBIG2 and JPEG2000 are far more efficient than JPEG (see the discussion in #17946). I'm not sure whether these changes have been implemented in the pdf.js web demo (it reports version 4.2.53, while the latest release is 4.1.392 and does not appear to have these changes), but a quick test appears to show significant speed improvements over the older implementation of PDF.js used in Zotero 7 Beta.

@mmatela
Author

mmatela commented May 6, 2024

I've tried the 4.2.67 release and it still takes very long to render my example PDF :/

@forensmatt

@mmatela, perhaps your file has a mask layer. I believe I saw comments by the devs that this update did not address the slowness of masks.
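
One rough way to test the mask hypothesis locally is to look for the PDF dictionary keys that introduce masks: `/SMask` (soft mask), `/Mask` (explicit or colour-key mask) and `/ImageMask true` (stencil mask). The sketch below is an assumption-laden heuristic, not a parser: the helper name `count_mask_keys` is invented for illustration, it only sees dictionaries stored uncompressed in the file, and entries such as `/SMask /None` in graphics state dictionaries are counted even though they disable masking.

```python
import re

# Byte patterns for the PDF keys that introduce image masks.
MASK_PATTERNS = {
    "SMask": re.compile(rb"/SMask\b"),                # soft (alpha) mask
    "Mask": re.compile(rb"/Mask\b"),                  # explicit or colour-key mask
    "ImageMask": re.compile(rb"/ImageMask\s+true\b"), # 1-bit stencil mask
}


def count_mask_keys(pdf_bytes: bytes) -> dict:
    """Return how often each mask-related key occurs in raw PDF bytes.

    Heuristic only (hypothetical helper): keys inside compressed object
    streams are missed, and matches inside binary data are false positives.
    """
    return {
        name: len(pattern.findall(pdf_bytes))
        for name, pattern in MASK_PATTERNS.items()
    }
```

Calling `count_mask_keys(open("539_2671.pdf", "rb").read())` and seeing non-zero counts would be consistent with the file using masks, though it does not prove that masks are what makes rendering slow.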
