Support gzip in range request / Explicitly set accept-encoding: identity #11027
Comments
This appears to link to
Does it work correctly if you use the PDF.js library directly, to rule out a bug in the "react-pdf" library?
Yes, assuming that the server both supports Range Requests and is correctly configured.[1]
Most likely not, since this isn't really relevant to using Range Requests and would also require you to write/implement all the code to actually provide the necessary PDF data to the PDF.js API.[2]
[1] With the exception of files smaller than
[2] |
Thanks for your quick reply! I've updated the URL not to link to localhost 🙈. Here's a codepen that uses It, too, waits for the entire PDF to be downloaded before it renders anything. I don't see range headers in the requests by pdf.js. |
When putting a break-point at pdf.js/src/display/network_utils.js Line 47 in d909b86
pdf.js file), the result is null which would thus suggest that the server isn't configured properly. Hence this indicates a problem with the server, which we cannot really support, rather than the PDF.js library itself.
Also, in your example you're using an older and non-official version of the PDF.js library and it's recommended to always use the latest available release. |
Closing as answered since most likely PDF.js is not at fault here. If you can show it is, then please provide more details and we'll reopen. |
Thanks for the quick replies and the help! I've tested the server, but it seems to support the headers:
HEAD https://storage.googleapis.com/ori-static/api.notubiz.nl/document/6131301 HTTP/1.1
HTTP/1.1 200 OK
....
Accept-Ranges: bytes
....
I've updated the codepen example to use the latest version of PDF.js (2.2.228). It seems to have the same behavior as before. When I use the network inspector, I don't see a HEAD request at all; the only request to the PDF file is a simple GET, without any Range headers. I'd love to see a working example that successfully shows the first page before the document is loaded. |
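An Accept-Ranges: bytes header only advertises range support. A more direct check is to send a GET with a Range header and look at the status code: 206 Partial Content with a Content-Range header means the range was honored, while a plain 200 means the server ignored it and sent the whole file. The helper below is a minimal sketch of that classification; interpretRangeProbe is a hypothetical name and not part of PDF.js.

```javascript
// Hypothetical helper (not a PDF.js API): classify the response to a GET
// that was sent with the request header "Range: bytes=0-0".
function interpretRangeProbe(status, headers) {
  // HTTP header names are case-insensitive, so normalise the keys first.
  const normalised = {};
  for (const [name, value] of Object.entries(headers)) {
    normalised[name.toLowerCase()] = String(value);
  }

  // A satisfied range request answers 206 and describes the returned slice,
  // e.g. "Content-Range: bytes 0-0/12345" (or "/*" if the total is unknown).
  const contentRange = normalised["content-range"] || "";
  if (status === 206 && /^bytes \d+-\d+\/(\d+|\*)$/.test(contentRange)) {
    return "range-honored";
  }

  // A 200 means the Range header was ignored and the full body was sent.
  if (status === 200) {
    return "range-ignored";
  }

  // Anything else (416, redirects, errors) needs a closer look.
  return "inconclusive";
}
```

To use it, issue the ranged GET with any HTTP client, then pass in the response status and a plain object of the response headers.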
It actually appears that CodePen somehow adds to the problem here, since it appears to strip some of the relevant response headers!? When opening https://storage.googleapis.com/ori-static/api.notubiz.nl/document/6131301 directly in Firefox with the built-in PDF Viewer, which is based on PDF.js, Range Requests aren't supported either. However, here it's at least possible to debug properly and this is what I've found:
|
Nice find, thanks! Perhaps pdf.js should set an Unfortunately, Google's Storage API does not handle the Accept-Encoding header correctly: it ignores the Range header when Accept-Encoding is set to identity. But that is not an issue with pdf.js, of course. EDIT: I've submitted this to the Google Cloud Issue tracker. To clarify: this bug does not cause the issue of this topic, but it might cause another one that could appear after this issue is solved.
|
It looks like the So I think PDF.js either should support gzip, or it should set @timvandermeij Now it does seem like pdf.js might be at fault here. Could you perhaps re-open the issue? |
Searching through the PDF.js library for the string "Accept-Encoding" yields zero results, so I honestly don't understand how the PDF.js library could be at fault here!? |
It appears that the
The lack of an explicit Accept-Encoding, along with no support for |
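The header conditions discussed in this thread can be sketched as a single check on the initial response. This is a simplified illustration in the spirit of the capability check referenced in network_utils.js, not a copy of the PDF.js code. The names canUseRangeRequests and headerLookup are hypothetical, and the minimum file size mentioned earlier in the thread is deliberately left out. The Content-Encoding test reflects the reasoning here: byte ranges address the encoded (for example gzipped) bytes, so offsets into the PDF can no longer be requested directly.

```javascript
// Hypothetical helper (not a PDF.js API): decide from the initial response
// headers whether follow-up Range requests look safe to use.
// `getHeader` takes a header name and returns its value, or null if absent.
function canUseRangeRequests(getHeader) {
  const result = { allowRangeRequests: false, length: undefined };

  // The total size must be known in order to address byte ranges.
  const length = parseInt(getHeader("Content-Length"), 10);
  if (!Number.isInteger(length)) {
    return result;
  }
  result.length = length;

  // The server must advertise byte-range support.
  if (getHeader("Accept-Ranges") !== "bytes") {
    return result;
  }

  // A compressed body (e.g. "gzip") means ranges refer to compressed bytes,
  // so only an absent or "identity" Content-Encoding is accepted.
  const encoding = getHeader("Content-Encoding") || "identity";
  if (encoding !== "identity") {
    return result;
  }

  result.allowRangeRequests = true;
  return result;
}

// Small adapter: build a case-insensitive lookup from a plain header object.
function headerLookup(headers) {
  const normalised = {};
  for (const [name, value] of Object.entries(headers)) {
    normalised[name.toLowerCase()] = String(value);
  }
  return (name) => {
    const key = name.toLowerCase();
    return key in normalised ? normalised[key] : null;
  };
}
```

With headers like the ones described above (Accept-Ranges: bytes together with a gzip Content-Encoding), this check returns allowRangeRequests: false, which matches the full-download behavior reported in this issue.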
The amount of work, and the added complexity, required to actually support Whether it's desirable to modify the PDF.js library to explicitly set the However, as you're hopefully already aware of it's already possible to provide custom Line 133 in be70ee2
|
If that IRC response is true:
Then there is even more reason to explicitly set the |
Let's reopen this. @yurydelendik @brendandahl Do you have any comments on #11027 (comment)? |
According to the specifications, see e.g. the XMLHttpRequest specification which links to the relevant part of the Fetch specification, the With regards to adding support for All-in-all, it seems that WONTFIX probably is the appropriate resolution for this issue. |
I'm using react-pdf, which in turn uses pdf.js (awesome library, guys!), for a web application that shows open governmental meeting documents (demo).
I noticed that for large PDF files, loading takes a while, and it seems like pdf.js downloads the entire file before rendering the first page.
The docs mention that PDF.js fetches only the necessary data, if the server supports Content-Range header / Range requests.
The server (Google Cloud Storage) seems to support ranges (example file).
The maintainer of react-pdf told me it should work if we simply pass a URL to pdfjs.getDocument, but it does not.
Is it true that getDocument should use the Range header if only a URL is passed?
Am I missing something? Should I manually create a PDFDataRangeTransport?
Thanks in advance for your time!