-
-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed the support for byte range requests #360
Conversation
The second component of a byte range, if present, designates the index of the last byte to be included in the partial response.
In case of a partial response the size of the response is different from the served entry size.
After this commit valid ranges of the form "bytes=firstbyte-lastbyte" should be handled correctly.
Codecov Report
@@ Coverage Diff @@
## master #360 +/- ##
=======================================
Coverage 57.19% 57.19%
=======================================
Files 48 48
Lines 3614 3614
Branches 1809 1809
=======================================
Hits 2067 2067
Misses 1538 1538
Partials 9 9 Continue to review full report at Codecov.
|
Also renamed the `range_pair` data member of `RequestContext` to `byteRange_`
MHD_HTTP_RANGE_NOT_SATISFIABLE is not defined in the older version of libmicrohttpd (that is used under CI/Linux native_dyn).
ERROR is a macro under Windows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is maybe a discussion about your question, but otherwise the code is good.
src/server/response.cpp
Outdated
// XXX: Which of the Accept-Encoding and Range headers has precedence if both | ||
// XXX: are present? Can partial content be compressed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Searching (a bit) the internet, I've found nothing about this.
I would say that yes, partial content can be compressed.
I think we should resolve the byteRange first and then compress the content if needed.
However, our use case here is that we need byteRange for media (video) and they are not compressible.
I don't know if we should handle both case in the same time (but we must be sure to answer with the correct headers anyway)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum... I'm not sure what I've reviewed. I've just comment this lines and github say me this is outdated...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Searching (a bit) the internet, I've found nothing about this.
Same for me. I guess I better have mentioned that in the PR description.
I think we should resolve the byteRange first and then compress the content if needed.
The interaction of the Accept-Encoding
and Range
is a little tricky (and it's strange that there is no formal specification of what the correct interpretation should be). Suppose that the client issues a regular (non-range) request for a large compressible resource (e.g. a huge JSON or XML document). The resource is compressed and streamed to the client but the connection is interrupted at 95%. The client issues a range request to fetch the remaining 5% of the compressed representation of the resource, with the purpose of recombining it with whatever was sent in the first session and being able to decompress it. In such an scenario (which is different from your point of view of how Range
should play with Accept-Encoding
) the load on the server can be excessive - entire resource must be compressed with the purpose of sending only a small part of it.
Note that for partial/interrupted download of a compressed resource the client has no chance to decompress the received part, find out the byte-range for the uncompressed version of the resource and issue a range request for the remaining part.
That's why the safest bet is to not combine compression with partial responses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that for partial/interrupted download of a compressed resource the client has no chance to decompress the received part, find out the byte-range for the uncompressed version of the resource and issue a range request for the remaining part
Why ? That's what I was thinking that what it has to be done.
For me the range is about the uncompressed content. If a connection is lost at 95%, client must uncompress the data it received and ask for what is left to have (in uncompressed range). This range will be compressed again in a new stream. But it cannot ask for the last 5% of the compressed stream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My assumption was that in general it is impossible to decompress a partial archive. That assumption is probably wrong for the compression algorithms/formats used in HTTP. Still, I doubt that if we implement that in kiwix-serve the handling on the client side will be like we expect. Therefore I think that we should not spend any time on this until we know of authoritative formal specification of how it must work.
@mgautierfr I re-requested your review so that you can provide your feedback or approve the latest version. |
I've reviewed the full PR. The code is good. We still have this discussion about |
Will close kiwix/kiwix-tools#148
This PR enables handling of partial content requests with a single byte-range. Requests for two or more byte ranges (even if they effectively constitute a single continuous range) are rejected with a 416 (Range Not Satisfiable) error response. Such behaviour complies with somewhat liberal interpretation of the spec):
Support for
If-Range
conditional requests is not implemented.Partial requests work only for content (entries) from the served ZIM file(s). If both
Range:
andAccept-Encoding: deflate
headers are present in the request, the former has precedence and uncompressed partial content is sent in a 206 response (even if a 200 response to a respective non-range request would have its content compressed).The changes have been validated with new test cases in the server unit-test (as well as some manual experiments with kiwix-serve and curl).
This PR should be easy to review in its final state though some issues with the state of the current code in master are more obvious when looking at the first several commits in the PR history. I didn't spend any time cleaning up the development history, since I believe that this PR can be merged as a single squashed commit.