-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Backport 2.9] Fix range reads in respository-s3 (#9516) #9525
[Backport 2.9] Fix range reads in respository-s3 (#9516) #9525
Conversation
The `readBlob(String, long, long)` method in the S3 repository has been broken since the upgrade to AWS SDK v2. The cause is that the SDK v2 returns the content range length details in a string formatting per the [RFC 9110][1] spec. For example: ``` bytes 0-100/200 ``` However, the code was attempting to parse it as: ``` bytes=0-100 ``` The fix here is to not parse this string at all and instead use `GetObjectResponse#contentLength`. Note that the incorrect format matches how a range is specified in a _request_ per the [byte ranges][2] section of the RFC and that is likely the source of the confusion. We lack any dedicated integration testing of this method so the bug was not caught by any tests. Additionally, the range read is only used by the searchable snapshot feature currently and we have no automated integration testing with the different repository implementations. One other complicating factor is that due to a fallback path that returns `Long.MAX_VALUE - 1` when the range is failed to be parsed, the bug will only manifest as a long overflow error when requesting past the first block and therefore wasn't caught with very simple manual testing. [1]: https://www.rfc-editor.org/rfc/rfc9110.html#name-content-range [2]: https://www.rfc-editor.org/rfc/rfc9110.html#name-byte-ranges Signed-off-by: Andrew Ross <andrross@amazon.com> (cherry picked from commit 0c839c3)
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
@@ Coverage Diff @@
## 2.9 #9525 +/- ##
============================================
- Coverage 70.57% 70.54% -0.04%
+ Complexity 57072 57070 -2
============================================
Files 4746 4746
Lines 270726 270710 -16
Branches 39952 39948 -4
============================================
- Hits 191070 190975 -95
- Misses 63212 63348 +136
+ Partials 16444 16387 -57
|
Note that this PR should only be merged if we get consensus to issue a 2.9.1 patch and that this fix should be included. |
Sorry to ask, will it be included in 2.10 if there is no patch release? A patch release would give me the opportunity to fix the cluster. Happy to help testing btw. Thanks :) |
@ict-one-nl Yes, it will AFAIK. |
Backports 0c839c3 to
2.9
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.