Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid string out of bounds error in snapshot delete #12337

Merged

Conversation

andrross
Copy link
Member

Test failure #8771 shows cases where certain random seeds trigger this case. The bug is clear: the substring() call should happen after the startsWith() check in case the blob name is shorter than the prefix length being used as the start index of the substring call. I don't yet know if/how this manifests in real deployments.

This bug is deterministically reproducible with the following seed, which also verifies the fix:

./gradlew ':server:test' --tests "org.opensearch.snapshots.SnapshotResiliencyTests.testConcurrentSnapshotDeleteAndDeleteIndex" -Dtests.seed=E816C7242330BF50

Related Issues

Resolves #8771

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@andrross
Copy link
Member Author

FYI @sohami @harishbhakuni @kasundra07

Copy link
Contributor

❌ Gradle check result for 977e69e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❕ Gradle check result for 20aaa8d: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link

codecov bot commented Feb 15, 2024

Codecov Report

Attention: 25 lines in your changes are missing coverage. Please review.

Comparison is base (f73c82a) 71.35% compared to head (09778ad) 71.44%.
Report is 8 commits behind head on main.

Files Patch % Lines
...ch/repositories/blobstore/BlobStoreRepository.java 3.84% 24 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #12337      +/-   ##
============================================
+ Coverage     71.35%   71.44%   +0.09%     
+ Complexity    59818    59801      -17     
============================================
  Files          4959     4959              
  Lines        281129   281125       -4     
  Branches      40857    40854       -3     
============================================
+ Hits         200590   200855     +265     
+ Misses        63908    63602     -306     
- Partials      16631    16668      +37     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@andrross andrross force-pushed the shallow-snapshot-substring-error branch from d05e5fa to a3e7376 Compare February 15, 2024 19:16
Copy link
Contributor

❌ Gradle check result for d05e5fa: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Andrew Ross <andrross@amazon.com>
@andrross andrross force-pushed the shallow-snapshot-substring-error branch from a3e7376 to 09778ad Compare February 15, 2024 20:22
Copy link
Contributor

❌ Gradle check result for a3e7376: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 09778ad: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@andrross
Copy link
Member Author

❌ Gradle check result for 09778ad: FAILURE

#9208
#9191
#10673

Copy link
Contributor

❌ Gradle check result for 09778ad: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@andrross
Copy link
Member Author

❌ Gradle check result for 09778ad: FAILURE

#9208
#9191

Copy link
Contributor

❌ Gradle check result for 09778ad:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@andrross andrross added the backport 2.x Backport to 2.x branch label Feb 18, 2024
Copy link
Contributor

❕ Gradle check result for 09778ad: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteIndexPrimaryRelocationIT.testPrimaryRelocationWhileIndexing

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@andrross andrross merged commit f86bfa8 into opensearch-project:main Feb 19, 2024
34 of 35 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Feb 19, 2024
* Avoid string out of bounds error in snapshot delete

Test failure #8771 shows cases where certain random seeds trigger this
case. The bug is clear: the substring() call should happen after the
startsWith() check in case the blob name is shorter than the prefix
length being used as the start index of the substring call. I don't yet
know if/how this manifests in real deployments.

Signed-off-by: Andrew Ross <andrross@amazon.com>

* Extract common UUID parsing method

Signed-off-by: Andrew Ross <andrross@amazon.com>

---------

Signed-off-by: Andrew Ross <andrross@amazon.com>
(cherry picked from commit f86bfa8)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@andrross andrross deleted the shallow-snapshot-substring-error branch February 19, 2024 16:14
andrross pushed a commit that referenced this pull request Feb 22, 2024
* Avoid string out of bounds error in snapshot delete

Test failure #8771 shows cases where certain random seeds trigger this
case. The bug is clear: the substring() call should happen after the
startsWith() check in case the blob name is shorter than the prefix
length being used as the start index of the substring call. I don't yet
know if/how this manifests in real deployments.

Signed-off-by: Andrew Ross <andrross@amazon.com>

* Extract common UUID parsing method

Signed-off-by: Andrew Ross <andrross@amazon.com>

---------

Signed-off-by: Andrew Ross <andrross@amazon.com>
(cherry picked from commit f86bfa8)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Mar 1, 2024
…ct#12337)

* Avoid string out of bounds error in snapshot delete

Test failure opensearch-project#8771 shows cases where certain random seeds trigger this
case. The bug is clear: the substring() call should happen after the
startsWith() check in case the blob name is shorter than the prefix
length being used as the start index of the substring call. I don't yet
know if/how this manifests in real deployments.

Signed-off-by: Andrew Ross <andrross@amazon.com>

* Extract common UUID parsing method

Signed-off-by: Andrew Ross <andrross@amazon.com>

---------

Signed-off-by: Andrew Ross <andrross@amazon.com>
andrross pushed a commit that referenced this pull request Mar 14, 2024
* Avoid string out of bounds error in snapshot delete

Test failure #8771 shows cases where certain random seeds trigger this
case. The bug is clear: the substring() call should happen after the
startsWith() check in case the blob name is shorter than the prefix
length being used as the start index of the substring call. I don't yet
know if/how this manifests in real deployments.



* Extract common UUID parsing method



---------


(cherry picked from commit f86bfa8)

Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
…ct#12337)

* Avoid string out of bounds error in snapshot delete

Test failure opensearch-project#8771 shows cases where certain random seeds trigger this
case. The bug is clear: the substring() call should happen after the
startsWith() check in case the blob name is shorter than the prefix
length being used as the start index of the substring call. I don't yet
know if/how this manifests in real deployments.

Signed-off-by: Andrew Ross <andrross@amazon.com>

* Extract common UUID parsing method

Signed-off-by: Andrew Ross <andrross@amazon.com>

---------

Signed-off-by: Andrew Ross <andrross@amazon.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…ct#12337)

* Avoid string out of bounds error in snapshot delete

Test failure opensearch-project#8771 shows cases where certain random seeds trigger this
case. The bug is clear: the substring() call should happen after the
startsWith() check in case the blob name is shorter than the prefix
length being used as the start index of the substring call. I don't yet
know if/how this manifests in real deployments.

Signed-off-by: Andrew Ross <andrross@amazon.com>

* Extract common UUID parsing method

Signed-off-by: Andrew Ross <andrross@amazon.com>

---------

Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch bug Something isn't working flaky-test Random test failure that succeeds on second run skip-changelog Storage:Snapshots
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

[Flaky Test] SnapshotResiliencyTests.testConcurrentSnapshotDeleteAndDeleteIndex if flaky
6 participants