Fix flaky test testDropPrimaryDuringReplication. #8715

Merged
merged 1 commit into from
Jul 17, 2023

Conversation

mch2
Member

@mch2 mch2 commented Jul 17, 2023

Description

This change fixes the flaky test testDropPrimaryDuringReplication. The test hit an edge case in which, after an updateSegments call on NRTReplicationReaderManager, the reader was not actually refreshed because another refresh call was running concurrently. The fix uses a blocking refresh during updateSegments to ensure that a refresh has happened. It also removes the unnecessary synchronization on the updateSegments method, which avoids a deadlock in which the concurrent refresh picks up the new segments but cannot acquire the object monitor it needs to swap the reference internally in ReferenceManager.swapReference.
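
For readers unfamiliar with the difference between a non-blocking and a blocking refresh, here is a minimal sketch of the edge case. It is not the OpenSearch or Lucene code: `RefreshSketch`, `refreshCount`, and `nonBlockingSkipsWhileBusy` are hypothetical names invented for illustration, and the sketch assumes the refresh is guarded by a lock that the non-blocking path acquires with `tryLock()` and the blocking path acquires with `lock()`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a reference manager; not the real OpenSearch or Lucene class.
class RefreshSketch {
    private final ReentrantLock refreshLock = new ReentrantLock();
    private final AtomicInteger refreshCount = new AtomicInteger();

    // Non-blocking refresh: gives up immediately if another thread is refreshing.
    boolean maybeRefresh() {
        if (!refreshLock.tryLock()) {
            return false; // caller cannot assume a refresh happened
        }
        try {
            refreshCount.incrementAndGet();
            return true;
        } finally {
            refreshLock.unlock();
        }
    }

    // Blocking refresh: waits for the lock, so a refresh has run when this returns.
    void maybeRefreshBlocking() {
        refreshLock.lock();
        try {
            refreshCount.incrementAndGet();
        } finally {
            refreshLock.unlock();
        }
    }

    int refreshCount() {
        return refreshCount.get();
    }

    // Simulates a concurrent refresh holding the lock while the caller tries to refresh.
    static boolean nonBlockingSkipsWhileBusy() {
        RefreshSketch manager = new RefreshSketch();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread concurrentRefresh = new Thread(() -> {
            manager.refreshLock.lock();
            try {
                lockHeld.countDown();
                release.await(); // keep the lock until the main thread has tried
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                manager.refreshLock.unlock();
            }
        });
        concurrentRefresh.start();
        try {
            lockHeld.await();
            boolean refreshed = manager.maybeRefresh(); // skipped: lock is busy
            release.countDown();
            concurrentRefresh.join();
            manager.maybeRefreshBlocking(); // guaranteed to run a refresh
            return !refreshed && manager.refreshCount() == 1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nonBlockingSkipsWhileBusy());
    }
}
```

In the sketch, the non-blocking call returns without refreshing while the other thread holds the lock, which mirrors the skipped refresh described above. The blocking call only returns after its own refresh has run. The sketch deliberately does not hold any additional monitor while waiting for the lock, in line with the removal of the synchronization on updateSegments.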

Related Issues

resolves #8059

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

This change fixes testDropPrimaryDuringReplication and an edge case where, after an updateSegments call on
NRTReplicationReaderManager, the reader is not actually refreshed because of another concurrent refresh call.
It fixes this by using a blocking refresh, and it removes unnecessary synchronization around the updateSegments method to avoid
a deadlock case where the concurrent refresh picks up the new segments but is unable to acquire the object monitor to refresh internally in ReferenceManager.swapReference.

Signed-off-by: Marc Handalian <handalm@amazon.com>
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@mch2
Member Author

mch2 commented Jul 17, 2023

Gradle Check (Jenkins) Run Completed with:

#7643

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.remotestore.RemoteStoreIT.testStaleCommitDeletionWithInvokeFlush

@dreamer-89 dreamer-89 merged commit 7642e43 into opensearch-project:main Jul 17, 2023
10 of 32 checks passed
@dreamer-89 dreamer-89 added the backport 2.x (Backport to 2.x branch) label Jul 17, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 17, 2023
Signed-off-by: Marc Handalian <handalm@amazon.com>
(cherry picked from commit 7642e43)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@mch2 mch2 deleted the 8059 branch July 17, 2023 20:24
mch2 pushed a commit that referenced this pull request Jul 17, 2023
(cherry picked from commit 7642e43)

Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
suranjay pushed a commit to suranjay/OpenSearch that referenced this pull request Jul 18, 2023
…8715)

Signed-off-by: Marc Handalian <handalm@amazon.com>
baba-devv pushed a commit to baba-devv/OpenSearch that referenced this pull request Jul 29, 2023
…8715)

Signed-off-by: Marc Handalian <handalm@amazon.com>
kaushalmahi12 pushed a commit to kaushalmahi12/OpenSearch that referenced this pull request Sep 12, 2023
…8715)

Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: Kaushal Kumar <ravi.kaushal97@gmail.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…8715)

Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
backport 2.x (Backport to 2.x branch), skip-changelog
2 participants