-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix][ml] Topic load timeout due to ml data ledger future never finishes #23772
[fix][ml] Topic load timeout due to ml data ledger future never finishes #23772
Conversation
/pulsarbot rerun-failure-checks |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #23772 +/- ##
============================================
+ Coverage 73.57% 74.19% +0.62%
+ Complexity 32624 32159 -465
============================================
Files 1877 1853 -24
Lines 139502 143373 +3871
Branches 15299 16276 +977
============================================
+ Hits 102638 106382 +3744
+ Misses 28908 28624 -284
- Partials 7956 8367 +411
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks you should fix the checkAndCompleteLedgerOpTask()
:
protected boolean checkAndCompleteLedgerOpTask(int rc, LedgerHandle lh, Object ctx) {
if (ctx instanceof CompletableFuture) {
// ledger-creation is already timed out and callback is already completed so, delete this ledger and return.
if (((CompletableFuture) ctx).complete(lh)) {
return false;
} else {
if (rc == BKException.Code.OK) {
log.warn("[{}]-{} ledger creation timed-out, deleting ledger", this.name, lh.getId());
asyncDeleteLedger(lh.getId(), DEFAULT_LEDGER_DELETE_RETRIES);
return true; // changed
}
return false; // changed
}
}
return false;
}
testmocks/src/main/java/org/apache/bookkeeper/client/PulsarMockBookKeeper.java
Outdated
Show resolved
Hide resolved
+1 |
It seems no need to change here, the following steps will handle the case of |
…hes (#23772) ### Motivation **Background** There is a mechanism that repeatedly prevents the callback of ML data ledger creation: - Start a scheduled task to check whether the creation will be timeout. - Received a callback - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: it means the creation has timeout before the creation is completed - Otherwise: it is a real callback from BK. **Issue:** But the timeout event will call the same callback as above, then the steps are as follows, which you ca reproduce by the test `testCreateDataLedgerTimeout`: - Start creating a data ledger - Call `BK.createAsync` - Timeout - Mark the future(`@param ctx` of `BK.createAsync`) as completed exceptionally. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Creation is compelled. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Issue: The callback for ledger creation will never be called. ![Screenshot 2024-12-24 at 00 14 38](https://github.com/user-attachments/assets/44ed19d2-7238-45a4-9186-c127f6ed14f7) ![Screenshot 2024-12-24 at 00 14 08](https://github.com/user-attachments/assets/349f39ff-7e98-4a09-9af2-f80082339592) ### Modifications Fix the issue ### Documentation <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. --> - [ ] `doc` <!-- Your PR contains doc changes. --> - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later --> - [x] `doc-not-needed` <!-- Your PR changes do not impact docs --> - [ ] `doc-complete` <!-- Docs have been already added --> ### Matching PR in forked repository PR in forked repository: x (cherry picked from commit 9699dc2)
…hes (#23772) ### Motivation **Background** There is a mechanism that repeatedly prevents the callback of ML data ledger creation: - Start a scheduled task to check whether the creation will be timeout. - Received a callback - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: it means the creation has timeout before the creation is completed - Otherwise: it is a real callback from BK. **Issue:** But the timeout event will call the same callback as above, then the steps are as follows, which you ca reproduce by the test `testCreateDataLedgerTimeout`: - Start creating a data ledger - Call `BK.createAsync` - Timeout - Mark the future(`@param ctx` of `BK.createAsync`) as completed exceptionally. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Creation is compelled. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Issue: The callback for ledger creation will never be called. ![Screenshot 2024-12-24 at 00 14 38](https://github.com/user-attachments/assets/44ed19d2-7238-45a4-9186-c127f6ed14f7) ![Screenshot 2024-12-24 at 00 14 08](https://github.com/user-attachments/assets/349f39ff-7e98-4a09-9af2-f80082339592) ### Modifications Fix the issue ### Documentation <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. --> - [ ] `doc` <!-- Your PR contains doc changes. --> - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later --> - [x] `doc-not-needed` <!-- Your PR changes do not impact docs --> - [ ] `doc-complete` <!-- Docs have been already added --> ### Matching PR in forked repository PR in forked repository: x (cherry picked from commit 9699dc2)
…hes (#23772) ### Motivation **Background** There is a mechanism that repeatedly prevents the callback of ML data ledger creation: - Start a scheduled task to check whether the creation will be timeout. - Received a callback - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: it means the creation has timeout before the creation is completed - Otherwise: it is a real callback from BK. **Issue:** But the timeout event will call the same callback as above, then the steps are as follows, which you ca reproduce by the test `testCreateDataLedgerTimeout`: - Start creating a data ledger - Call `BK.createAsync` - Timeout - Mark the future(`@param ctx` of `BK.createAsync`) as completed exceptionally. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Creation is compelled. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Issue: The callback for ledger creation will never be called. ![Screenshot 2024-12-24 at 00 14 38](https://github.com/user-attachments/assets/44ed19d2-7238-45a4-9186-c127f6ed14f7) ![Screenshot 2024-12-24 at 00 14 08](https://github.com/user-attachments/assets/349f39ff-7e98-4a09-9af2-f80082339592) ### Modifications Fix the issue ### Documentation <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. --> - [ ] `doc` <!-- Your PR contains doc changes. --> - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later --> - [x] `doc-not-needed` <!-- Your PR changes do not impact docs --> - [ ] `doc-complete` <!-- Docs have been already added --> ### Matching PR in forked repository PR in forked repository: x (cherry picked from commit 9699dc2)
…hes (apache#23772) ### Motivation **Background** There is a mechanism that repeatedly prevents the callback of ML data ledger creation: - Start a scheduled task to check whether the creation will be timeout. - Received a callback - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: it means the creation has timeout before the creation is completed - Otherwise: it is a real callback from BK. **Issue:** But the timeout event will call the same callback as above, then the steps are as follows, which you ca reproduce by the test `testCreateDataLedgerTimeout`: - Start creating a data ledger - Call `BK.createAsync` - Timeout - Mark the future(`@param ctx` of `BK.createAsync`) as completed exceptionally. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Creation is compelled. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Issue: The callback for ledger creation will never be called. ![Screenshot 2024-12-24 at 00 14 38](https://github.com/user-attachments/assets/44ed19d2-7238-45a4-9186-c127f6ed14f7) ![Screenshot 2024-12-24 at 00 14 08](https://github.com/user-attachments/assets/349f39ff-7e98-4a09-9af2-f80082339592) ### Modifications Fix the issue ### Documentation <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. --> - [ ] `doc` <!-- Your PR contains doc changes. --> - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later --> - [x] `doc-not-needed` <!-- Your PR changes do not impact docs --> - [ ] `doc-complete` <!-- Docs have been already added --> ### Matching PR in forked repository PR in forked repository: x (cherry picked from commit 9699dc2) (cherry picked from commit 2189b60)
…hes (apache#23772) ### Motivation **Background** There is a mechanism that repeatedly prevents the callback of ML data ledger creation: - Start a scheduled task to check whether the creation will be timeout. - Received a callback - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: it means the creation has timeout before the creation is completed - Otherwise: it is a real callback from BK. **Issue:** But the timeout event will call the same callback as above, then the steps are as follows, which you ca reproduce by the test `testCreateDataLedgerTimeout`: - Start creating a data ledger - Call `BK.createAsync` - Timeout - Mark the future(`@param ctx` of `BK.createAsync`) as completed exceptionally. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Creation is compelled. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Issue: The callback for ledger creation will never be called. ![Screenshot 2024-12-24 at 00 14 38](https://github.com/user-attachments/assets/44ed19d2-7238-45a4-9186-c127f6ed14f7) ![Screenshot 2024-12-24 at 00 14 08](https://github.com/user-attachments/assets/349f39ff-7e98-4a09-9af2-f80082339592) ### Modifications Fix the issue ### Documentation <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. --> - [ ] `doc` <!-- Your PR contains doc changes. --> - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later --> - [x] `doc-not-needed` <!-- Your PR changes do not impact docs --> - [ ] `doc-complete` <!-- Docs have been already added --> ### Matching PR in forked repository PR in forked repository: x (cherry picked from commit 9699dc2) (cherry picked from commit 2189b60)
…hes (apache#23772) ### Motivation **Background** There is a mechanism that repeatedly prevents the callback of ML data ledger creation: - Start a scheduled task to check whether the creation will be timeout. - Received a callback - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: it means the creation has timeout before the creation is completed - Otherwise: it is a real callback from BK. **Issue:** But the timeout event will call the same callback as above, then the steps are as follows, which you ca reproduce by the test `testCreateDataLedgerTimeout`: - Start creating a data ledger - Call `BK.createAsync` - Timeout - Mark the future(`@param ctx` of `BK.createAsync`) as completed exceptionally. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Creation is compelled. - Trigger the callback related to ledger creation. - Check whether the future(`@param ctx` of `BK.createAsync`) has been done or not. - If done: do nothing. - Issue: The callback for ledger creation will never be called. ![Screenshot 2024-12-24 at 00 14 38](https://github.com/user-attachments/assets/44ed19d2-7238-45a4-9186-c127f6ed14f7) ![Screenshot 2024-12-24 at 00 14 08](https://github.com/user-attachments/assets/349f39ff-7e98-4a09-9af2-f80082339592) ### Modifications Fix the issue ### Documentation <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. --> - [ ] `doc` <!-- Your PR contains doc changes. --> - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later --> - [x] `doc-not-needed` <!-- Your PR changes do not impact docs --> - [ ] `doc-complete` <!-- Docs have been already added --> ### Matching PR in forked repository PR in forked repository: x (cherry picked from commit 9699dc2) (cherry picked from commit 2189b60)
Motivation
Background
There is a mechanism that repeatedly prevents the callback of ML data ledger creation:
@param ctx
ofBK.createAsync
) has been done or not.Issue:
But the timeout event will call the same callback as above, then the steps are as follows, which you ca reproduce by the test
testCreateDataLedgerTimeout
:BK.createAsync
@param ctx
ofBK.createAsync
) as completed exceptionally.@param ctx
ofBK.createAsync
) has been done or not.@param ctx
ofBK.createAsync
) has been done or not.Modifications
Fix the issue
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: x