-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[BACKPORT 2024.2.0][#24829] CDC: Fix GetChanges WAL segment reading a…
…nd computation of safe hybrid time for GetChanges resposne Summary: **Backport description:** No merge conflicts encountered. Note - The default value of flag `cdc_read_wal_segment_by_segment` has been again changed back to true as this diff fixes the data loss issue observed with the changes under this flag. **Original description:** Original commit: 103ca31 / D39925 During a CDC QA run, we encountered a data loss scenario caused by an interaction between transaction commit timing and WAL segment rollover. WAL segment 'X' was active when transaction 'T' was still a running transaction. Due to concurrent segment rollover, the transaction's APPLY operation was appended in the subsequent segment 'X+1'. The following sequence of events led to data loss: 1. Initial GetChanges Request: - Requested opid index present in segment 'X' - No CDC related WAL OPs were found in segment 'X' between [requested op id index, last index in 'X'] - `committed_op_id_index` (pointed to last index of segment 'X') lagged behind `majority_replicated_op_id_index`, hence we did not scan segment 'X+1'. - Finally, we returned empty response with incorrect safe hybrid time (=tablet leader safe time) 2. Subsequent GetChanges Request: - Successfully read segment 'X+1' ( as committed_op_id_index had advanced to segment 'X+1') - UPDATE_TXN_OP of transaction 'T' was read but CDC filtered it out as we do not ship WAL OPs having commit_time < previously returned safe hybrid time - Resulted in transaction 'T' never being shipped. Root Cause: We violated a critical invariant: the tablet leader safe time should only be returned in GetChanges responses after reading the entire WAL. But in the 1st Getchanges call, we returned the tablet leader safe time having only read till segment 'X', leading to improper transaction filtering and subsequent data loss. This diff makes the following changes to fix the above mentioned data loss issue: 1. When reading the WAL segment by segment, we will now wait for the `committed_op_id_index` to be >= last index in the segment. If the deadline is reached while waiting, we will return 0 WAL ops and indicate to wait for WAL update. 2. Identification of active segment in GetConsistentWALRecords has been changed. Earlier, we were inferring active segment by the value of `consistent_stream_safe_time_footer`. If it was kInvalid, then we concluded its an active segment. But because of changes in 1), that conclusion no longer holds true. If we reach the deadline while waiting for `committed_op_id_index` to be >= last index in segment, then the value of `consistent_stream_safe_time_footer` will still be invalid. Hence, we are now using a dedicated variable (`read_entire_wal`) that will notify if we actually read the active segment in `ReadReplicatedMessagesInSegmentForCDC()`. 3. Value of `safe_hybrid_time` returned in GetChanges response. We have the following scenarios which will decide the safe_hybrid_time sent in the response: - **Case-1**: Read the active segment -> return the last record's commit_time if valid records are found. Else, return tablet leader safe time (existing logic). - **Case-2**: Deadline reached while waiting for `committed_op_id_index` to be >= last index in the WAL segment. There are 2 ways in which we can reach deadline. For both these ways, we want to wait for WAL to be updated. - Deadline reached after reading multiple segments -> return request safe hybrid time. - Deadline reached on the 1st segment read in GetChanges call -> return request safe hybrid time. - **Case-3:** requested index (from_op_id_index) >= `committed_op_id_index` - If all RAFT replicated Ops are not yet APPLIED (i.e. `committed_op_id_index` != `majority_replicated_op_id_index`), then we indicate to wait for WAL update and return request safe hybrid time. - If all RAFT replicated OPs are also APPLIED (i.e. `committed_op_id_index` == `majority_replicated_op_id_index`), then we have following scenarios: - If the condition in case-3 was hit after reading multiple segments, then return footer value (`min_start_time_running_txns`) of last read segment. - If the condition in case-3 was hit on the 1st segment read in GetChanges call, then return tablet leader safe time as this particular case implies we have read the entire WAL. Additionally, we have added a WARNING log when the invariant for response safe time is broken. Jira: DB-13935 Test Plan: Jenkins Reviewers: skumar, sumukh.phalgaonkar Reviewed By: sumukh.phalgaonkar Subscribers: ycdcxcluster, ybase Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D40325
- Loading branch information
1 parent
dad4e21
commit 60174a3
Showing
7 changed files
with
208 additions
and
59 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.