fix(storagenode): accept SyncInit sent from trimmed source to new destination #470
Codecov Report
@@ Coverage Diff @@
## main #470 +/- ##
==========================================
+ Coverage 62.30% 62.35% +0.04%
==========================================
Files 133 133
Lines 18415 18434 +19
==========================================
+ Hits 11473 11494 +21
- Misses 6363 6369 +6
+ Partials 579 571 -8
…tination

Storage nodes can be trimmed and synchronized. However, there is some bug in that a new destination replica joined into the log stream rejects SyncInit RPC sent from the trimmed source replica. Those replicas are all empty and have no log entries; however, the source replica has a commit context indicating the last committed LLSN. In this situation, the destination replica must accept SyncInit to receive the commit context from the source replica, but it does not.

This PR fixes the above issue. To solve the problem, it changes the condition that the destination replica decides whether they are already synchronized.

```go
// Previous code: https://github.com/kakao/varlog/blob/5269481c0e80c2eebf8214116a2d1544a26cb443/internal/storagenode/logstream/sync.go#L297-L302
//
// NOTE: When the replica has all log entries, it returns its range of logs and non-error results.
// In this case, this replica remains executorStateSealing.
// Breaking change: previously it returns ErrExist when the replica has all log entries to replicate.
if dstLastCommittedLLSN == srcRange.LastLLSN && !invalid {
	return snpb.SyncRange{}, status.Errorf(codes.AlreadyExists, "already synchronized")
}
```

Since both replicas have no log entries, the condition `dstLastCommittedLLSN == srcRange.LastLLSN` is not enough. This PR changed the condition to be `dstLastCommittedLLSN == srcLastCommittedLLSN && dstLastCommittedLLSN == srcRange.LastLLSN`. Since the `srcLastCommittedLLSN` is valid regardless of log entries in the source replica, the destination replica will accept the SyncInit.

Resolve #478
@hungryjang, I added a follow-up commit, 110ef19. It makes the check in SyncInit for whether synchronization is needed more explicit.
What this PR does
Storage nodes can be trimmed and synchronized. However, there is a bug: a new destination
replica that has joined the log stream rejects the SyncInit RPC sent from a trimmed source replica.
Both replicas are empty and have no log entries, but the source replica has a commit context
indicating the last committed LLSN. In this situation, the destination replica must accept SyncInit
to receive the commit context from the source replica, but it does not.
This PR fixes the issue by changing the condition the destination replica uses to decide whether
the replicas are already synchronized.
Since both replicas have no log entries, the condition `dstLastCommittedLLSN == srcRange.LastLLSN`
is not enough. This PR changed the condition to be
`dstLastCommittedLLSN == srcLastCommittedLLSN && dstLastCommittedLLSN == srcRange.LastLLSN`.
Since `srcLastCommittedLLSN` is valid regardless of log entries in the source replica, the
destination replica will accept the SyncInit.
Which issue(s) this PR resolves
Resolves #478