-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return segments in COMMITTING status in the pauseStatus API for pauseless enabled tables #14908
base: master
Are you sure you want to change the base?
Conversation
…less enabled tables
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #14908 +/- ##
============================================
+ Coverage 61.75% 63.72% +1.97%
- Complexity 207 1472 +1265
============================================
Files 2436 2708 +272
Lines 133233 151575 +18342
Branches 20636 23406 +2770
============================================
+ Hits 82274 96595 +14321
- Misses 44911 47719 +2808
- Partials 6048 7261 +1213
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
// instead of relying solely on the ideal state. | ||
// A segment in COMMITTING state is treated as consuming for pauseStatus. | ||
if (PauselessConsumptionUtils.isPauselessEnabled(getTableConfig(tableNameWithType))) { | ||
getLatestSegmentZKMetadataMap(tableNameWithType).values() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should have a null check here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, it is better to simply make change in findConsumingSegments
method. That would help others who are using it as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. This function is being called by forceCommit, pause and resume API's.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should have a null check here
The function getLatestSegmentZKMetadataMap returns an empty map.
Have added the check for future proofing,
.filter(segmentZKMetadata -> !consumingSegments.contains(segmentZKMetadata.getSegmentName())) | ||
.filter(segmentZKMetadata -> segmentZKMetadata.getStatus() == Status.COMMITTING) | ||
.map(SegmentZKMetadata::getSegmentName) | ||
.forEach(consumingSegments::add); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(minor) probably we should add this check in findConsumingSegments
method itself (or override findConsumingSegments).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have addressed this above
18501e3
to
16ed5e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@@ -1974,6 +1975,22 @@ private Set<String> findConsumingSegments(IdealState idealState) { | |||
} | |||
} | |||
}); | |||
// For pauseless tables, a segment marked ONLINE in the ideal state may not have been committed yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we consider adding a new method findCommittingSegments
to return the committing segments for pauseless table? I don't want to mix consuming segment and committing segment because this way it can potentially return 2 segments per partition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have addressed this below: #14908 (comment)
// A segment in COMMITTING state is treated as consuming for pauseStatus. | ||
String tableNameWithType = idealState.getResourceName(); | ||
if (PauselessConsumptionUtils.isPauselessEnabled(getTableConfig(tableNameWithType))) { | ||
Map<Integer, SegmentZKMetadata> metadataMap = getLatestSegmentZKMetadataMap(tableNameWithType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work? Committing segment won't be the latest segment in each partition. I think this will never find committing segment, but always IN_PROGRESS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started with an assumption that the user has issued a pause request
before the status is checked.
I am not sure whether this assumption is correct.
No new consuming segments will be created for the table when the force commit
message is sent as part of the pause table call.
This ensures that the last segment is :
- Marked
CONSUMING
in theIS
as the commit protocol has not started. - Marked
ONLINE
but has statusCOMMITTING
status as the commit protocol has not finished.
I did not want to add a function findCommittingSegments
here for two reasons:
- Correctness: an older
COMMITTING
segment that failed to complete commit protocol will also be returned in the pause status and I feel that's wrong indication of the pause status. The user is only concerned with the latest segment's commit status. Previously this status was derived from IS and for pauseless it derived from ZK metadata. - Performance: Fetching ZK metadata for all the segments is an expensive operation.
/tables/{tableName}/pauseStatus
returns the status of the pause operation on a table.