[Enhancement] Added a metric for geo replication for tracking replicated subscriptions snapshot timeouts #22381
@nikam14 Please add the following content to your PR description and select a checkbox:
@dao-jun we don't have Otel in use yet. Yes, we can handle this in a proposal.
LGTM. I think the lack of this metric could be considered a significant problem in replicated subscription observability, and the metric should be added to the LTS version.
...main/java/org/apache/pulsar/broker/service/persistent/ReplicatedSubscriptionsController.java
Thanks for the contribution, @nikam14!
Could you please add a unit test to guard against regressions?
...main/java/org/apache/pulsar/broker/service/persistent/ReplicatedSubscriptionsController.java
FYI @dragosvictor
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #22381 +/- ##
============================================
+ Coverage 73.57% 74.33% +0.75%
- Complexity 32624 34963 +2339
============================================
Files 1877 1952 +75
Lines 139502 147139 +7637
Branches 15299 16197 +898
============================================
+ Hits 102638 109369 +6731
- Misses 28908 29335 +427
- Partials 7956 8435 +479
…ed subscriptions snapshot timeouts (apache#22381) Co-authored-by: Lari Hotari <lhotari@apache.org>
Fixes #21793
Motivation
Geo replication replicated subscriptions (PIP-33) snapshot creation might time out.
The code contains a debug log message when this happens.
After a timeout, the subscription state is not reflected on the remote side and a backlog builds up there.
There's no metric to detect this situation.
Modifications
Add a new metric, pulsar_replicated_subscriptions_snapshot_timeouts, which is a counter that resets only when the broker restarts.
Verifying this change
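To make the counter semantics described under Modifications concrete, here is a minimal sketch of how a timeout counter might be maintained and checked. It is not the code from this PR: the class SnapshotTimeoutTracker, its methods, and the timeout value are hypothetical and chosen only for illustration. The sketch assumes pending snapshots are tracked by start time and swept periodically, and that each snapshot dropped by the sweep bumps a monotonically increasing counter.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for snapshot bookkeeping; not Pulsar's actual controller class.
class SnapshotTimeoutTracker {
    private final long timeoutMillis;
    // Snapshot id -> start time in milliseconds.
    private final Map<String, Long> pendingSnapshots = new LinkedHashMap<>();
    // Only ever incremented, so it behaves like a counter that resets on process restart.
    private final LongAdder timeouts = new LongAdder();

    SnapshotTimeoutTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    synchronized void snapshotStarted(String id, long nowMillis) {
        pendingSnapshots.put(id, nowMillis);
    }

    synchronized void snapshotCompleted(String id) {
        pendingSnapshots.remove(id);
    }

    // Drops every pending snapshot older than the timeout and counts each one.
    synchronized void cleanupTimedOutSnapshots(long nowMillis) {
        Iterator<Map.Entry<String, Long>> it = pendingSnapshots.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                it.remove();
                timeouts.increment();
            }
        }
    }

    // Value that a metrics exporter could publish as the timeout counter.
    long snapshotTimeouts() {
        return timeouts.sum();
    }
}
```

A unit test in this style would start a snapshot, advance the clock past the timeout, run the sweep, and assert that the counter increased by exactly one while completed snapshots leave it unchanged.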
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: