Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify and document region life cycle #227

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

aaronbee
Copy link
Collaborator

  • Removed RegionInfo.SetClient. The client is now set to nil by MarkUnavailable, and then set again with MarkAvailable. This removes bugs and races where a region info may have its Client set when it shouldn't be. A region now has 3 distinct states it may be in:
  1. Unavailable: AvailabilityChan() is non-nil and Client() is nil
  2. Available: AvailabilityChan() is nil and Client() is non-nil
  3. Stale: AvailabilityChan() is nil and Client() is nil
  • Documented RegionInfo type with information about those 3 states and the lifecycle.

  • Documented establishRegion with what it does and what it's requirements are.

  • Removed calls to SetClient(nil)/MarkUnavailable in clientRegionCache methods. That work is the responsibility of the caller.

  • getRegionAndClientForRPC no longer marks a region as unavailable if Client is nil after waiting on the AvailabilityChan. If Client is nil, that indicates the region is stale and a lookup has to be performed. Marking it as unavailable and trying to reestablish it is wasted effort.

  • The above change broke the use of the meta and admin regions, which are created when creating the gohbase Client/AdminClient, but don't have a Client set. These regions were relying on a missing Client to cause the region to be established. Now the meta and admin regions are established as part of NewClient and NewAdminClient, respectively.

closes: #186

* Removed RegionInfo.SetClient. The client is now set to nil by
MarkUnavailable, and then set again with MarkAvailable. This removes
bugs and races where a region info may have its Client set when it
shouldn't be. A region now has 3 distinct states it may be in:

1. Unavailable: AvailabilityChan() is non-nil and Client() is nil
2. Available: AvailabilityChan() is nil and Client() is non-nil
3. Stale: AvailabilityChan() is nil and Client() is nil

* Documented RegionInfo type with information about those 3 states and
the lifecycle.

* Documented establishRegion with what it does and what it's
requirements are.

* Removed calls to SetClient(nil)/MarkUnavailable in clientRegionCache
methods. That work is the responsibility of the caller.

* getRegionAndClientForRPC no longer marks a region as unavailable if
Client is nil after waiting on the AvailabilityChan. If Client is nil,
that indicates the region is stale and a lookup has to be performed.
Marking it as unavailable and trying to reestablish it is wasted
effort.

* The above change broke the use of the meta and admin regions, which
are created when creating the gohbase Client/AdminClient, but don't
have a Client set. These regions were relying on a missing Client to
cause the region to be established. Now the meta and admin regions are
established as part of NewClient and NewAdminClient, respectively.
@codecov-commenter
Copy link

codecov-commenter commented Jul 29, 2023

Codecov Report

Patch coverage: 53.12% and project coverage change: +0.03% 🎉

Comparison is base (4bda353) 70.24% compared to head (288ab9b) 70.27%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #227      +/-   ##
==========================================
+ Coverage   70.24%   70.27%   +0.03%     
==========================================
  Files          27       27              
  Lines        3714     3701      -13     
==========================================
- Hits         2609     2601       -8     
+ Misses        988      986       -2     
+ Partials      117      114       -3     
Files Changed Coverage Δ
hrpc/call.go 70.00% <ø> (ø)
region/info.go 66.66% <0.00%> (-0.60%) ⬇️
rpc.go 85.54% <36.84%> (+1.02%) ⬆️
admin_client.go 53.57% <100.00%> (+1.21%) ⬆️
caches.go 85.58% <100.00%> (-0.27%) ⬇️
client.go 69.27% <100.00%> (+0.65%) ⬆️

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@aaronbee
Copy link
Collaborator Author

aaronbee commented Aug 1, 2023

This change still needs some work.

@aaronbee aaronbee marked this pull request as draft August 15, 2023 22:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

gohbase stuck retrying RPCs after region server temporarily down
2 participants