backend, router: fix errors when closing and redirecting concurrently #72

Merged 2 commits on Sep 2, 2022

Collaborator

@djshow832 djshow832 commented Sep 1, 2022

What problem does this PR solve?

Issue Number: close #71

Problem Summary:
These things may happen concurrently:

  • The Router sends a redirection signal to the BackendConnMgr
  • The BackendConnMgr closes due to client or backend disconnection.

There are 2 problems:

  • BackendConnMgr.Redirect() panics because it assumes that the BackendConnMgr is not closed.
  • Router.OnConnClosed() cannot find the connection by the backend address, because the address passed in by BackendConnMgr is the old one, not the new one (the one just passed to BackendConnMgr.Redirect()).
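
The second problem can be reproduced with a toy model. This is only a sketch: `toyRouter`, `startRedirect`, `onConnClosed`, and the addresses are hypothetical names, and it assumes the Router files the connection under the new address as soon as it sends the redirection signal.

```go
package main

import (
	"errors"
	"fmt"
)

// toyRouter is a hypothetical stand-in for the Router: it groups
// connection IDs by the backend address it believes each one uses.
type toyRouter struct {
	conns map[string]map[uint64]struct{}
}

var errConnNotFound = errors.New("connection not found under this backend address")

// startRedirect models the assumption that the Router moves its
// bookkeeping to the target address when it sends the signal.
func (r *toyRouter) startRedirect(connID uint64, from, to string) {
	delete(r.conns[from], connID)
	if r.conns[to] == nil {
		r.conns[to] = make(map[uint64]struct{})
	}
	r.conns[to][connID] = struct{}{}
}

// onConnClosed removes the connection, or reports a miss when the
// caller passes an address the Router no longer associates with it.
func (r *toyRouter) onConnClosed(addr string, connID uint64) error {
	if _, ok := r.conns[addr][connID]; !ok {
		return errConnNotFound
	}
	delete(r.conns[addr], connID)
	return nil
}

func main() {
	r := &toyRouter{conns: map[string]map[uint64]struct{}{"tidb-0:4000": {1: {}}}}
	r.startRedirect(1, "tidb-0:4000", "tidb-1:4000")
	fmt.Println(r.onConnClosed("tidb-0:4000", 1)) // stale address: lookup misses
	fmt.Println(r.onConnClosed("tidb-1:4000", 1)) // redirecting address: lookup hits
}
```

Closing with the stale address misses, while closing with the redirecting address succeeds.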

What is changed and how it works:

  • Fix the problems by getting the latest address from GetRedirectingAddr().
  • Make BackendConnMgr notify the Router asynchronously to:
    • Reduce the latency of session migration. The notification may wait for Router locks. The client doesn't need to wait.
    • Avoid the risk of deadlock. OnRedirectXXX is called while holding BackendConnMgr.processLock. If it also waits for Router.lock, a future change could easily introduce a deadlock.
  • Clear BackendConnMgr.signal even if redirection fails, to avoid an endless redirection loop.
  • Make BackendConnMgr more thread-safe, including making eventReceiver atomic and adding wg.
  • Rename some structs.
  • Add comments and logs to Router.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Shut down one of the TiDB instances immediately, without a graceful shutdown. The Router sends redirection commands because it detects that the instance is unhealthy, while the connection also closes because the backend has failed.

Notable changes

  • Has configuration change
  • Has HTTP API interfaces change (Don't forget to add the declarative for API)
  • Has weirctl change
  • Other user behavior changes

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@djshow832 djshow832 requested a review from xhebox September 1, 2022 05:50
@xhebox xhebox merged commit 8d8a09e into pingcap:main Sep 2, 2022
@djshow832 djshow832 deleted the fix_close_redirect branch September 2, 2022 09:43
Development

Successfully merging this pull request may close these issues.

Panic happens when the backend server shuts down immedicately