tikv: fix panic when recycle non-superbatch idle conn #16299

Merged
merged 2 commits into from
Apr 12, 2020

lysu
Contributor

@lysu lysu commented Apr 10, 2020

What problem does this PR solve?

Problem Summary:

TiDB uses superbatch connections to TiKV by default, but "cluster table" queries establish non-superbatch connections between TiDB instances:

enableBatch := req.StoreTp != kv.TiDB

Because superbatch uses streaming, it has logic to recycle idle connections and avoid goroutine leaks.

However, this idle-connection recycling logic does not work when the conns map mixes superbatch and non-superbatch connections, and it hits a segmentation fault (nil pointer dereference):

[2020/04/10 14:56:42.816 +08:00] [ERROR] [coprocessor.go:657] ["copIteratorWork meet panic"] [r="\"invalid memory address or nil pointer dereference\""] ["stack trace"="github.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTask.func1\n\t/home/jenkins/agent/workspace/tidb_v4.0.0-rc/go/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:659\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:679\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:199\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:394\ngithub.com/pingcap/tidb/store/tikv.(*rpcClient).recycleDieConnArray\n\t/home/jenkins/agent/workspace/tidb_v4.0.0-rc/go/src/github.com/pingcap/tidb/store/tikv/client_batch.go:658\ngithub.com/pingcap/tidb/store/tikv.(*rpcClient).recycleDieConnArray\n\t/home/jenkins/agent/workspace/tidb_v4.0.0-rc/go/src/github.com/pingcap/tidb/store/tikv/client_batch.go:658\ngithub.com/pingcap/tidb/store/tikv.(*rpcClient).SendRequest\n\t/home/jenkins/agent/workspace/tidb_v4.0.0-rc/go/src/github.com/pingcap/tidb/store/tikv/client.go:319\ngithub.com/pingcap/tidb/store/tikv.(*RegionRequestSender).sendReqToRegion\n\t/home/jenkins/agent/workspace/tidb_v4.0.0-rc/go/src/github.com/pingcap/tidb/store/tikv/region_request.go:199\ngithub.com/pingcap/tidb/store/tikv.(*RegionRequestSender).SendReqCtx\n\t/home/jenkins/agent/workspace/tidb_v4.0.0-rc/go/src/github.com/pingcap/tidb/store/tikv/region_request.go:162\ngithub.com/pingcap/tidb/store/tikv.(*clientHelper).SendReqCtx\n\t/home/jenkins/agent/workspace/tidb_v4.0.0-rc/go/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:814\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTaskOnce\n\t/home/jenkins/agent/workspace/tidb_v4.0.0-rc/go/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:729\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTask\n\t/home/jenkins/agent/workspace/tidb_v4.0.0-rc/go/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:667\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).run\n\t/home/jenkins/agent/workspace/tidb_v4.0.0-rc/go/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:489"]
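
The failure mode can be illustrated with a simplified model. This is only a sketch, not the TiDB code: `connArray`, `batchConn`, and `probeIdle` are hypothetical stand-ins, and it assumes a non-superbatch connection leaves its batch state nil, so an idle check that reads through that state dereferences a nil pointer.

```go
package main

import "fmt"

// batchConn is a hypothetical stand-in for the superbatch state of a
// connection; the real struct in store/tikv carries far more.
type batchConn struct {
	idle bool
}

// isIdle reads a field through the receiver, so a nil receiver panics.
func (b *batchConn) isIdle() bool {
	return b.idle
}

// connArray embeds *batchConn. In this model the pointer stays nil when
// batching is disabled (the non-superbatch case).
type connArray struct {
	target string
	*batchConn
}

// probeIdle calls the promoted isIdle method and turns a panic into an
// error so both cases can be observed without crashing the program.
func probeIdle(c *connArray) (idle bool, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic on %s: %v", c.target, r)
		}
	}()
	return c.isIdle(), nil
}

func main() {
	superbatch := &connArray{target: "tikv-1", batchConn: &batchConn{idle: true}}
	plain := &connArray{target: "tidb-2"} // batchConn is nil

	fmt.Println(probeIdle(superbatch))
	fmt.Println(probeIdle(plain))
}
```

In this model the superbatch connection reports its idle state normally, while the plain connection produces a recovered runtime error, which mirrors the nil pointer dereference reported in the log.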

What is changed and how it works?

What's Changed:

Only recycle superbatch connections.

How it Works:

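The idle-connection recycling logic skips non-superbatch connections, so only superbatch connections are considered for recycling.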

Related changes

  • Need to cherry-pick to the release branch

Check List

Tests

  • Manual test (add detailed scripts or steps below)
Set up a TiDB cluster with multiple TiDB instances, remove a TiKV store, then keep applying load to TiDB to trigger the connection-recycling logic.

Side effects

  • n/a

Release note

  • tikv: fix a panic when recycling non-superbatch idle connections between TiDB instances


@lysu
Contributor Author

lysu commented Apr 10, 2020

/run-all-tests

@codecov

codecov bot commented Apr 10, 2020

Codecov Report

Merging #16299 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #16299   +/-   ##
===========================================
  Coverage   80.4598%   80.4598%           
===========================================
  Files           506        506           
  Lines        136135     136135           
===========================================
  Hits         109534     109534           
  Misses        18106      18106           
  Partials       8495       8495           

Member

@ngaut ngaut left a comment

LGTM

Member

@zz-jason zz-jason left a comment

LGTM

@zz-jason zz-jason added status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. and removed status/PTAL labels Apr 11, 2020
@sre-bot
Contributor

sre-bot commented Apr 11, 2020

/run-all-tests

@sre-bot
Contributor

sre-bot commented Apr 11, 2020

@lysu merge failed.

@sre-bot
Contributor

sre-bot commented Apr 12, 2020

cherry pick to release-4.0 in PR #16303

Labels
component/tikv, status/can-merge, status/LGT2, type/bugfix
4 participants