Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: fix tso service discovery at the first time for NewClientWithAPIContext #6749

Merged

Conversation

binshi-bing
Copy link
Contributor

@binshi-bing binshi-bing commented Jul 5, 2023

What problem does this PR solve?

Issue Number: Close #6748

What is changed and how does it work?

After NewClientWithAPIContextV2 returns, the keyspace group should be discovered by the passed keyspace name immediately

Check List

Tests

  • Manual Test
    For the detailed result of manual test, see comments below.

Release note

None.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jul 5, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • JmPotato
  • rleungx

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot bot added the release-note-none Denotes a PR that doesn't merit a release note. label Jul 5, 2023
@ti-chi-bot ti-chi-bot bot requested review from JmPotato and lhy1024 July 5, 2023 01:29
@ti-chi-bot ti-chi-bot bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 5, 2023
@binshi-bing binshi-bing requested a review from rleungx July 5, 2023 01:32
@codecov
Copy link

codecov bot commented Jul 5, 2023

Codecov Report

Merging #6749 (8944815) into master (d0c03a3) will increase coverage by 0.08%.
The diff coverage is 60.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6749      +/-   ##
==========================================
+ Coverage   74.26%   74.34%   +0.08%     
==========================================
  Files         411      411              
  Lines       43002    43013      +11     
==========================================
+ Hits        31934    31977      +43     
+ Misses       8212     8175      -37     
- Partials     2856     2861       +5     
Flag Coverage Δ
unittests 74.34% <60.00%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

@ti-chi-bot ti-chi-bot bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 5, 2023
@rleungx
Copy link
Member

rleungx commented Jul 5, 2023

BTW, how about printing the keyspace ID when the mode changes?

@JmPotato JmPotato changed the title Fix tso service discovery at the first time for NewClientWithAPIContext client: fix tso service discovery at the first time for NewClientWithAPIContext Jul 5, 2023
@ti-chi-bot ti-chi-bot bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 5, 2023
Signed-off-by: Bin Shi <binshi.bing@gmail.com>
@binshi-bing binshi-bing force-pushed the bin/fix-tso-first-time-discovery branch from da81f2d to 8944815 Compare July 5, 2023 02:47
@binshi-bing
Copy link
Contributor Author

/merge

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jul 5, 2023

@binshi-bing: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jul 5, 2023

This pull request has been accepted and is ready to merge.

Commit hash: 8944815

@ti-chi-bot ti-chi-bot bot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 5, 2023
@ti-chi-bot ti-chi-bot bot merged commit 69c6a34 into tikv:master Jul 5, 2023
@binshi-bing
Copy link
Contributor Author

Before this fix, the issue can be consistently reproed in the staging env.

It can be consistently reproed by pd-tso-bench with -keyspace-name option.

root@tso-bench:/# ./pd-tso-bench -v -duration 250000s -pd "http://serverless-cluster-pd-0.serverless-cluster-pd-peer.tidb-serverless.svc:2379/" -client 1 -c 1 -interval 30s -keyspace-name "2oVzWDxP4sSxu9f"

Start benchmark #0, duration: 250000s
Create 1 client(s) for benchmark
[2023/07/04 23:39:38.981 +00:00] [INFO] [pd_service_discovery.go:590] ["[pd] update member urls"] [old-urls="[http://serverless-cluster-pd-0.serverless-cluster-pd-peer.tidb-serverless.svc:2379]"] [new-urls="[http://serverless-cluster-pd-0.serverless-cluster-pd-peer.tidb-serverless.svc:2379,http://serverless-cluster-pd-1.serverless-cluster-pd-peer.tidb-serverless.svc:2379,http://serverless-cluster-pd-2.serverless-cluster-pd-peer.tidb-serverless.svc:2379]"]
[2023/07/04 23:39:38.981 +00:00] [INFO] [pd_service_discovery.go:616] ["[pd] switch leader"] [new-leader=http://serverless-cluster-pd-0.serverless-cluster-pd-peer.tidb-serverless.svc:2379] [old-leader=]
[2023/07/04 23:39:38.981 +00:00] [INFO] [pd_service_discovery.go:193] ["[pd] init cluster id"] [cluster-id=7158110369020695519]
[2023/07/04 23:39:38.981 +00:00] [INFO] [client.go:590] ["[pd] changing service mode"] [old-mode=UNKNOWN_SVC_MODE] [new-mode=API_SVC_MODE]
[2023/07/04 23:39:38.981 +00:00] [INFO] [tso_service_discovery.go:185] ["created tso service discovery"] [cluster-id=7158110369020695519] [keyspace-id=0] [default-discovery-key=/ms/7158110369020695519/tso/00000/primary]
[2023/07/04 23:39:38.981 +00:00] [INFO] [tso_service_discovery.go:195] ["initializing tso service discovery"] [max-retry-times=100] [retry-interval=1s]
[2023/07/04 23:39:38.982 +00:00] [INFO] [tso_service_discovery.go:594] ["update tso server addresses"] [addrs="[http://pd-tso-server-0.tso-service.tidb-serverless.svc:2379,http://pd-tso-server-1.tso-service.tidb-serverless.svc:2379]"]
[2023/07/04 23:39:38.985 +00:00] [INFO] [tso_service_discovery.go:503] ["[tso] updated keyspace group service discovery info"] [keyspace-id-in-request=0] [tso-server-addr=http://pd-tso-server-0.tso-service.tidb-serverless.svc:2379] [keyspace-group-service="user_kind:"basic" members:<address:"http://pd-tso-server-0.tso-service.tidb-serverless.svc:2379" is_primary:true > members:<address:"http://pd-tso-server-1.tso-service.tidb-serverless.svc:2379" > "]
[2023/07/04 23:39:38.985 +00:00] [INFO] [tso_client.go:230] ["[tso] switch dc tso global allocator serving address"] [dc-location=global] [new-address=http://pd-tso-server-0.tso-service.tidb-serverless.svc:2379]
[2023/07/04 23:39:38.985 +00:00] [INFO] [tso_service_discovery.go:413] ["[tso] switch primary"] [new-primary=http://pd-tso-server-0.tso-service.tidb-serverless.svc:2379] [old-primary=]
[2023/07/04 23:39:38.994 +00:00] [FATAL] [main.go:120] ["get first time tso failed"] [client-number=0] [error="rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster"] [errorVerbose="rpc error: code = Unknown desc = [PD:tso:ErrGenerateTimestamp]generate timestamp failed, requested pd is not leader of cluster\ngithub.com/tikv/pd/client.(*tsoTSOStream).processRequests\n\t/go/src/github.com/tikv/pd/client/tso_stream.go:204\ngithub.com/tikv/pd/client.(*tsoClient).processRequests\n\t/go/src/github.com/tikv/pd/client/tso_dispatcher.go:716\ngithub.com/tikv/pd/client.(*tsoClient).handleDispatcher\n\t/go/src/github.com/tikv/pd/client/tso_dispatcher.go:449\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_arm64.s:1172\ngithub.com/tikv/pd/client.(*tsoRequest).Wait\n\t/go/src/github.com/tikv/pd/client/tso_dispatcher.go:98\ngithub.com/tikv/pd/client.(*client).GetLocalTS\n\t/go/src/github.com/tikv/pd/client/client.go:843\nmain.bench\n\t/go/src/github.com/tikv/pd/tools/pd-tso-bench/main.go:118\nmain.main\n\t/go/src/github.com/tikv/pd/tools/pd-tso-bench/main.go:97\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_arm64.s:1172"] [stack="main.bench\n\t/go/src/github.com/tikv/pd/tools/pd-tso-bench/main.go:120\nmain.main\n\t/go/src/github.com/tikv/pd/tools/pd-tso-bench/main.go:97\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"]

@binshi-bing
Copy link
Contributor Author

After the fix, the issue is gone and the client log is expected.

root@tso-bench:/# ./pd-tso-bench -v -duration 250000s -pd "http://serverless-cluster-pd-0.serverless-cluster-pd-peer.tidb-serverless.svc:2379/" -client 1 -c 1 -interval 10s -keyspace-name "2oVzWDxP4sSxu9f"

Start benchmark #0, duration: 250000s
Create 1 client(s) for benchmark
[2023/07/05 03:16:47.148 +00:00] [INFO] [pd_service_discovery.go:600] ["[pd] update member urls"] [old-urls="[http://serverless-cluster-pd-0.serverless-cluster-pd-peer.tidb-serverless.svc:2379]"] [new-urls="[http://serverless-cluster-pd-0.serverless-cluster-pd-peer.tidb-serverless.svc:2379,http://serverless-cluster-pd-1.serverless-cluster-pd-peer.tidb-serverless.svc:2379,http://serverless-cluster-pd-2.serverless-cluster-pd-peer.tidb-serverless.svc:2379]"]
[2023/07/05 03:16:47.148 +00:00] [INFO] [pd_service_discovery.go:626] ["[pd] switch leader"] [new-leader=http://serverless-cluster-pd-0.serverless-cluster-pd-peer.tidb-serverless.svc:2379] [old-leader=]
[2023/07/05 03:16:47.148 +00:00] [INFO] [pd_service_discovery.go:195] ["[pd] init cluster id"] [cluster-id=7158110369020695519]
[2023/07/05 03:16:47.152 +00:00] [INFO] [client.go:586] ["[pd] changing service mode"] [old-mode=UNKNOWN_SVC_MODE] [new-mode=API_SVC_MODE]
[2023/07/05 03:16:47.152 +00:00] [INFO] [tso_service_discovery.go:185] ["created tso service discovery"] [cluster-id=7158110369020695519] [keyspace-id=18076] [default-discovery-key=/ms/7158110369020695519/tso/00000/primary]
[2023/07/05 03:16:47.152 +00:00] [INFO] [tso_service_discovery.go:195] ["initializing tso service discovery"] [max-retry-times=100] [retry-interval=1s]
[2023/07/05 03:16:47.152 +00:00] [INFO] [tso_service_discovery.go:589] ["update tso server addresses"] [addrs="[http://pd-tso-server-0.tso-service.tidb-serverless.svc:2379,http://pd-tso-server-1.tso-service.tidb-serverless.svc:2379]"]
[2023/07/05 03:16:47.155 +00:00] [INFO] [tso_service_discovery.go:498] ["[tso] updated keyspace group service discovery info"] [keyspace-id-in-request=18076] [tso-server-addr=http://pd-tso-server-0.tso-service.tidb-serverless.svc:2379] [keyspace-group-service="id:2 user_kind:"basic" members:<address:"http://pd-tso-server-0.tso-service.tidb-serverless.svc:2379" > members:<address:"http://pd-tso-server-1.tso-service.tidb-serverless.svc:2379" is_primary:true > "]
[2023/07/05 03:16:47.155 +00:00] [INFO] [tso_client.go:230] ["[tso] switch dc tso global allocator serving address"] [dc-location=global] [new-address=http://pd-tso-server-1.tso-service.tidb-serverless.svc:2379]
[2023/07/05 03:16:47.155 +00:00] [INFO] [tso_service_discovery.go:408] ["[tso] switch primary"] [new-primary=http://pd-tso-server-1.tso-service.tidb-serverless.svc:2379] [old-primary=]
[2023/07/05 03:16:47.158 +00:00] [INFO] [tso_dispatcher.go:295] ["[tso] tso dispatcher created"] [dc-location=global]
[2023/07/05 03:16:47.158 +00:00] [INFO] [client.go:634] ["[pd] service mode changed"] [old-mode=UNKNOWN_SVC_MODE] [new-mode=API_SVC_MODE]
[2023/07/05 03:16:47.158 +00:00] [INFO] [client.go:511] ["[pd] create pd client with endpoints and keyspace"] [pd-address="[http://serverless-cluster-pd-0.serverless-cluster-pd-peer.tidb-serverless.svc:2379]"] [keyspace-name=2oVzWDxP4sSxu9f] [keyspace-id=18076]

count: 30787, max: 20.4116ms, min: 0.2205ms, avg: 0.3239ms
<1ms: 30747, >1ms: 22, >2ms: 13, >5ms: 3, >10ms: 2, >30ms: 0, >50ms: 0, >100ms: 0, >200ms: 0, >400ms: 0, >800ms: 0, >1s: 0

ti-chi-bot bot pushed a commit that referenced this pull request Jul 5, 2023
ref #6747, ref #6748, ref #6749

Signed-off-by: Ryan Leung <rleungx@gmail.com>
rleungx pushed a commit to rleungx/pd that referenced this pull request Aug 2, 2023
…APIContext (tikv#6749)

close tikv#6748

After NewClientWithAPIContextV2 returns, the keyspace group should be discovered by the passed keyspace name immediately

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
rleungx added a commit to rleungx/pd that referenced this pull request Aug 2, 2023
rleungx pushed a commit to rleungx/pd that referenced this pull request Aug 2, 2023
…APIContext (tikv#6749)

close tikv#6748

After NewClientWithAPIContextV2 returns, the keyspace group should be discovered by the passed keyspace name immediately

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
rleungx pushed a commit to rleungx/pd that referenced this pull request Aug 2, 2023
…APIContext (tikv#6749)

close tikv#6748

After NewClientWithAPIContextV2 returns, the keyspace group should be discovered by the passed keyspace name immediately

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
rleungx added a commit to rleungx/pd that referenced this pull request Aug 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
4 participants