-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kv: perform local calls in a goroutine #11196
kv: perform local calls in a goroutine #11196
Conversation
I'm going to try this PR on |
13379af
to
88eb869
Compare
Reviewed 1 of 1 files at r1. pkg/kv/transport.go, line 184 at r1 (raw file):
it would be mildly cleaner to share this with the below go routine
Comments from Reviewable |
Review status: 1 of 3 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending. pkg/storage/replica.go, line 1876 at r2 (raw file):
This is actually semi-synchronous as it uses Comments from Reviewable |
7a6dd5c
to
d0bda68
Compare
I'm going to push a build with this second commit to Review status: 1 of 3 files reviewed at latest revision, 2 unresolved discussions. Comments from Reviewable |
Perform local calls in a goroutine so that they can be cancelled and so that they don't block the caller indefinitely. See cockroachdb#10427
d0bda68
to
377e23d
Compare
second commit seems good, but hard to predict what effect it will have! Reviewed 2 of 2 files at r3, 2 of 2 files at r4. pkg/storage/replica.go, line 1669 at r4 (raw file):
this has rotted pkg/storage/replica.go, line 1671 at r4 (raw file):
should this be pkg/storage/replica.go, line 1873 at r4 (raw file):
any reason this doesn't check length? pkg/storage/replica.go, line 4239 at r4 (raw file):
should this be Comments from Reviewable |
2e532e9
to
6e31b79
Compare
Review status: 2 of 5 files reviewed at latest revision, 6 unresolved discussions, some commit checks pending. pkg/kv/transport.go, line 184 at r1 (raw file): Previously, tamird (Tamir Duberstein) wrote…> it would be mildly cleaner to share this with the below go routine > ``` > batchFn := client.client.Batch > if localServer := gt.rpcContext.GetLocalInternalServerForAddr(addr); enableLocalCalls && localServer != nil { > batchFn = localServer.Batch > // Clone the request. At the time of writing, Replica may mutate it > // during command execution which can lead to data races. > // > // TODO(tamird): we should clone all of client.args.Header, but the > // assertions in protoutil.Clone fire and there seems to be no > // reasonable workaround. > origTxn := client.args.Txn > if origTxn != nil { > clonedTxn := origTxn.Clone() > client.args.Txn = &clonedTxn > } > } > go func() { > reply, err := batchFn(..) > ... > }() > ```pkg/storage/replica.go, line 1669 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…> this has rottedpkg/storage/replica.go, line 1671 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…> should this be `result.Local.detachIntents()`? if so the check above should probably be simplified: `if intents := result.Local.detachIntents(); len(intents) > 0 { ...`pkg/storage/replica.go, line 1873 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…> any reason this doesn't check length?pkg/storage/replica.go, line 4239 at r4 (raw file): Previously, tamird (Tamir Duberstein) wrote…> should this be `result.Local.detachIntents()`? if so the check above should probably be simplified: `if intents := result.Local.detachIntents(); len(intents) > 0 { ...`Comments from Reviewable |
Review status: 2 of 5 files reviewed at latest revision, 6 unresolved discussions, some commit checks pending. pkg/kv/transport.go, line 184 at r1 (raw file): Previously, petermattis (Peter Mattis) wrote…> ThatComments from Reviewable |
Reviewed 3 of 3 files at r5. pkg/kv/transport.go, line 184 at r1 (raw file): Previously, petermattis (Peter Mattis) wrote…> `client.client.Batch` and `localServer.Batch` have different signatures.Comments from Reviewable |
@@ -600,22 +609,6 @@ func (r *Replica) handleLocalEvalResult( | |||
// Non-state updates and actions. | |||
// ====================== | |||
|
|||
if originReplica.StoreID == r.store.StoreID() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maybe a more direct assertion here makes sense. Otherwise we're just going to assert the lResult is empty, and then only if shouldAssert
is true. Might be better to call it out directly and add a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call what out directly? That lResult.intents
should be nil
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, suggestion was to make sure it's detached here with an assertion and a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -1869,6 +1869,14 @@ func (r *Replica) tryAddWriteCmd( | |||
// Set endCmds to nil because they have already been invoked | |||
// in processRaftCommand. | |||
endCmds = nil | |||
if len(propResult.Intents) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A problem is that this isn't always the path that a finished request follows. Original proposals which succeed but are re-proposed and cancelled commands will both leave intents to the GC. Maybe that's OK, although knowing our luck, there'll be some degenerate use case that creates massive amounts of intent garbage. After some thought I'm not coming up with any good solutions. However, let's add a comment to the effect that these kinds of requests will generate GC-ready intents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment added.
reply, err := localServer.Batch(gt.opts.ctx, &client.args) | ||
gt.setPending(client.args.Replica, false) | ||
done <- BatchCall{Reply: reply, Err: err} | ||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this have any impact on the single-node benchmarks? Just curious; I think this is the right thing to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me check.
@@ -302,26 +265,6 @@ func (ir *intentResolver) maybePushTransactions( | |||
// differently and would be better served by different entry points, | |||
// but combining them simplifies the plumbing necessary in Replica. | |||
func (ir *intentResolver) processIntentsAsync(r *Replica, intents []intentsWithArg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, we're passing wait=true
here to RunLimitedAsyncTask
means that something which horks the intent queue will starve other work. I think you might consider what's being done DistSender.sendPartialBatchAsync
. There, we pass wait=false
to Stopper.RunLimitedAsyncTask
, and on stop.ErrThrottled
, we synchronously send the partial batch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there isn't much point in not using the blocked goroutine. I'd like to do this in a follow-up PR, though. I'll file an issue.
6e31b79
to
99c8f31
Compare
The change to local call handling is across the board negative for the single-node benchmarks:
I'm actually somewhat surprised the effect is this large. And I expected the effect to diminish with larger batch sizes such as the |
99c8f31
to
2fa544e
Compare
Process intents synchronously on the goroutine which generated them.
2fa544e
to
9858253
Compare
Wow, that's really a nasty performance hit. On the other hand, optimizing this seems like a low priority. |
I'll file an issue about revisiting the performance hit. It is somewhat higher than I expected. And more variable than I expected. I would have expected an across-the-board impact on these benchmarks which are almost always performing either 1PC transactions or read-only operations. Why does |
Reviewed 1 of 1 files at r1, 2 of 3 files at r5, 2 of 2 files at r6. Comments from Reviewable |
Perform local calls in a goroutine so that they can be cancelled and so
that they don't block the caller indefinitely.
See #10427
This change is