storage: return byte batches in ScanResponse #26553

Merged 1 commit on Jul 25, 2018

Conversation

jordanlewis (Member)

Previously, the MVCCScan API call completely deserialized the batch it
got from C++ into a slice of roachpb.KeyValue, and sent that slice of
roachpb.KeyValue over gRPC via ScanResponse. This is needlessly
expensive for several reasons.

  • gRPC must re-serialize what we sent it into a flat byte stream. But we
    already had a flat byte stream to begin with, before inflating it into
    KeyValues. In effect, we're doing pointless deserialization and
    reserialization.
  • We needed to dynamically allocate a slice of roachpb.KeyValue on every
    scan request, in buildScanResponse. This was the second largest cause of
    allocations in our system, besides the first copy from C++ to Go. But
    it's pointless, since we're just going to throw that slice away again
    when we either serialize to the network or iterate over it and inflate
    the KeyValues into rows later down the pipe.

Now, MVCCScan can optionally skip this inflation and return the raw
write batch that it got from C++. The txnKVFetcher and rowFetcher are
modified to use this option. They now deserialize keys from the write
batch as necessary.

This results in a large decrease in the number of allocations performed
per scan. When going over the network, only one object (the batch) has to be
marshalled and unmarshalled, instead of one object per returned key. We also
no longer have to allocate the initial []KeyValue slice, or any of the slices
within each Key or Value, to return data.
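
To make the consumer side concrete, here is a minimal sketch (in Go) of walking such a flat byte batch without materializing a []roachpb.KeyValue. The length-prefixed entry layout is an assumption made for this example only; the real layout is whatever the C++ scanner emits and the Go decode helpers understand.

package main

import (
	"encoding/binary"
	"fmt"
)

// decodeKeyValue peels one key/value pair off the front of the batch and
// returns the remainder. The layout assumed here (two little-endian uint32
// length prefixes followed by the key and value bytes) is hypothetical.
func decodeKeyValue(repr []byte) (key, value, rest []byte, err error) {
	if len(repr) < 8 {
		return nil, nil, nil, fmt.Errorf("malformed batch: %d bytes remaining", len(repr))
	}
	keyLen := int(binary.LittleEndian.Uint32(repr[0:4]))
	valLen := int(binary.LittleEndian.Uint32(repr[4:8]))
	repr = repr[8:]
	if len(repr) < keyLen+valLen {
		return nil, nil, nil, fmt.Errorf("truncated entry")
	}
	return repr[:keyLen], repr[keyLen : keyLen+valLen], repr[keyLen+valLen:], nil
}

func main() {
	// Encode two entries in the assumed layout, then iterate them the way a
	// fetcher would: the decoded keys and values are subslices of the original
	// buffer, so no per-key allocations are needed.
	var batch []byte
	for _, kv := range [][2]string{{"/t/1/1", "a"}, {"/t/1/2", "b"}} {
		hdr := make([]byte, 8)
		binary.LittleEndian.PutUint32(hdr[0:4], uint32(len(kv[0])))
		binary.LittleEndian.PutUint32(hdr[4:8], uint32(len(kv[1])))
		batch = append(append(append(batch, hdr...), kv[0]...), kv[1]...)
	}
	for rest := batch; len(rest) > 0; {
		key, value, next, err := decodeKeyValue(rest)
		if err != nil {
			panic(err)
		}
		fmt.Printf("key=%s value=%s\n", key, value)
		rest = next
	}
}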

I haven't delved into modifying the relevant unit tests yet, but logic tests pass and I've been playing around with the resultant binary for some performance testing. I don't see much of a concrete performance change, but pprof reports reduced allocations as I'd expect.

Release note: None

@jordanlewis jordanlewis requested review from andreimatei, petermattis, asubiotto and a team June 8, 2018 23:09
@cockroach-teamcity (Member)

This change is Reviewable

@jordanlewis (Member Author)

This is essentially a rework of #18913.

@petermattis (Collaborator)

A definite simplification to skip the batch repr header to make concatenation easier. I wonder why we're not seeing a perf benefit.


Review status: 0 of 11 files reviewed at latest revision, all discussions resolved, some commit checks failed.


pkg/roachpb/api.go, line 251 at r1 (raw file):

		sr.Rows = append(sr.Rows, otherSR.Rows...)
		sr.IntentRows = append(sr.IntentRows, otherSR.IntentRows...)
		sr.BatchResponse = append(sr.BatchResponse, otherSR.BatchResponse...)

The TODO from my PR still applies: we need to handle one response which has Rows set while the other response has BatchResponse set. Converting Rows into BatchResponse would be the straightforward solution. Also need to make sure this is tested (via a unit test in this package).
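
For illustration, one possible shape of that normalization is sketched below. The types and the encoding helper are stand-ins for this example (reusing the assumed length-prefixed layout from the sketch above), not the actual roachpb definitions.

// Package scanmerge sketches combining two scan responses that may use
// different representations; everything here is hypothetical.
package scanmerge

import "encoding/binary"

type KeyValue struct {
	Key, Value []byte
}

type ScanResponse struct {
	Rows          []KeyValue
	BatchResponse []byte
}

// encodeKeyValue appends one pair to dst in the assumed byte-batch layout.
func encodeKeyValue(dst []byte, kv KeyValue) []byte {
	var hdr [8]byte
	binary.LittleEndian.PutUint32(hdr[0:4], uint32(len(kv.Key)))
	binary.LittleEndian.PutUint32(hdr[4:8], uint32(len(kv.Value)))
	dst = append(dst, hdr[:]...)
	dst = append(dst, kv.Key...)
	return append(dst, kv.Value...)
}

// combine merges other into sr. If either side carries BatchResponse, any Rows
// are re-encoded into the byte form so the merged response has exactly one
// representation, with sr's results ordered before other's.
func (sr *ScanResponse) combine(other *ScanResponse) {
	if len(sr.BatchResponse) == 0 && len(other.BatchResponse) == 0 {
		sr.Rows = append(sr.Rows, other.Rows...)
		return
	}
	var merged []byte
	for _, kv := range sr.Rows {
		merged = encodeKeyValue(merged, kv)
	}
	merged = append(merged, sr.BatchResponse...)
	for _, kv := range other.Rows {
		merged = encodeKeyValue(merged, kv)
	}
	merged = append(merged, other.BatchResponse...)
	sr.Rows, sr.BatchResponse = nil, merged
}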


pkg/roachpb/api.proto, line 348 at r1 (raw file):

  // If true, the client can handle a ScanResponse with batch_response set
  // instead of rows, and would prefer to see that form of ScanResponse.
  bool accepts_bytes = 4;

bytes is a bit general for my taste. accepts_batch_repr? Though I suppose the response is not precisely batch repr format. Perhaps accepts_batch_response though that suffers from the "bit general" criticism as well.


pkg/roachpb/api.proto, line 365 at r1 (raw file):

  // are num_keys pairs, as defined by the ResponseHeader. If set, rows will not
  // be set and vice versa.
  bytes batch_response = 4;

I think the name here should match the accepts_* name. That is, batch_repr (if we decide to use accepts_batch_repr).


pkg/storage/engine/rocksdb.go, line 2700 at r1 (raw file):

// actually decoding the kvs. Instead, it skips the kv and returns the rest of
// the byte buffer.
func mvccScanSkipKeyValue(repr []byte) ([]byte, error) {

This appears to be unused.



@danhhz (Contributor) commented Jun 11, 2018



pkg/roachpb/api.proto, line 348 at r1 (raw file):

  // If true, the client can handle a ScanResponse with batch_response set
  // instead of rows, and would prefer to see that form of ScanResponse.
  bool accepts_bytes = 4;

I'd highly recommend making this an enum. Somehow bools in proto messages always end up later wanting to have been an enum from the start. In this case, I can easily imagine us wanting to add a columnar format at some point.



@andreimatei (Contributor)

this all looks great to me, provided that the response format is not configurable, but is instead dictated by the cluster version




pkg/roachpb/api.go, line 251 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

The TODO from my PR still applies: we need to handle one response which has Rows set while the other response has BatchResponse set. Converting Rows into BatchResponse would be the straightforward solution. Also need to make sure this is tested (via a unit test in this package).

not if you gate everything on a cluster version you don't



@bdarnell (Contributor)



pkg/roachpb/api.go, line 251 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

not if you gate everything on a cluster version you don't

Cluster versions don't flip atomically across the cluster. It'll work if you have the client code that sets the accept-format field controlled by a cluster version, but not if each node is deciding on the server-side what format to send.


pkg/roachpb/api.proto, line 348 at r1 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

I'd highly recommend making this an enum. Somehow bools in proto messages always end up later wanting to have been an enum from the start. In this case, I can easily imagine us wanting to add a columnar format at some point.

+1 to a format enum, but note that it's legal (from a proto wire format perspective) to replace a bool with an enum or an int (true and false become 1 and 0).
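
As a rough Go-side illustration of that wire compatibility (the enum name and values below are invented for the example, not the final proto definitions):

package main

import "fmt"

// ScanFormat stands in for a generated proto enum replacing the bool. Protobuf
// encodes bools and enums the same way on the wire (as varints), so a
// false/true sent by an older node decodes here as 0/1.
type ScanFormat int32

const (
	// KEY_VALUES is the legacy form: repeated KeyValue messages in rows.
	KEY_VALUES ScanFormat = 0
	// BATCH_RESPONSE requests the flat byte-batch form instead.
	BATCH_RESPONSE ScanFormat = 1
	// A future columnar format could be added as value 2 without breaking either.
)

func main() {
	// A bool true/false from an old sender occupies the same wire slot and
	// decodes as 1/0.
	fmt.Println(ScanFormat(1) == BATCH_RESPONSE, ScanFormat(0) == KEY_VALUES) // true true
}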



@jordanlewis jordanlewis requested a review from rjnn July 11, 2018 20:10
@jordanlewis jordanlewis left a comment (Member Author)

Picking this up again now. I don't think it's feasible to make the response format depend on the cluster version.



pkg/roachpb/api.go, line 251 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Cluster versions don't flip atomically across the cluster. It'll work if you have the client code that sets the accept-format field controlled by a cluster version, but not if each node is deciding on the server-side what format to send.

This code doesn't suffer from the lack of rows-batch merging because the kvfetcher handles both the batch format and the kv format. I don't see a way of avoiding that, do you? And if we still have to do that, what's the point of gating the request or response by cluster version?


pkg/roachpb/api.proto, line 348 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

+1 to a format enum, but note that it's legal (from a proto wire format perspective) to replace a bool with an enum or an int (true and false become 1 and 0).

Done.


pkg/roachpb/api.proto, line 348 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

bytes is a bit general for my taste. accepts_batch_repr? Though I suppose the response is not precisely batch repr format. Perhaps accepts_batch_response though that suffers from the "bit general" criticism as well.

Done.


pkg/roachpb/api.proto, line 365 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I think the name here should match the accepts_* name. That is, batch_repr (if we decide to use accepts_batch_repr).

Done.


pkg/storage/engine/rocksdb.go, line 2700 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

This appears to be unused.

Look again! It's used in both the new API and in the old pathway. Look at buildScanResumeKey.

@bdarnell bdarnell left a comment (Contributor)

Reviewed 11 of 11 files at r2.


pkg/roachpb/api.go, line 251 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

This code doesn't suffer from the lack of rows-batch merging because the kvfetcher handles both the batch format and the kv format. I don't see a way of avoiding that, do you? And if we still have to do that, what's the point of gating the request or response by cluster version?

KVFetcher supports one or the other, but not both at the same time. Without a cluster version, you could see a scan split across servers of different versions, and the combined response would have both Rows and BatchResponse set. If callers can be updated to handle both in the same response, then it should be fine to do this without a cluster version, but I think the cluster version is probably the simpler approach (consider ordering - would the consumer need to merge-sort the two result sets together?)


pkg/roachpb/api.proto, line 376 at r2 (raw file):

  repeated KeyValue intent_rows = 3 [(gogoproto.nullable) = false];

  // If set, the results of the scan in batch format - the key/value pairs are

This format is different from the BatchRepr format defined in rocksdb; I'd like for it to have a different name. This description also looks inaccurate - I don't see any varints in MVCCScanDecodeKeyValue.

While we're changing the on-the-wire format, do we want to think about things like prefix-compressing the keys?


pkg/storage/batcheval/cmd_scan.go, line 46 at r2 (raw file):

		if err != nil {
			return result.Result{}, err
		}

Can we do else engine.MVCCScan here and share the NumKeys/ResumeSpan/ReadConsistency stuff after the two branches?

Or would it make sense to always call ScanToBytes and convert to KVs if required here instead of in MVCCScan?

@jordanlewis jordanlewis left a comment (Member Author)



pkg/roachpb/api.go, line 251 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

KVFetcher supports one or the other, but not both at the same time. Without a cluster version, you could see a scan split across servers of different versions, and the combined response would have both Rows and BatchResponse set. If callers can be updated to handle both in the same response, then it should be fine to do this without a cluster version, but I think the cluster version is probably the simpler approach (consider ordering - would the consumer need to merge-sort the two result sets together?)

Good point. So if we make the client set the new format if the cluster version is new enough, then we guarantee that the whole cluster has software version new enough to correctly return the new format.


pkg/roachpb/api.proto, line 376 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This format is different from the BatchRepr format defined in rocksdb; I'd like for it to have a different name. This description also looks inaccurate - I don't see any varints in MVCCScanDecodeKeyValue.

While we're changing the on-the-wire format, do we want to think about things like prefix-compressing the keys?

batch_response isn't different enough from batchRepr?

Do you think doing prefix compression is worth the computation cost? It seems like a tradeoff - decrease the on-the-wire size, but increase computation cost and code complexity on the send and receive side. I was hoping this change could be a simple win that also paves the way for an easier time introducing new formats. If we think that prefix compression is a good idea, maybe we could add it in a separate effort.


pkg/storage/batcheval/cmd_scan.go, line 46 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Can we do else engine.MVCCScan here and share the NumKeys/ResumeSpan/ReadConsistency stuff after the two branches?

Or would it make sense to always call ScanToBytes and convert to KVs if required here instead of in MVCCScan?

Done.

@jordanlewis jordanlewis changed the title from "[wip] storage: return byte batches in ScanResponse" to "storage: return byte batches in ScanResponse" on Jul 13, 2018
@jordanlewis jordanlewis force-pushed the batchscanresponse branch 3 times, most recently from 79a2c31 to e0e8d5f on July 13, 2018 22:37
@jordanlewis jordanlewis left a comment (Member Author)

RFAL



pkg/roachpb/api.go, line 251 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Good point. So if we make the client set the new format if the cluster version is new enough, then we guarantee that the whole cluster has software version new enough to correctly return the new format.

Done. Note that I added the check inside of dist sender itself, since it's quite painful to plumb a cluster setting into everywhere rowfetcher+kvfetcher are instantiated. This might be too gross to keep, though...
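
A minimal sketch of the gating being described, with invented names for the version key and the settings plumbing; the real check would live on the DistSender's cluster settings:

package main

import "fmt"

// clusterVersion and the "versionBatchResponseFormat" key are stand-ins for
// the real cluster-version machinery; they exist only for this example.
type clusterVersion struct{ active map[string]bool }

func (v clusterVersion) IsActive(key string) bool { return v.active[key] }

type ScanFormat int32

const (
	KEY_VALUES     ScanFormat = 0
	BATCH_RESPONSE ScanFormat = 1
)

type ScanRequest struct{ ScanFormat ScanFormat }

// maybeUpgradeScanFormat asks for the byte-batch form only once the cluster
// version guarantees every node can produce it, so a scan split across nodes
// never mixes Rows and BatchResponse in one combined response.
func maybeUpgradeScanFormat(v clusterVersion, req *ScanRequest) {
	if v.IsActive("versionBatchResponseFormat") {
		req.ScanFormat = BATCH_RESPONSE
	}
}

func main() {
	req := &ScanRequest{}
	v := clusterVersion{active: map[string]bool{"versionBatchResponseFormat": true}}
	maybeUpgradeScanFormat(v, req)
	fmt.Println(req.ScanFormat == BATCH_RESPONSE) // true
}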

@jordanlewis jordanlewis requested a review from a team July 14, 2018 18:12
@bdarnell bdarnell left a comment (Contributor)

Reviewed 16 of 16 files at r3.


pkg/roachpb/api.go, line 251 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Done. Note that I added the check inside of dist sender itself, since it's quite painful to plumb a cluster setting into everywhere rowfetcher+kvfetcher are instantiated. This might be too gross to keep, though...

I think I'm OK with the hack for one cycle.


pkg/roachpb/api.proto, line 376 at r2 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

batch_response isn't different enough from batchRepr?

Do you think doing prefix compression is worth the computation cost? It seems like a tradeoff - decrease the on-the-wire size, but increase computation cost and code complexity on the send and receive side. I was hoping this change could be a simple win that also paves the way for an easier time introducing new formats. If we think that prefix compression is a good idea, maybe we could add it in a separate effort.

Fine with me.


pkg/storage/engine/mvcc.go, line 1725 at r3 (raw file):

) ([]byte, int64, *roachpb.Span, []roachpb.Intent, error) {
	kvData, numKvs, resumeSpan, intents, err := mvccScanInternal(ctx, engine, iter, key, endKey, max, timestamp, consistent, tombstones, txn, reverse)
	if err != nil || resumeSpan != nil {

This doesn't look right. Can't resume spans be returned with non-empty kvData (which is being discarded here)?

@jordanlewis jordanlewis left a comment (Member Author)



pkg/storage/engine/mvcc.go, line 1725 at r3 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This doesn't look right. Can't resume spans be returned with non-empty kvData (which is being discarded here)?

Not from mvccScanInternal, which only returns the resume span if there were intents encountered. The caller of mvccScanInternal is responsible for determining the resume span in the normal case. I've cleaned up the comments to reflect this fact.

@bdarnell bdarnell left a comment (Contributor)



pkg/storage/engine/mvcc.go, line 1725 at r3 (raw file):
OK, I was looking at the "before" version of mvcc.go and it wasn't clear that this was a change in this PR.

Not from mvccScanInternal, which only returns the resume span if there were intents encountered.

Not quite true - it also returns a resume span if max == 0. Maybe the caller should be responsible for this too, so that the || resumeSpan != nil is redundant (a resume span will only be set if there is an error). I'm more comfortable treating errors as exceptional cases than having multiple return combinations that have to be considered.

mvccScanInternal will also never return intents with an error or resume span, so by this logic the return line should be return nil, 0, resumeSpan, nil, err.

@jordanlewis jordanlewis left a comment (Member Author)



pkg/storage/engine/mvcc.go, line 1725 at r3 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

OK, I was looking at the "before" version of mvcc.go and it wasn't clear that this was a change in this PR.

Not from mvccScanInternal, which only returns the resume span if there were intents encountered.

Not quite true - it also returns a resume span if max == 0. Maybe the caller should be responsible for this too, so that the || resumeSpan != nil is redundant (a resume span will only be set if there is an error). I'm more comfortable treating errors as exceptional cases than having multiple return combinations that have to be considered.

mvccScanInternal will also never return intents with an error or resume span, so by this logic the return line should be return nil, 0, resumeSpan, nil, err.

Good point. The client is now responsible for dealing with max == 0.

@jordanlewis (Member Author)

@petermattis want to take another look?

@petermattis (Collaborator)

@petermattis want to take another look?

Yes, just haven't had the chance yet. I'll get to this soon.

@petermattis petermattis left a comment (Collaborator)

:lgtm:

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/engine/mvcc.go, line 1731 at r4 (raw file):

		return nil, 0, &roachpb.Span{Key: key, EndKey: endKey}, nil, nil
	}
	kvData, numKvs, resumeSpan, intents, err := mvccScanInternal(ctx, engine, iter, key, endKey, max, timestamp, consistent, tombstones, txn, reverse)

Nit: line length.


pkg/storage/engine/mvcc.go, line 1770 at r4 (raw file):

		return nil, &roachpb.Span{Key: key, EndKey: endKey}, nil, nil
	}
	kvData, numKvs, resumeSpan, intents, err := mvccScanInternal(ctx, engine, iter, key, endKey, max, timestamp, consistent, tombstones, txn, reverse)

Nit: Can you wrap some of these super-long lines? We try to limit line length to 100 characters.

@jordanlewis (Member Author)

TFTRs!

@jordanlewis (Member Author)

bors r+

craig bot pushed a commit that referenced this pull request Jul 25, 2018
26553: storage: return byte batches in ScanResponse r=jordanlewis a=jordanlewis

Co-authored-by: Jordan Lewis <jordanthelewis@gmail.com>
@craig (craig bot) commented Jul 25, 2018

Build succeeded

@craig craig bot merged commit 0a8c55b into cockroachdb:master Jul 25, 2018
@nvanbenschoten (Member)

Awesome work! Were there benchmark results that went along with this change?

@jordanlewis (Member Author)

Decent but not amazing gains.

name                                old time/op    new time/op    delta
Scan/Cockroach/count=1/limit=0-8       393µs ± 7%     346µs ± 4%  -11.97%  (p=0.000 n=9+10)
Scan/Cockroach/count=1/limit=1-8       408µs ± 1%     363µs ± 1%  -11.18%  (p=0.000 n=8+10)
Scan/Cockroach/count=1/limit=10-8      406µs ± 2%     362µs ± 1%  -10.64%  (p=0.000 n=10+10)
Scan/Cockroach/count=1/limit=100-8     407µs ± 2%     362µs ± 1%  -11.03%  (p=0.000 n=10+10)

name                                old alloc/op   new alloc/op   delta
Scan/Cockroach/count=1/limit=0-8      43.8kB ± 0%    43.6kB ± 0%   -0.49%  (p=0.000 n=9+9)
Scan/Cockroach/count=1/limit=1-8      46.6kB ± 0%    46.3kB ± 0%   -0.69%  (p=0.000 n=9+8)
Scan/Cockroach/count=1/limit=10-8     46.6kB ± 0%    46.2kB ± 0%   -0.70%  (p=0.000 n=8+8)
Scan/Cockroach/count=1/limit=100-8    46.5kB ± 0%    46.1kB ± 0%   -0.84%  (p=0.000 n=8+8)

name                                old allocs/op  new allocs/op  delta
Scan/Cockroach/count=1/limit=0-8         445 ± 1%       442 ± 0%   -0.77%  (p=0.000 n=9+6)
Scan/Cockroach/count=1/limit=1-8         478 ± 0%       474 ± 0%   -0.73%  (p=0.000 n=8+10)
Scan/Cockroach/count=1/limit=10-8        476 ± 0%       472 ± 0%   -0.81%  (p=0.000 n=8+8)
Scan/Cockroach/count=1/limit=100-8       477 ± 0%       473 ± 0%   -0.84%  (p=0.002 n=7+8)



name                                  old time/op    new time/op    delta
Scan/Cockroach/count=100/limit=0-8       589µs ± 7%     576µs ±12%     ~     (p=0.315 n=10+10)
Scan/Cockroach/count=100/limit=1-8       416µs ± 7%     374µs ± 7%  -10.20%  (p=0.000 n=8+9)
Scan/Cockroach/count=100/limit=10-8      449µs ± 8%     428µs ±16%     ~     (p=0.095 n=9+10)
Scan/Cockroach/count=100/limit=100-8     626µs ± 6%     606µs ±11%     ~     (p=0.156 n=9+10)

name                                  old alloc/op   new alloc/op   delta
Scan/Cockroach/count=100/limit=0-8      60.6kB ± 0%    53.9kB ± 1%  -11.04%  (p=0.000 n=9+8)
Scan/Cockroach/count=100/limit=1-8      46.7kB ± 1%    46.4kB ± 0%   -0.77%  (p=0.000 n=8+8)
Scan/Cockroach/count=100/limit=10-8     47.7kB ± 0%    46.8kB ± 0%   -1.84%  (p=0.000 n=8+9)
Scan/Cockroach/count=100/limit=100-8    63.2kB ± 1%    56.4kB ± 1%  -10.65%  (p=0.000 n=10+8)

name                                  old allocs/op  new allocs/op  delta
Scan/Cockroach/count=100/limit=0-8         657 ± 0%       654 ± 0%   -0.59%  (p=0.000 n=10+8)
Scan/Cockroach/count=100/limit=1-8         480 ± 1%       476 ± 0%   -0.91%  (p=0.000 n=8+8)
Scan/Cockroach/count=100/limit=10-8        496 ± 0%       493 ± 0%   -0.62%  (p=0.000 n=8+9)
Scan/Cockroach/count=100/limit=100-8       688 ± 0%       685 ± 0%   -0.48%  (p=0.000 n=10+6)

@jordanlewis (Member Author)

Hmm, I wonder if making this change inadvertently exposed us to worse memory behavior, since the GC can no longer precisely throw away keys that aren't used - instead, the GC can only throw away one of these batches when nobody is referencing the batch anymore.

I'm exploring this as a possibility for what's going on in #36842.
