Skip to content

Commit

Permalink
storage: return byte batches in ScanResponse
Browse files Browse the repository at this point in the history
Previously, the MVCCScan API call completely deserialized the batch it
got from C++ into a slice of roachpb.KeyValue, and sent that slice of
roachpb.KeyValue over gRPC via ScanResponse. This is needlessly
expensive for several reasons.

- gRPC must re-serialize what we sent it to a flat byte stream. But, we
already had a flat byte stream to begin with, before inflating it into
KeyValues. In effect, we're doing pointless deserialization and
reserialization.
- We needed to dynamically allocate a slice of roachpb.KeyValue on every
scan request, in buildScanResponse. This was the second largest cause of
allocations in our system, beside the first copy from C++ to Go. But,
it's pointless, since we're just going to throw that slice away again
when we either serialize to the network or iterate over it and inflate
the KeyValues into rows later down the pipe.

Now, MVCCScan can optionally skip this inflation and return the raw
write batch that it got from C++. The txnKVFetcher and rowFetcher are
modified to use this option. They now deserialize keys from the write
batch as necessary.

This results in a large decrease in the number of allocations performed
per scan. When going over the network, only 1 object has to be
marshalled and demarshalled (the batch) instead of the number of
returned keys. Also, we don't have to allocate the initial slice of
[]KeyValue, or any of the slices within Key or Value, to return data.

Release note: None
  • Loading branch information
jordanlewis committed Jul 14, 2018
1 parent c93feab commit c07d6f7
Show file tree
Hide file tree
Showing 16 changed files with 917 additions and 451 deletions.
2 changes: 1 addition & 1 deletion docs/generated/settings/settings.html
Original file line number Diff line number Diff line change
Expand Up @@ -68,6 +68,6 @@
<tr><td><code>trace.debug.enable</code></td><td>boolean</td><td><code>false</code></td><td>if set, traces for recent requests can be seen in the /debug page</td></tr>
<tr><td><code>trace.lightstep.token</code></td><td>string</td><td><code></code></td><td>if set, traces go to Lightstep using this token</td></tr>
<tr><td><code>trace.zipkin.collector</code></td><td>string</td><td><code></code></td><td>if set, traces go to the given Zipkin instance (example: '127.0.0.1:9411'); ignored if trace.lightstep.token is set.</td></tr>
<tr><td><code>version</code></td><td>custom validation</td><td><code>2.0-9</code></td><td>set the active cluster version in the format '<major>.<minor>'.</td></tr>
<tr><td><code>version</code></td><td>custom validation</td><td><code>2.0-10</code></td><td>set the active cluster version in the format '<major>.<minor>'.</td></tr>
</tbody>
</table>
13 changes: 13 additions & 0 deletions pkg/kv/dist_sender.go
Original file line number Diff line number Diff line change
Expand Up @@ -516,6 +516,19 @@ func (ds *DistSender) initAndVerifyBatch(
}
}

// Make sure that MVCCScan requests aren't in batch form if our cluster
// version is too old.
// TODO(jordan): delete this stanza after 2.1 is released.
if !ds.st.Version.IsMinSupported(cluster.VersionBatchResponse) {
for i := range ba.Requests {
switch req := ba.Requests[i].GetInner().(type) {
case *roachpb.ScanRequest:
req.ScanFormat = roachpb.KEY_VALUES
case *roachpb.ReverseScanRequest:
req.ScanFormat = roachpb.KEY_VALUES
}
}
}
return nil
}

Expand Down
2 changes: 2 additions & 0 deletions pkg/roachpb/api.go
Original file line number Diff line number Diff line change
Expand Up @@ -256,6 +256,7 @@ func (sr *ScanResponse) combine(c combinable) error {
if sr != nil {
sr.Rows = append(sr.Rows, otherSR.Rows...)
sr.IntentRows = append(sr.IntentRows, otherSR.IntentRows...)
sr.BatchResponse = append(sr.BatchResponse, otherSR.BatchResponse...)
if err := sr.ResponseHeader.combine(otherSR.Header()); err != nil {
return err
}
Expand All @@ -271,6 +272,7 @@ func (sr *ReverseScanResponse) combine(c combinable) error {
if sr != nil {
sr.Rows = append(sr.Rows, otherSR.Rows...)
sr.IntentRows = append(sr.IntentRows, otherSR.IntentRows...)
sr.BatchResponse = append(sr.BatchResponse, otherSR.BatchResponse...)
if err := sr.ResponseHeader.combine(otherSR.Header()); err != nil {
return err
}
Expand Down
Loading

0 comments on commit c07d6f7

Please sign in to comment.