util/chunk: optimize (*ListInDisk).GetChunk
and add a fast row container reader (#45130)
#45205
This is an automated cherry-pick of #45130
What problem does this PR solve?
Issue Number: close #45125
Problem Summary:
The existing reading method of `RowContainer` (`GetChunk(...)`) is not fast enough for dumping a lot of rows from disk (for the `cursorFetch` use case). The existing `Iterator4RowContainer` is even slower, as it allocates a new chunk for each row 🤦.
This PR is extracted from #44730 (with some refactoring).
What is changed and how it works?
This PR pipelines the IO and the CPU calculation to make full use of the IO bandwidth. It should also help other features using `rowContainer`, as `GetChunk` is now much faster.
The performance of the existing benchmark `BenchmarkListInDisk_GetChunk` improves from 2877471 ns/op to 462864 ns/op (about 6.2x faster).
Check List
Tests