Proposed:
- We have one.
- It implements an aligned page block design, as done in Azure WASB (and, apparently, in Google GCS).
- We choose a page block size which works well with ORC and Parquet read patterns.
- We could consider an async prefetch of the next page, assuming a forward read pattern.
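A minimal sketch of the aligned page block read path described above; the class name, page size, and `fetch_page` callback are illustrative assumptions, not the real implementation:

```python
class PagedReader:
    """Illustrative aligned-page cache: every read is served by
    fetching whole, aligned pages and keeping them for reuse."""

    def __init__(self, fetch_page, page_size=4 * 1024 * 1024):
        # fetch_page(index) -> the bytes of aligned page `index`.
        # 4 MiB is an assumed default, not a tuned value.
        self.fetch_page = fetch_page
        self.page_size = page_size
        self.cache = {}  # page index -> page bytes

    def read(self, offset, length):
        out = bytearray()
        pos, end = offset, offset + length
        while pos < end:
            page = pos // self.page_size
            if page not in self.cache:           # miss: pull the whole page
                self.cache[page] = self.fetch_page(page)
            start = pos - page * self.page_size  # offset within the page
            take = min(end - pos, self.page_size - start)
            out += self.cache[page][start:start + take]
            pos += take
        return bytes(out)
```

The async prefetch would hang off the miss path: on fetching page `n`, also queue page `n + 1`, on the assumption of a forward read pattern.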
The strengths of this design are:
- Reduced cost/time of reading data already in a cached page.
- A shared design across stores means one implementation to get correct.
- And one implementation to tune for the IO patterns collected from traces of queries.
If/when a vectored IO input stream API is added, we need to plan for it to work with this.
Obvious first step: if in cache, return immediately.
But what if it is not in that cache, or only partially in cache?
- Read the missing pages in, then fill in the response from them. Best if subsequent reads will use the same pages.
- Or: assume the app really knows what it is doing and bypass the cache to retrieve all missed data. That is: only ask for the requested ranges and don't cache them.
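The two strategies can be sketched side by side; the function names, page size, and fetch callbacks are assumptions for illustration:

```python
def read_via_pages(ranges, cache, fetch_page, page_size):
    """Strategy #1: route every range through the page cache,
    reading missing pages in and keeping them (bets on locality)."""
    out = []
    for offset, length in ranges:
        buf, pos, end = bytearray(), offset, offset + length
        while pos < end:
            page = pos // page_size
            if page not in cache:            # miss: pull and keep the page
                cache[page] = fetch_page(page)
            start = pos - page * page_size
            take = min(end - pos, page_size - start)
            buf += cache[page][start:start + take]
            pos += take
        out.append(bytes(buf))
    return out

def read_bypass(ranges, cache, fetch_range, page_size):
    """Strategy #2: serve a range from the cache only if it is fully
    present; otherwise fetch exactly the requested bytes, uncached."""
    out = []
    for offset, length in ranges:
        end = offset + length
        pages = range(offset // page_size, (end - 1) // page_size + 1)
        if all(p in cache for p in pages):
            buf = bytearray()
            for p in pages:
                lo = max(offset, p * page_size) - p * page_size
                hi = min(end, (p + 1) * page_size) - p * page_size
                buf += cache[p][lo:hi]
            out.append(bytes(buf))
        else:
            out.append(fetch_range(offset, length))  # cache untouched
    return out
```

Strategy #1 grows the cache on every miss; strategy #2 leaves it exactly as found, trusting the declared range list to be the whole read pattern.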
Strategy #1 would seem best if subsequent reads are likely to come from adjacent parts of the file; we are assuming locality of reads.
For a seek+read input source this probably holds, at least with non-columnar data sources.
Strategy #2 says: the read pattern is explicitly declared in the API sequence.
That is: if there is locality, it will be evident from the list of read operations; if no locality is declared there, assume there is none.