Give a hint to IndexInput about slices that have a forward-only access pattern.

This introduces a new API that allows directories to optimize access to
`IndexInput`s that have a forward-only access pattern by reading ahead of the
current position. It would be applicable to:
 - Postings lists,
 - Doc values,
 - Norms,
 - Points.

Relates apache#13179
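
To make the intended behavior concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class `ReadAheadSketch`, its `onRead` hook, the page size and the `pagesAhead` window are hypothetical names chosen for this illustration. Only the `prefetch(offset, length)` and `readAhead(offset, length)` method shapes mirror this commit, and the "few pages ahead" policy shown is one possible reading of the Javadoc, not what any directory implementation necessarily does.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model of a forward-only read-ahead hint. This is NOT Lucene code: the class,
 * the onRead hook and the pagesAhead window are assumptions made for illustration.
 */
class ReadAheadSketch {
  private final long pageSize;
  private final int pagesAhead;

  /** Pages requested in the background, in request order. */
  final List<Long> prefetchedPages = new ArrayList<>();

  // Active read-ahead range [rangeStart, rangeEnd); -1 means no hint was given yet.
  private long rangeStart = -1;
  private long rangeEnd = -1;
  // First page that has not been requested yet.
  private long nextPage = -1;

  ReadAheadSketch(long pageSize, int pagesAhead) {
    this.pageSize = pageSize;
    this.pagesAhead = pagesAhead;
  }

  /** Stand-in for prefetch(offset, length): records each page that overlaps the range. */
  void prefetch(long offset, long length) {
    if (length <= 0) {
      return;
    }
    long first = offset / pageSize;
    long last = (offset + length - 1) / pageSize;
    for (long page = first; page <= last; page++) {
      prefetchedPages.add(page);
    }
  }

  /** Records the forward-only range and requests only the page holding its first byte. */
  void readAhead(long offset, long length) {
    if (length <= 0) {
      return;
    }
    rangeStart = offset;
    rangeEnd = offset + length;
    prefetch(offset, 1);
    nextPage = offset / pageSize + 1;
  }

  /** Hypothetical hook called on every read: keeps up to pagesAhead pages requested ahead. */
  void onRead(long pos) {
    if (pos < rangeStart || pos >= rangeEnd) {
      return; // outside the hinted range, nothing to do
    }
    long lastPageInRange = (rangeEnd - 1) / pageSize;
    long target = Math.min(pos / pageSize + pagesAhead, lastPageInRange);
    while (nextPage <= target) {
      prefetch(nextPage * pageSize, 1);
      nextPage++;
    }
  }

  public static void main(String[] args) {
    ReadAheadSketch in = new ReadAheadSketch(4096, 2);
    in.readAhead(5000, 20000); // hint: bytes [5000, 25000) will be read front to back
    in.onRead(5000);           // first read moves the window forward
    System.out.println(in.prefetchedPages);
  }
}
```

In this sketch the whole range is never requested at once: the hint itself only touches the first page, and later reads move a bounded window forward, never past the last page of the hinted range.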
jpountz committed Jun 4, 2024
1 parent 801b822 commit 8a5cb8f
Showing 3 changed files with 29 additions and 3 deletions.
@@ -2068,9 +2068,9 @@ private void seekAndPrefetchPostings(IndexInput docIn, IntBlockTermState state)
// make sure to include some skip data.
docIn.prefetch(state.docStartFP, state.skipOffset + 1);
} else {
- // Default case: prefetch the page that holds the first byte of postings. We'll prefetch
- // skip data when we have evidence that it is used.
- docIn.prefetch(state.docStartFP, 1);
+ // Default case: the postings list is long, instruct the index input to perform some
+ // read-ahead. We'll prefetch skip data when we have evidence that it is used.
+ docIn.readAhead(state.docStartFP, state.skipOffset);
}
}
// Note: we don't prefetch positions or offsets, which are less likely to be needed.
16 changes: 16 additions & 0 deletions lucene/core/src/java/org/apache/lucene/store/IndexInput.java
@@ -203,4 +203,20 @@ public String toString() {
* @param length the number of bytes to prefetch
*/
public void prefetch(long offset, long length) throws IOException {}

+ /**
+ * Optional method: Give a hint to this input that some bytes will be read in the given range with
+ * a forward-only access pattern. Implementations may start reading the first bytes in the
+ * background immediately, and then dynamically read a few pages ahead of the current position to
+ * help make data available before it's needed.
+ *
+ * <p><b>NOTE</b>: This method may be called on long ranges of bytes. It is discouraged to
+ * prefetch everything at once.
+ *
+ * <p>The default implementation is a no-op.
+ *
+ * @param offset start offset
+ * @param length the number of bytes to prefetch
+ */
+ public void readAhead(long offset, long length) throws IOException {}
}
@@ -370,6 +370,16 @@ public void prefetch(long offset, long length) throws IOException {
}
}

+ @Override
+ public void readAhead(long offset, long length) throws IOException {
+ // Start loading the first bytes in the background
+ if (length != 0) {
+ prefetch(offset, 1);
+ }
+ // TODO: Is there a hint we can give to the OS to let it optimize for our forward-only access
+ // pattern in the given range?
+ }

@Override
public byte readByte(long pos) throws IOException {
try {