Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check ahead if we can get the count #13899

Merged
merged 5 commits into from
Oct 25, 2024
Merged

Conversation

LuXugang
Copy link
Member

Check whether the IndexSortSortedNumericDocValuesRangeQuery's count can be obtained before traversing the BKD tree or performing binary search using DocValues.

see #13890

@LuXugang LuXugang requested a review from jpountz October 12, 2024 06:45
@LuXugang LuXugang linked an issue Oct 12, 2024 that may be closed by this pull request
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic makes sense to me but it's a bit hard to read, could we avoid touching getDocIdSetIteratorOrNull and only have new logic in the Weight#count impl?

@LuXugang
Copy link
Member Author

The logic makes sense to me but it's a bit hard to read, could we avoid touching getDocIdSetIteratorOrNull and only have new logic in the Weight#count impl?

Thank you for your feedback! @jpountz I really appreciate your suggestion. I’ve made the changes as you recommended

@LuXugang LuXugang requested a review from jpountz October 24, 2024 08:04
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the code makes sense and I see that we already have test coverage for counting for our various optimizations.

IteratorAndCount itAndCount = getDocIdSetIteratorOrNull(context);
if (lowerValue > upperValue) {
return 0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be moved before the check of whether the segment has deletes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if lowerValue > upperValue and has deletes, PointRangeQuery's count will reture -1.
I’ll temporarily maintain the consistency of the two queries for now. Later, we can consider whether to add this optimization to the PointRangeQuery. @jpountz

update
@LuXugang LuXugang merged commit 0bbef32 into apache:main Oct 25, 2024
3 checks passed
LuXugang added a commit that referenced this pull request Oct 25, 2024
Currently, we traverse the BKD tree or perform a binary search using DocValues first, and then check whether the count can be obtained in the count() method of IndexSortSortedNumericDocValuesRangeQuery.

we should consider providing a mechanism to perform this check beforehand, avoid unnecessary processing when dealing with a sparseRange
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Check ahead of time if the count can be obtained
2 participants