Handle range query edge case #63397
Currently, when a range query on a text-based field uses an empty string as its lower bound, we return all documents when 'gte' is used (lower bound included) but no documents when 'gt' is used. This might seem counterintuitive, since every value should be greater than the empty string. This PR fixes this edge case by implicitly setting the "lower" include flag in this case before constructing the TermRangeQuery. Closes elastic#63386
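To make the expected semantics concrete, here is a minimal plain-Java sketch of the comparison a term range check is expected to perform. It does not use Lucene or Elasticsearch; the class `EmptyLowerBoundSketch` and the helper `inRange` are hypothetical names introduced only for illustration, and the sketch assumes terms are compared as unsigned UTF-8 bytes with no upper bound.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EmptyLowerBoundSketch {

    // Hypothetical helper (not a Lucene or Elasticsearch API): returns true if
    // `term` satisfies the lower bound, comparing unsigned UTF-8 bytes.
    // The upper bound is assumed to be open, so it is not modelled here.
    static boolean inRange(String term, String lower, boolean includeLower) {
        byte[] termBytes = term.getBytes(StandardCharsets.UTF_8);
        byte[] lowerBytes = lower.getBytes(StandardCharsets.UTF_8);
        // A non-empty array compares as greater than an empty one.
        int cmp = Arrays.compareUnsigned(termBytes, lowerBytes);
        return includeLower ? cmp >= 0 : cmp > 0;
    }

    public static void main(String[] args) {
        String[] terms = {"a", "foo", "zebra"};
        for (String term : terms) {
            // Under this model, 'gte' and 'gt' agree for every non-empty term.
            System.out.println(term
                + " gte=" + inRange(term, "", true)
                + " gt=" + inRange(term, "", false));
        }
    }
}
```

Under this model, only the empty term itself distinguishes 'gt' from 'gte' when the lower bound is empty, which is why a 'gt' query returning no documents looks like a bug.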
Pinging @elastic/es-search (:Search/Search) |
I'm confused, why is an empty BytesRef not comparing as strictly less than any other term? |
I believe I tracked it down to https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/util/automaton/Automata.java#L257 where in case |
There is even a safeguard against this for 'null' values at the beginning of that method. I decided to put the change into |
I think this is a bug at the lucene level - if max is null and min is of length 0, then we should be returning a match-any binary no matter what the 'includeLower' value is, surely? |
That's what I wasn't sure about. I opted to fix it on our side, but if you think this is a Lucene bug I can open a fix there as well. |
+1 |
@romseygeek I opened apache/lucene-solr#1976 with a fix in the |
Until apache/lucene-solr#1976 is merged, I'm keeping this PR open to add tests that confirm the fix is also picked up by ES. This will happen once we move to a current 8.7 snapshot or release. I expect SimpleSearchIT to fail until then and to pass once the fix is merged in. |
@elasticmachine update branch |
@elasticmachine update branch |
@elasticmachine run elasticsearch-ci/packaging-sample-windows |
@romseygeek this has been fixed in apache/lucene-solr#1976 which we use on master and 7.x now. I'd just like to add this test on our side to verify and test this behaviour going forward. Would you mind taking a quick look if you agree with adding this? |
LGTM! Thanks for handling this @cbuescher
Currently, when a range query on a text-based field uses an empty string as its lower bound, we return all documents when 'gte' is used (lower bound included) but no documents when 'gt' is used. This might seem counterintuitive, since every value should be greater than the empty string. The bug has been fixed in Lucene, and this PR adds a test to ensure that searches now show the fixed behaviour. Closes elastic#63386