Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing SortField comparison to use equals instead of reference equality #6901

Merged
merged 1 commit into from
Apr 3, 2023

Conversation

reta
Copy link
Collaborator

@reta reta commented Mar 30, 2023

Description

Fixing SortField comparison to use equals instead of reference equality

Issues Resolved

N/A

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@@ -636,7 +636,7 @@ private static Sort createSort(TopFieldDocs[] topFieldDocs) {
*/
private static boolean isSortWideningRequired(TopFieldDocs[] topFieldDocs, int sortFieldindex) {
for (int i = 0; i < topFieldDocs.length - 1; i++) {
if (topFieldDocs[i].fields[sortFieldindex] != topFieldDocs[i + 1].fields[sortFieldindex]) {
if (!topFieldDocs[i].fields[sortFieldindex].equals(topFieldDocs[i + 1].fields[sortFieldindex])) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gashutos mind taking a look? I found this one out while adding tests for multi index search queries

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I remember I did test this with int & long and it worked fine.
The unit tests too are working fine, really is this the issue ?

Copy link
Collaborator Author

@reta reta Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly this one, but assume that index1 and index2 have same counter field with type long - in this case we will do unnecessary unwinding although the sort fields are the same. The test cases I added just to capture the ability to do sort across different types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh got it, so correctness wise it was still giving right results ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, no visible effect

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, that I got.
but ./gradlew check is having integ test failure for comparator type mismtach. That should work though..

Copy link
Collaborator Author

@reta reta Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh got it, you mean we don't need this test 260_sort_mixed.yml ? I haven't found any rest spec tests for mixed indices search ... I thought it is good to have one

Copy link
Contributor

@gashutos gashutos Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, its good to have. I was just curious why it is failing :)

you mean we don't need this test 260_sort_mixed.yml

No no, we need it, thank you for adding it...

Copy link
Contributor

@gashutos gashutos Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reta I think this is mixed cluster scenario which is failing in ./gradlew check.
Our old versions dont support mixing results between LONG/DOUBLE. We are specifying sortType Long for Long NumericType here, and SortType Double for Double numeric type. So it will return results in Long for those shards of index with Long mapping, while Double for another case.

The new changes we did as part of widening sort type handles it correctly but that will be in only latest version.
May be we can change mapping types from Long to Double to Int to Long or Float to Double to having them work in mixed cluster scenario.

Copy link
Collaborator Author

@reta reta Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new changes we did as part of widening sort type handles it correctly but that will be in only latest version.

Correct, so I was adding the version check here:

  - skip:
      version: " - 2.6.99"
      reason: relies on numeric sort optimization that landed in 2.7.0 only

So the test should be skipped for anything below 2.7.0 (we backported this change to 2.x so it should work for 3.0.0 and 2.7.0) but for some reasons, the test is run on versions it is not supposed to ... I am curious to find out why ....

Running the test on 2.x branch manually passed for me, looking into it ...

Copy link
Contributor

@gashutos gashutos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this. @reta

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter
Copy link

Codecov Report

Merging #6901 (02fa61b) into main (cdab7f2) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@             Coverage Diff              @@
##               main    #6901      +/-   ##
============================================
- Coverage     70.72%   70.72%   -0.01%     
+ Complexity    59259    59246      -13     
============================================
  Files          4812     4812              
  Lines        283764   283750      -14     
  Branches      40918    40915       -3     
============================================
- Hits         200700   200689      -11     
+ Misses        66597    66523      -74     
- Partials      16467    16538      +71     
Impacted Files Coverage Δ
...pensearch/action/search/SearchPhaseController.java 83.52% <100.00%> (-0.57%) ⬇️
...va/org/opensearch/common/util/CollectionUtils.java 78.87% <100.00%> (+0.80%) ⬆️
...org/opensearch/index/mapper/BinaryFieldMapper.java 80.00% <100.00%> (-0.25%) ⬇️

... and 466 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@@ -0,0 +1,65 @@
"search across indices with mixed long and double numeric types":
Copy link
Collaborator Author

@reta reta Mar 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gashutos so I found out why test is flipping, this is very interesting.

The test only works on single node clusters (or, to generalize it, when search results come from single node without serialization being involved). The reason for that is that SortedNumericSortFields are deserialized by Apache Lucene as SortFields and that is how they come to SearchPhaseController. As you may guessed already, the sort optimization does not trigger anymore because instanceof SortedNumericSortField is not met (those are just SortField instances).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is concerning @reta , The optimization will still get applied but while merging results, if it has different SortFiled.Type (INT & LONG from two different shards), it will fail the request -
Let me see what could be done here....

Copy link
Collaborator Author

@reta reta Mar 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will fail the request, which is essentially the "expected" behavior as per current implementation. So basically without the optimization - the test always fails, with optimization - passed on single node clusters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But current implementation works for Int & Long indices. Since we used to widen sortType to Long for Int field too during shard level sort. That will break with this, I wasnt knowing serilization will trigger issue here hence didnt test on multinode scenario....
tagging @nknize too..

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But current implementation works for Int & Long indices.

No, it does not (Lucene is typecasting), here is the prove:

  1> Caused by: java.lang.ClassCastException: class java.lang.Long cannot be cast to class java.lang.Integer (java.lang.Long and java.lang.Integer are in module java.base of loader 'bootstrap')
  1>    at java.base/java.lang.Integer.compareTo(Integer.java:71)                                                                                                                             
  1>    at org.apache.lucene.search.FieldComparator.compareValues(FieldComparator.java:115)   

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gashutos just to may be clarify, this is not the regression caused by optimization changes, it existed before and still exist in Elasticsearch 7.x release line.

Copy link
Contributor

@gashutos gashutos Apr 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reta I verified that this is regression. To reproduce this, do below steps.

  1. Ingest http_logs workload from OSB to multi node cluster (I created with 5 nodes)
  2. Create an index with exactly same field mapping except change size field as long from int
  3. Re-index any of the http_logs workload index to newly created index on 2nd step.
  4. Sort by asc/desc order on size field with logs-* as an index.
  5. Check this behaviour without code changes in Enable numeric sort optimization support for all numeric types #6424 (it works fine, while with Enable numeric sort optimization support for all numeric types #6424 it breaks).

I raised a PR to fix this behaviour, #6926 lets get that in before this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 test case have been updated

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
@github-actions
Copy link
Contributor

github-actions bot commented Apr 3, 2023

Gradle Check (Jenkins) Run Completed with:

@reta
Copy link
Collaborator Author

reta commented Apr 3, 2023

@nknize LGTM? :-)

@dblock dblock merged commit 55936ac into opensearch-project:main Apr 3, 2023
@dblock dblock added the backport 2.x Backport to 2.x branch label Apr 3, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Apr 3, 2023
…ty (#6901)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
(cherry picked from commit 55936ac)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
dblock pushed a commit that referenced this pull request Apr 4, 2023
…ty (#6901) (#6957)

(cherry picked from commit 55936ac)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
mitrofmep pushed a commit to mitrofmep/OpenSearch that referenced this pull request Apr 5, 2023
…ty (opensearch-project#6901)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Signed-off-by: Valentin Mitrofanov <mitrofmep@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants