Fixing SortField comparison to use equals instead of reference equality #6901

reta · 2023-03-30T17:12:31Z

Description

Fixing SortField comparison to use equals instead of reference equality

Issues Resolved

N/A

Check List

New functionality includes testing.
- All tests pass
New functionality has been documented.
- New functionality has javadoc added
Commits are signed per the DCO using --signoff
Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions · 2023-03-30T17:47:52Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/13114/
CommitID: da8fab0
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
Is the failure a flaky test unrelated to your change?

reta · 2023-03-30T17:52:18Z

server/src/main/java/org/opensearch/action/search/SearchPhaseController.java

@@ -636,7 +636,7 @@ private static Sort createSort(TopFieldDocs[] topFieldDocs) {
     */
    private static boolean isSortWideningRequired(TopFieldDocs[] topFieldDocs, int sortFieldindex) {
        for (int i = 0; i < topFieldDocs.length - 1; i++) {
-            if (topFieldDocs[i].fields[sortFieldindex] != topFieldDocs[i + 1].fields[sortFieldindex]) {
+            if (!topFieldDocs[i].fields[sortFieldindex].equals(topFieldDocs[i + 1].fields[sortFieldindex])) {


@gashutos mind taking a look? I found this one out while adding tests for multi index search queries

Oh, I remember testing this with int and long, and it worked fine.
The unit tests are passing too. Is this really the issue?

Not exactly this one, but assume that index1 and index2 both have the same counter field of type long. In this case we do unnecessary widening even though the sort fields are the same. I added the test cases just to capture the ability to sort across different types.
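
For readers following the diff, here is a minimal sketch of why the reference check over-reports. `SortKey` is a hypothetical stand-in for Lucene's `SortField` (it is not the real class), and the two helper methods only mirror the shape of `isSortWideningRequired` before and after the change. The assumption is that each shard hands back its own `SortField` instance, so two equal fields are never the same object.

```java
import java.util.Objects;

public class SortKeyEquality {
    // Hypothetical stand-in for a sort field: a field name plus a type tag.
    static final class SortKey {
        final String field;
        final String type;

        SortKey(String field, String type) {
            this.field = field;
            this.type = type;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SortKey)) return false;
            SortKey other = (SortKey) o;
            return field.equals(other.field) && type.equals(other.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(field, type);
        }
    }

    // Shape of the old check: neighbours are compared by reference.
    static boolean wideningRequiredByReference(SortKey[] keys) {
        for (int i = 0; i < keys.length - 1; i++) {
            if (keys[i] != keys[i + 1]) return true;
        }
        return false;
    }

    // Shape of the new check: neighbours are compared by value.
    static boolean wideningRequiredByEquals(SortKey[] keys) {
        for (int i = 0; i < keys.length - 1; i++) {
            if (!keys[i].equals(keys[i + 1])) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Two shards report the same field and type, but as distinct objects.
        SortKey[] sameType = { new SortKey("counter", "LONG"), new SortKey("counter", "LONG") };
        SortKey[] mixed = { new SortKey("counter", "LONG"), new SortKey("counter", "DOUBLE") };

        System.out.println(wideningRequiredByReference(sameType)); // true (unnecessary widening)
        System.out.println(wideningRequiredByEquals(sameType));    // false
        System.out.println(wideningRequiredByEquals(mixed));       // true
    }
}
```

With value equality, widening is only reported when the neighbouring fields actually differ, which matches the "no visible effect, just extra work" description of the old behaviour.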

Oh, got it. So correctness-wise it was still giving the right results?

Yes, no visible effect

Yeah, that I got.
But ./gradlew check has an integ test failure for a comparator type mismatch. That should work, though.

Oh, got it. You mean we don't need this test 260_sort_mixed.yml? I haven't found any rest spec tests for mixed indices search, so I thought it would be good to have one.

Yeah, it's good to have. I was just curious why it is failing :)

you mean we don't need this test 260_sort_mixed.yml

No no, we need it, thank you for adding it...

@reta I think it is the mixed cluster scenario that is failing in ./gradlew check.
Our old versions don't support mixing results between LONG and DOUBLE. We specify sort type Long for the Long NumericType here, and sort type Double for the Double NumericType. So shards of the index with the Long mapping return results as Long, while shards of the other index return Double.

The new changes we did as part of widening sort type handles it correctly but that will be in only latest version.
Maybe we can change the mapping types from Long and Double to Int and Long, or to Float and Double, so that they work in the mixed cluster scenario.

The new changes we did as part of widening sort type handles it correctly but that will be in only latest version.

Correct, so I was adding the version check here:

  - skip:
      version: " - 2.6.99"
      reason: relies on numeric sort optimization that landed in 2.7.0 only

So the test should be skipped for anything below 2.7.0 (we backported this change to 2.x, so it should work for 3.0.0 and 2.7.0), but for some reason the test is run on versions it is not supposed to run on. I am curious to find out why.

Running the test manually on the 2.x branch passed for me; looking into it ...

gashutos

Thanks for fixing this. @reta

github-actions · 2023-03-30T18:23:41Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/13120/
CommitID: 43eb791
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
Is the failure a flaky test unrelated to your change?

github-actions · 2023-03-30T19:23:48Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/13123/
CommitID: 43eb791
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
Is the failure a flaky test unrelated to your change?

github-actions · 2023-03-30T20:41:06Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/13127/
CommitID: 3ae2b63
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
Is the failure a flaky test unrelated to your change?

github-actions · 2023-03-30T21:34:44Z

Gradle Check (Jenkins) Run Completed with:

RESULT: SUCCESS ✅
URL: https://build.ci.opensearch.org/job/gradle-check/13134/
CommitID: 02fa61b

codecov-commenter · 2023-03-30T21:41:32Z

Codecov Report

Merging #6901 (02fa61b) into main (cdab7f2) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #6901      +/-   ##
============================================
- Coverage     70.72%   70.72%   -0.01%     
+ Complexity    59259    59246      -13     
============================================
  Files          4812     4812              
  Lines        283764   283750      -14     
  Branches      40918    40915       -3     
============================================
- Hits         200700   200689      -11     
+ Misses        66597    66523      -74     
- Partials      16467    16538      +71

Impacted Files	Coverage Δ
...pensearch/action/search/SearchPhaseController.java	`83.52% <100.00%> (-0.57%)`	⬇️
...va/org/opensearch/common/util/CollectionUtils.java	`78.87% <100.00%> (+0.80%)`	⬆️
...org/opensearch/index/mapper/BinaryFieldMapper.java	`80.00% <100.00%> (-0.25%)`	⬇️

... and 466 files with indirect coverage changes

reta · 2023-03-31T16:33:02Z

rest-api-spec/src/main/resources/rest-api-spec/test/search/260_sort_mixed.yml

@@ -0,0 +1,65 @@
+"search across indices with mixed long and double numeric types":


@gashutos so I found out why the test is flipping; this is very interesting.

The test only works on single-node clusters (or, to generalize, when search results come from a single node without serialization being involved). The reason is that SortedNumericSortFields are deserialized by Apache Lucene as SortFields, and that is how they arrive at SearchPhaseController. As you may have guessed already, the sort optimization no longer triggers because the instanceof SortedNumericSortField check is not met (those are just SortField instances).
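
A minimal sketch of the effect described above, using hypothetical stand-in classes (`BaseSort` and `NumericSort` are not Lucene types, and the wire format is invented for illustration). It assumes the round trip writes only base-class state and reconstructs the base class, which is the behaviour reported in this comment.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SubtypeLostOnWire {
    // Hypothetical base type, playing the role of a plain sort field.
    static class BaseSort {
        final String field;

        BaseSort(String field) {
            this.field = field;
        }
    }

    // Hypothetical subtype, playing the role of the numeric sort field.
    static class NumericSort extends BaseSort {
        final String numericType;

        NumericSort(String field, String numericType) {
            super(field);
            this.numericType = numericType;
        }
    }

    // Invented wire format: writes base-class state only and reads back a BaseSort.
    static BaseSort roundTrip(BaseSort sort) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeUTF(sort.field);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return new BaseSort(in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Shape of the guard: the optimization only applies to the subtype.
    static boolean optimizationApplies(BaseSort sort) {
        return sort instanceof NumericSort;
    }

    public static void main(String[] args) {
        BaseSort local = new NumericSort("counter", "LONG");
        BaseSort remote = roundTrip(local);

        System.out.println(optimizationApplies(local));  // true  (same node, no serialization)
        System.out.println(optimizationApplies(remote)); // false (subtype lost on the wire)
        System.out.println(remote.field);                // counter
    }
}
```

The field name survives the round trip, but the subtype does not, so any `instanceof` guard behaves differently depending on whether the results crossed a node boundary.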

This is concerning, @reta. The optimization will still get applied, but while merging results, if the shards have different SortField.Type values (INT and LONG from two different shards), it will fail the request.
Let me see what could be done here.

Yes, it will fail the request, which is essentially the "expected" behavior as per the current implementation. So basically, without the optimization the test always fails; with the optimization it passes on single-node clusters.

But current implementation works for Int & Long indices. We used to widen the sort type to Long for Int fields too during the shard-level sort, and that will break with this. I didn't know serialization would trigger an issue here, so I didn't test the multi-node scenario.
Tagging @nknize too.

But current implementation works for Int & Long indices.

No, it does not (Lucene is typecasting); here is the proof:

1> Caused by: java.lang.ClassCastException: class java.lang.Long cannot be cast to class java.lang.Integer (java.lang.Long and java.lang.Integer are in module java.base of loader 'bootstrap')
1>     at java.base/java.lang.Integer.compareTo(Integer.java:71)
1>     at org.apache.lucene.search.FieldComparator.compareValues(FieldComparator.java:115)

@gashutos just to clarify, this is not a regression caused by the optimization changes; it existed before and still exists in the Elasticsearch 7.x release line.
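
The ClassCastException in the trace above can be reproduced in plain Java. The sketch below is an assumption about the general shape of a generic compare helper (an unchecked cast to `Comparable`), not a copy of Lucene's `FieldComparator`; `compareValues`, `failsWithClassCast`, and `widenToLong` are illustrative names.

```java
public class MixedNumericCompare {
    // Generic compare helper: the unchecked cast means the element types
    // are only verified inside compareTo at runtime.
    @SuppressWarnings("unchecked")
    static <T> int compareValues(T first, T second) {
        if (first == null) {
            return second == null ? 0 : -1;
        }
        if (second == null) {
            return 1;
        }
        return ((Comparable<T>) first).compareTo(second);
    }

    // Returns true when comparing the two values throws ClassCastException.
    static boolean failsWithClassCast(Object a, Object b) {
        try {
            compareValues(a, b);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Widening both sides to Long before comparing avoids the mismatch.
    static Long widenToLong(Number value) {
        return value.longValue();
    }

    public static void main(String[] args) {
        Object fromIntShard = Integer.valueOf(1);
        Object fromLongShard = Long.valueOf(2L);

        // Integer.compareTo receives a Long and the cast inside it fails.
        System.out.println(failsWithClassCast(fromIntShard, fromLongShard)); // true

        // After widening, both values are Long and compare normally.
        int result = compareValues(widenToLong(Integer.valueOf(1)), widenToLong(Long.valueOf(2L)));
        System.out.println(result < 0); // true
    }
}
```

This is why mixed INT and LONG sort values from different shards need a common type before the merge step compares them.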

@reta I verified that this is a regression. To reproduce it, follow the steps below.

1. Ingest the http_logs workload from OSB into a multi-node cluster (I created one with 5 nodes).

2. Create an index with exactly the same field mapping, except change the size field from int to long.

3. Re-index any of the http_logs workload indices into the index created in step 2.

4. Sort in asc/desc order on the size field with logs-* as the index.

5. Check this behaviour without the code changes in Enable numeric sort optimization support for all numeric types #6424 (it works fine, while with #6424 it breaks).

I raised a PR to fix this behaviour, #6926; let's get that in before this PR.

👍 the test case has been updated

github-actions · 2023-03-31T18:19:35Z

Gradle Check (Jenkins) Run Completed with:

RESULT: SUCCESS ✅
URL: https://build.ci.opensearch.org/job/gradle-check/13161/
CommitID: e838b95

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

github-actions · 2023-04-03T13:04:25Z

Gradle Check (Jenkins) Run Completed with:

RESULT: SUCCESS ✅
URL: https://build.ci.opensearch.org/job/gradle-check/13261/
CommitID: acaef08

reta · 2023-04-03T13:06:27Z

@nknize LGTM? :-)

…ty (#6901) Signed-off-by: Andriy Redko <andriy.redko@aiven.io> (cherry picked from commit 55936ac) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

…ty (#6901) (#6957) (cherry picked from commit 55936ac) Signed-off-by: Andriy Redko <andriy.redko@aiven.io> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

…ty (opensearch-project#6901) Signed-off-by: Andriy Redko <andriy.redko@aiven.io> Signed-off-by: Valentin Mitrofanov <mitrofmep@gmail.com>

reta added the skip-changelog label Mar 30, 2023

github-actions bot added the distinguished-contributor label Mar 30, 2023

reta force-pushed the fix.sort.widening branch from da8fab0 to 43eb791 Compare March 30, 2023 17:50

reta marked this pull request as ready for review March 30, 2023 17:50

reta requested review from anasalkouz, andrross, Bukhtawar, CEHENKLE, dblock, gbbafna, setiah, kartg, kotwanikunal, mch2, nknize, owaiskazi19, Rishikesh1159, ryanbogan, saratvemulapalli, shwetathareja, dreamer-89, tlfeng, VachaShah and xuezhou25 as code owners March 30, 2023 17:50

reta commented Mar 30, 2023

gashutos approved these changes Mar 30, 2023

reta force-pushed the fix.sort.widening branch from 43eb791 to 3ae2b63 Compare March 30, 2023 20:08

reta mentioned this pull request Mar 30, 2023

Introduce new 'unsigned_long' numeric field type support opensearch-project/documentation-website#3585

Merged

1 task

reta force-pushed the fix.sort.widening branch from 3ae2b63 to 02fa61b Compare March 30, 2023 21:03

gashutos approved these changes Mar 30, 2023

reta commented Mar 31, 2023

reta force-pushed the fix.sort.widening branch from 02fa61b to e838b95 Compare March 31, 2023 17:49

gashutos mentioned this pull request Apr 1, 2023

Fix sort in case of different numeric field types between indices #6926

Merged

6 tasks

Fixing SortField comparison to use equals instead of reference equality

acaef08

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

reta force-pushed the fix.sort.widening branch from e838b95 to acaef08 Compare April 3, 2023 12:35

dblock approved these changes Apr 3, 2023

dblock merged commit 55936ac into opensearch-project:main Apr 3, 2023

dblock added the backport 2.x Backport to 2.x branch label Apr 3, 2023

opensearch-trigger-bot bot mentioned this pull request Apr 3, 2023

[Backport 2.x] Fixing SortField comparison to use equals instead of reference equality #6957

Merged
