Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: MinHash token filter parameters not working #15233

Merged
merged 3 commits into from
Aug 16, 2024

Conversation

mridehalgh
Copy link
Contributor

Description

The MinHash token filter has four configurable parameters: bucket_count, hash_count, hash_set_size, and with_rotation. Currently both bucket_count and hash_set_size have no effect, because there appears to be a bug in the code that interfaces with the underlying Lucene class.

Notice the camel-case bucketCount and hashSetSize in the hasValue lines. These should be bucket_count and hash_set_size, since those are the parameter names that would appear (and be fetched on the following lines) from the settings object.

Related Issues

Resolves #15183

Check List

  • Functionality includes testing.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

❌ Gradle check result for 83afab7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 59dc3d8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@dblock
Copy link
Member

dblock commented Aug 13, 2024

Thank you! Would you be able to check that this (and maybe other) token filter(s) are present in https://github.com/opensearch-project/opensearch-api-specification and add a test for this one in that repo?

Do update CHANGELOG in this PR, this is a bug fix.

Copy link
Contributor

✅ Gradle check result for 423b829: SUCCESS

Copy link
Member

@dblock dblock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a CHANGELOG update, thx.

Copy link

codecov bot commented Aug 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.89%. Comparing base (cbe7921) to head (3905dc3).
Report is 2 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #15233      +/-   ##
============================================
+ Coverage     71.82%   71.89%   +0.06%     
- Complexity    63046    63103      +57     
============================================
  Files          5207     5207              
  Lines        295581   295581              
  Branches      42690    42690              
============================================
+ Hits         212295   212495     +200     
+ Misses        65875    65689     -186     
+ Partials      17411    17397      -14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mridehalgh
Copy link
Contributor Author

Needs a CHANGELOG update, thx.

@dblock - thanks for the quick feedback. Added changelog entry.

@mridehalgh mridehalgh requested a review from dblock August 14, 2024 10:48
Copy link
Contributor

❕ Gradle check result for e6a3822: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@dblock dblock added the backport 2.x Backport to 2.x branch label Aug 14, 2024
Copy link
Contributor

❌ Gradle check result for 3df3b0b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 3df3b0b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

✅ Gradle check result for 3df3b0b: SUCCESS

@dblock
Copy link
Member

dblock commented Aug 15, 2024

Thank you! Would you be able to check that this (and maybe other) token filter(s) are present in https://github.com/opensearch-project/opensearch-api-specification and add a test for this one in that repo?

@mridehalgh while this is getting rebased, care to take a look at ^ ?

Copy link
Contributor

❌ Gradle check result for 26993dc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 26993dc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@dblock
Copy link
Member

dblock commented Aug 16, 2024

❌ Gradle check result for 26993dc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

I don't understand why this failed. @mridehalgh care to rebase to get the latest?

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>
Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>
Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>
@mridehalgh
Copy link
Contributor Author

❌ Gradle check result for 26993dc: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

I don't understand why this failed. @mridehalgh care to rebase to get the latest?

@dblock neither do I. Rebasing :)

Copy link
Contributor

✅ Gradle check result for 3905dc3: SUCCESS

@dblock dblock merged commit 9661e8d into opensearch-project:main Aug 16, 2024
34 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-15233-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9661e8db62e8897cb53b0bc1860ead43b28dfa21
# Push it to GitHub
git push --set-upstream origin backport/backport-15233-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-15233-to-2.x.

@dblock
Copy link
Member

dblock commented Aug 16, 2024

@mridehalgh Looks like it will need a manual backport to 2.x please?

mridehalgh added a commit to mridehalgh/OpenSearch that referenced this pull request Aug 19, 2024
…15233)

* fix: minhash configuration

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>

* style: linting

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>

* chore: update CHANGELOG.md

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>

---------

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>
(cherry picked from commit 9661e8d)
wdongyu pushed a commit to wdongyu/OpenSearch that referenced this pull request Aug 22, 2024
…15233)

* fix: minhash configuration

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>

* style: linting

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>

* chore: update CHANGELOG.md

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>

---------

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>
akolarkunnu pushed a commit to akolarkunnu/OpenSearch that referenced this pull request Sep 10, 2024
…15233)

* fix: minhash configuration

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>

* style: linting

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>

* chore: update CHANGELOG.md

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>

---------

Signed-off-by: Matt Ridehalgh <mrideh@amazon.co.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport-failed bug Something isn't working Indexing Indexing, Bulk Indexing and anything related to indexing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] MinHash token filter parameters not working
2 participants