Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

org.apache.lucene.search.TestFloatVectorSimilarityQuery.testVectorsAboveSimilarity fails intermittently #12955

Closed
ChrisHegarty opened this issue Dec 18, 2023 · 3 comments

Comments

@ChrisHegarty
Copy link
Contributor

Description

org.apache.lucene.search.TestFloatVectorSimilarityQuery > testVectorsAboveSimilarity FAILED
java.lang.AssertionError: expected:<45> but was:<44>
at __randomizedtesting.SeedInfo.seed([1A1CDC0974AF361:984DEFE40C7C94CA]:0)
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:633)
at org.apache.lucene.search.BaseVectorSimilarityQueryTestCase.testVectorsAboveSimilarity(BaseVectorSimilarityQueryTestCase.java:406)
at org.apache.lucene.search.TestFloatVectorSimilarityQuery.testVectorsAboveSimilarity(TestFloatVectorSimilarityQuery.java:26)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
at java.base/java.lang.Thread.run(Thread.java:833)

Gradle command to reproduce

gradlew :lucene:core:test --tests "org.apache.lucene.search.TestFloatVectorSimilarityQuery.testVectorsAboveSimilarity" -Ptests.jvms=12 -Ptests.jvmargs= -Ptests.seed=1A1CDC0974AF361

@kaivalnp
Copy link
Contributor

This test checks whether the FloatVectorSimilarityQuery can surface all vectors above a resultSimilarity

The test fails because it expects 45 results to be found, but actually finds 44. This should ideally not be possible, because we set a traversalSimilarity of Float.NEGATIVE_INFINITY -- so all nodes in the HNSW graph with a score above -Infinity will be traversed (which is all nodes in the HNSW graph)

I suspect this has something to do with a disconnected graph, where one of the nodes is a valid result but not reachable. To demonstrate this, I wrote a snippet that calculates the nodes reachable from the entry point, the doc that was missed from results, and the doc and score of unreachable nodes. Here is the result (using the repro command above):

  1> Similarity = 0.097851
  1> Total Nodes = 131, Reachable Nodes = 129
  1> 
  1> Unreachable Nodes:
  1> Doc = 76, Score = 0.103340
  1> Doc = 117, Score = 0.083347
  1> 
  1> Missed Docs = {76=0.103340186}

Looks like the missed doc is unreachable..

I suspect a similar case is possible for KNN search as well (for example this test case) -- where we index random vectors and search for a random topK, not sure if we have seen such failures there?

I also see an open issue for graph disconnectedness: #12627

As for the fix here, we can try something like a lower number of dimensions or lower number of vectors, but the issue will only get less common until a permanent solution is found?

@benwtrent
Copy link
Member

@kaivalnp this does indeed seem related to disconnectedness. That is a larger effort. I would suggest updating the graph parameters for this particular test to reduce the chance of the failure.

@kaivalnp
Copy link
Contributor

kaivalnp commented Jan 2, 2024

Makes sense @benwtrent.. Opened #12988 to fix this

benwtrent pushed a commit that referenced this issue Jan 2, 2024
…#12988)

### Description

Identified in #12955, where `TestFloatVectorSimilarityQuery.testVectorsAboveSimilarity` fails because of a disconnected HNSW graph

This is a bigger issue, but we can reduce intermittent failures by keeping the number of docs and dimensions same as [`BaseKnnVectorQueryTestCase.testRandom`](https://github.com/apache/lucene/blob/dc9f154aa574e8cd0e60070a1814c1d221fbec5d/lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java#L470) (similar test for KNN with random vectors)

### Command to reproduce

```
./gradlew :lucene:core:test --tests "org.apache.lucene.search.TestFloatVectorSimilarityQuery.testVectorsAboveSimilarity" -Ptests.jvms=12 -Ptests.jvmargs= -Ptests.seed=1A1CDC0974AF361
```
slow-J pushed a commit to slow-J/lucene that referenced this issue Jan 16, 2024
…apache#12988)

### Description

Identified in apache#12955, where `TestFloatVectorSimilarityQuery.testVectorsAboveSimilarity` fails because of a disconnected HNSW graph

This is a bigger issue, but we can reduce intermittent failures by keeping the number of docs and dimensions same as [`BaseKnnVectorQueryTestCase.testRandom`](https://github.com/apache/lucene/blob/dc9f154aa574e8cd0e60070a1814c1d221fbec5d/lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java#L470) (similar test for KNN with random vectors)

### Command to reproduce

```
./gradlew :lucene:core:test --tests "org.apache.lucene.search.TestFloatVectorSimilarityQuery.testVectorsAboveSimilarity" -Ptests.jvms=12 -Ptests.jvmargs= -Ptests.seed=1A1CDC0974AF361
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants