-
Notifications
You must be signed in to change notification settings - Fork 411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OAK-10617: oak-search-elastic fix deadlock with includePathRestrictions=false and multiple filtered results #1276
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor comment, unrelated to the changes in this PR.
The queue with the search results is unbounded. In general I find unbounded queues to be unsafe, as they can potentially lead to OOME. If the consumer are very slow, the producer will keep calling scan()
to get more results and the queue will keep increasing. It would be safer to have a bounded queue, like an ArrayBlockingQueue
, forcing the producer to block when the queue is full.
...g/apache/jackrabbit/oak/plugins/index/elastic/query/async/ElasticResultRowAsyncIterator.java
Outdated
Show resolved
Hide resolved
@nfsantos the LinkedBlockingQueue is now bounded to the maximum number of results we can read. I prefer not to change it with an |
...g/apache/jackrabbit/oak/plugins/index/elastic/query/async/ElasticResultRowAsyncIterator.java
Outdated
Show resolved
Hide resolved
…process at least one element
...g/apache/jackrabbit/oak/plugins/index/elastic/query/async/ElasticResultRowAsyncIterator.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change atm might be considered a breaking change imo.
If we know for sure that this interface is not implemented outside oak, we should probably mark it with @ProviderType.
…ns=false and multiple filtered results (#1276) * OAK-10617: oak-search-elastic fix deadlock with includePathRestrictions=false and multiple filtered results * OAK-10617: always apply path restrictions in ES * OAK-10617: remove use of guava from modified classes * OAK-10617: updated docs * OAK-10617: simplify logic to check if all listeners have process at least one element * OAK-10617: async iterator queue is now bounded to the read limit * OAK-10617: small improvement in logic to check if all listeners have process at least one element * OAK-10617: minor improvement * OAK-10617: minor improvement: use a BitSet to keep track of search hit listeners * OAK-10617: annotate ElasticResponseListener with @ProviderType
This pull request addresses a potential deadlock issue in Elastic that may arise when an index definition has
includePathRestrictions=false
and a query with a large number of results is subjected to filtering.The occurrence of this problem is attributed to the utilization of two threads during query execution. The first thread executes an asynchronous query in Elastic, waiting on a blocking queue for results, while the second thread reads the response and appends the results to the queue. This design facilitates the concurrent processing of results across multiple pages, allowing the initiation of subsequent result processing while the first thread handles the previous page.
However, when includePathRestrictions is set to false and a response exclusively contains filtered paths, the queue remains unpopulated, resulting in a process deadlock.
While path restrictions are always applicable in Elastic (ancestors are always persisted), certain scenarios (e.g., nodes requiring path transformations) may lead to result filtering. The proposed solution addresses this issue by continuously scanning results in situations where listeners are not able to process results within a page.