Very large scroll search (i.e. reindex) can gradually slow down #65780
Labels
>bug
:Distributed Indexing/Reindex
Issues relating to reindex that are not caused by issues further down
:Search/Search
Search-related issues that do not fall into other categories
Team:Distributed (Obsolete)
Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
Team:Search
Meta label for search team
Version 7.7 (via this PR) added a better ability to cancel a search request. However, it does so by adding a method that cancels the task to a collection on the context searcher. That collection is checked very frequently, and its size can grow without bound. The memory footprint is not the issue; the problem is the number of iterations over the collection during very long-running scroll searches, such as those used by re-index. In testing, the issue started to show at around 50m documents, and search latency kept increasing as time went on.
Below is a test run of 180m documents being re-indexed that shows the increase in search latency and the decrease in search rate.
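To make the growth pattern concrete, here is a minimal sketch of it. It is not the Elasticsearch code: `CancellationGrowthSketch`, `addCancellationCheck`, `checkCancelled`, and `simulate` are hypothetical names, and the assumption that one check is registered per scroll batch is only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a long-lived searcher that accumulates
// cancellation checks. Illustration only, not the Elasticsearch class.
public class CancellationGrowthSketch {
    private final List<Runnable> cancellationChecks = new ArrayList<>();
    private long invocations = 0;

    // Assumption: each scroll batch registers one more check and never removes it.
    void addCancellationCheck() {
        cancellationChecks.add(() -> invocations++);
    }

    // Called very frequently while collecting documents: runs every registered check.
    void checkCancelled() {
        for (Runnable check : cancellationChecks) {
            check.run();
        }
    }

    // Simulates `batches` scroll batches on the same searcher, calling
    // checkCancelled() `callsPerBatch` times per batch.
    // Returns the total number of individual check invocations.
    public static long simulate(int batches, int callsPerBatch) {
        CancellationGrowthSketch searcher = new CancellationGrowthSketch();
        for (int b = 0; b < batches; b++) {
            searcher.addCancellationCheck();
            for (int i = 0; i < callsPerBatch; i++) {
                searcher.checkCancelled();
            }
        }
        return searcher.invocations;
    }

    public static void main(String[] args) {
        System.out.println(simulate(10, 100));   // 5500
        System.out.println(simulate(100, 100));  // 505000
    }
}
```

In this toy model, ten times as many batches costs roughly ninety times as many check invocations, which matches the shape of the slowdown described above: the work per batch keeps rising even though each batch returns the same number of documents.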
(7.9.1)
![image](https://user-images.githubusercontent.com/976291/100934058-cf801480-34b3-11eb-9056-93d93d47382a.png)
Hot threads will look similar to:
This issue is fixed as of 7.10.0 by #61062 and #46523, which now re-create the searcher on each phase, even for scroll requests. This means the collection no longer grows unbounded. The same test as above was run on 7.10.0 and did not show any signs of performance degradation.
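A rough sketch of why re-creating the searcher per phase bounds the collection is shown here. The names `PerPhaseSearcherSketch`, `Searcher`, and `maxChecks` are hypothetical and do not correspond to the actual changes in #61062 or #46523; the sketch assumes one cancellation check is registered per phase.

```java
import java.util.ArrayList;
import java.util.List;

public class PerPhaseSearcherSketch {
    // Hypothetical searcher that holds registered cancellation checks.
    static final class Searcher {
        final List<Runnable> cancellationChecks = new ArrayList<>();
    }

    // Runs `phases` phases, registering one check per phase, and returns the
    // largest number of checks any single searcher held at one time.
    public static int maxChecks(int phases, boolean recreatePerPhase) {
        Searcher searcher = new Searcher();
        int max = 0;
        for (int p = 0; p < phases; p++) {
            if (recreatePerPhase) {
                // A fresh searcher starts with an empty collection.
                searcher = new Searcher();
            }
            searcher.cancellationChecks.add(() -> { });
            max = Math.max(max, searcher.cancellationChecks.size());
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(maxChecks(1000, false)); // 1000: one long-lived searcher
        System.out.println(maxChecks(1000, true));  // 1: searcher re-created each phase
    }
}
```

With a long-lived searcher the collection size tracks the number of phases; with a searcher re-created per phase it stays constant, so the cost of each cancellation check no longer depends on how long the scroll has been running.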
For 7.7 through 7.9.x there is an easy workaround for this issue:
This prevents that collection from being used at all (also tested to fix the issue).