Optimize Prefixes and Merges #5124
/bench workloads/hackernews-add-new-documents.json workloads/hackernews-modify-*
/bench workloads/movies.json workloads/hackernews.json
☀️ Benchmark invocation completed, please find the results for your workloads below:
bors merge
bors merge
Already running a review
I want to change the name of the
bors cancel
Canceled.
bors merge
bors merge
Already running a review
5125: Change the default max memory usage to 5% of the total memory r=ManyTheFish a=Kerollmops

After thorough testing, we found that giving 5% of the total available memory to allocate resident memory (caches and channels) is the best approach. The main reason is that the new indexer is highly memory-map oriented, with LMDB, and reads the database while performing the indexation. So, by allowing the maximum amount of memory available to LMDB and the OS, it will perform the key-value store reads and all other indexation operations faster by keeping more pages hot in the cache.

In #5124, we also sorted the entries to merge to improve the read speed of LMDB. This is common in database management systems: reading stuff on the disk is much faster when done in lexicographic order (the default sorted order of key values). The entries have a great chance of already being in the OS memory cache, as they were loaded in a previous read, and reading stuff on the disk is very slow compared to reading memory.

Co-authored-by: Kerollmops <clement@meilisearch.com>
In this PR, we plan to optimize LMDB reads by reading the entries in lexicographic order, which makes better use of the OS memory-mapping cache.
Benchmarks on 1cpu 2gb gpo3 (5k IOps)
Before on the tag meilisearch-v1.12.0-rc.3.
After sorting the whole `HashMap`s in a `Vec` on this branch.
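The idea of sorting a `HashMap` into a `Vec` before touching the store can be sketched as follows. This is a minimal illustration, not the code from this branch: the function names `into_sorted_entries` and `merge_sorted` are hypothetical, and a `BTreeMap` stands in for an LMDB database so the example needs no external crate. It relies on the fact that `Vec<u8>` keys compare bytewise, which matches LMDB's default lexicographic key order.

```rust
use std::collections::{BTreeMap, HashMap};

/// Drains a `HashMap` into a `Vec` sorted by key, so that later lookups
/// visit the key-value store in lexicographic order.
/// (Hypothetical helper, not the function used in the PR.)
fn into_sorted_entries(map: HashMap<Vec<u8>, u64>) -> Vec<(Vec<u8>, u64)> {
    let mut entries: Vec<(Vec<u8>, u64)> = map.into_iter().collect();
    // Keys are unique in a HashMap, so an unstable sort is sufficient.
    entries.sort_unstable_by(|(a, _), (b, _)| a.cmp(b));
    entries
}

/// Merges the new counts into the store, reading existing values in key order.
/// `store` is an assumed stand-in for an LMDB database: with a real
/// memory-mapped store, ascending keys tend to touch neighbouring pages.
fn merge_sorted(store: &mut BTreeMap<Vec<u8>, u64>, entries: Vec<(Vec<u8>, u64)>) {
    for (key, delta) in entries {
        *store.entry(key).or_insert(0) += delta;
    }
}
```

Iterating the `HashMap` directly would visit keys in an arbitrary order, so successive reads could land on unrelated pages of the memory map. Sorting first costs one `O(n log n)` pass in memory, which is usually cheap compared to page faults on disk.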