[NEW] More refined memory management system #1792
I like this idea. Should we track dataset memory accurately then? Currently, we don't track it accurately; we just infer it as total memory minus overhead. In #852 there is a suggestion to track memory per slot. The total dataset memory is then the sum over all slots.
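Per-slot tracking as suggested in #852 could be sketched roughly as follows (purely illustrative Python; `SlotStats` and the update hook are hypothetical names, not Valkey internals):

```python
# Hypothetical sketch of per-slot dataset memory accounting.
# Slot count matches the cluster key space; field names are illustrative.

NUM_SLOTS = 16384  # cluster hash slot count


class SlotStats:
    def __init__(self):
        self.dataset_bytes = 0  # memory attributed to keys in this slot


slots = [SlotStats() for _ in range(NUM_SLOTS)]


def on_key_write(slot_id: int, delta_bytes: int) -> None:
    """Update per-slot accounting whenever a key grows or shrinks."""
    slots[slot_id].dataset_bytes += delta_bytes


def total_dataset_memory() -> int:
    """Total dataset memory is the sum over all slots."""
    return sum(s.dataset_bytes for s in slots)


on_key_write(100, 512)
on_key_write(200, 1024)
on_key_write(100, -256)
print(total_dataset_memory())  # 1280
```

The appeal of this scheme is that the dataset total becomes a maintained counter rather than something inferred by subtracting overhead from total memory.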
Nice idea @soloestoy. I have a bit of apprehension though. In terms of usability, won't it become a nightmare to determine the correct value for each of these? Depending on workload behavior and the different scenarios within the lifecycle of a given application, each would require a different value at different points in time. For example, sometimes users are not aware of the increasing size of values, and after a certain period of time the client output buffer starts to overflow under a read-heavy workload. Also, a sudden burst in write traffic could cause replication buffer overflow. Even though this would give us fine-grained control, the overhead of understanding each of these knobs and managing them well seems quite a difficult task for administrators.
@kyle-yh-kim Just to explicitly bring you into the conversation. We talked about reviving the conversation for Valkey 9.0. |
Thanks for the detailed write-up @soloestoy! It is a great observation that memory in our case serves two roles: it is both storage for the user data (the disk of other databases) and the resource to support user requests. Being able to express and manage the two roles explicitly IMO brings more clarity to memory management. I am directionally aligned. @hwware, what do you think about pausing #831 and tidying up our memory management first?
Agree with you. Let's change our direction to this issue. |
@hpatro thank you for raising these points. Yes, there is indeed a contradiction here: finer-grained memory management requires more configuration, which introduces learning costs for users. We need to strike a balance and implement changes in phases.
In fact, many aspects of this granular memory management are already underway. I believe this systematic approach is essential to evolve Valkey into a more robust and production-ready database.
Read through everything now, and I agree with having optional dedicated memory buffers for clients, replication, AOF, and the Lua cache. I think we should probably leave `maxmemory` for the dataset as is, and let the remaining components be additional buffers reserved out of it. Basically, everything that is not set aside for replication, clients, etc. is given to `maxmemory`.
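The budgeting arithmetic described here can be sketched as follows (illustrative Python; the reserve names and sizes are hypothetical, not actual Valkey configuration):

```python
# Illustrative arithmetic for the "reserve buffers out of maxmemory" idea:
# whatever is not set aside for replication, clients, AOF, etc. remains
# available to the dataset. All names and numbers are hypothetical.

MiB = 1024 * 1024

maxmemory = 4096 * MiB          # overall limit, as configured today
reserved = {
    "clients": 256 * MiB,       # optional dedicated client buffer space
    "replication": 512 * MiB,   # optional replication buffer space
    "aof": 128 * MiB,           # optional AOF buffer space
    "lua": 64 * MiB,            # optional Lua cache space
}

dataset_budget = maxmemory - sum(reserved.values())
print(dataset_budget // MiB)  # 3136
```

The nice property of this layout is backward compatibility: with no reserves configured, the dataset budget degenerates to plain `maxmemory`.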
The discussion at #831 regarding `maxmemory` flexibility has raised several considerations. In particular, the current `maxmemory` configuration and its implications require scrutiny, specifically whether memory-overlimit handling correctly distinguishes between data and non-data memory usage. The current mechanism of deleting user data when non-data memory consumption grows (e.g., system operational memory) appears unreasonable.

I discussed this with several members of @valkey-io/core-team and think that it is necessary to implement more refined memory management.
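To make the unfairness concrete, here is a minimal Python sketch (the field names and the eviction formula are simplified assumptions, not Valkey's actual accounting) showing how non-data growth can trip eviction even though the dataset itself is under the limit:

```python
# Hypothetical contrast between the current overlimit check and a
# data-aware one. Keys loosely mirror MEMORY STATS fields; all numbers
# are illustrative.

MiB = 1024 * 1024
maxmemory = 1024 * MiB

usage = {
    "dataset": 700 * MiB,         # actual user data
    "clients.normal": 400 * MiB,  # client I/O buffers (non-data)
    "clients.slaves": 100 * MiB,  # replica output buffers (excluded today)
    "aof.buffer": 50 * MiB,       # AOF buffer (excluded today)
}

# Current behavior: evict user data when total usage (minus the two
# excluded buffers) exceeds maxmemory, even if the dataset did not grow.
used_for_eviction = (
    sum(usage.values()) - usage["clients.slaves"] - usage["aof.buffer"]
)
evict_current = used_for_eviction > maxmemory

# Data-aware alternative: evict only when the dataset itself is over limit.
evict_data_aware = usage["dataset"] > maxmemory

print(evict_current, evict_data_aware)  # True False
```

Here the dataset sits at 700 MiB, well under the 1024 MiB limit, yet client buffer growth alone pushes the current check over the line and user data gets deleted.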
Unlike traditional databases that store data on disk and use memory mainly for caching and essential operational buffers (where memory shortage typically triggers swapping rather than data deletion), Valkey stores all data in memory. However, Valkey's memory holds not only user data but also various operational components, such as client I/O buffers. The current implementation forces data eviction when total memory exceeds `maxmemory`, even when the overuse stems from system operations rather than actual data growth, which is unfair. Moreover, under the `noeviction` policy, uncontrolled growth of system memory can cause total usage to far exceed the `maxmemory` limit.

Below is a sample memory statistics output via the `MEMORY STATS` command showing the current component breakdown (note that some memory usage remains unaccounted for):

Currently, only `clients.slaves` and `aof.buffer` are excluded from eviction calculations. Growth in other system memory still triggers data eviction, which is problematic.

Additional unaccounted system memory includes:
Memory composition diagram:
To establish fair and granular memory management, I propose implementing categorized memory limits with corresponding handling mechanisms:

- `maxmemory-dataset`: data eviction when exceeded
- `maxmemory-clients`: connection termination when client buffers overflow
- `maxmemory-replication-buffer`: write throttling or sync disconnection
- `maxmemory-aof-buffer`: write throttling or prioritized disk flushing
- `maxmemory-lua-caches`: script cache eviction

This hierarchical approach would enable differentiated control over various memory components while maintaining system stability and fairness during memory-pressure scenarios.
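If adopted, the knobs might be set in `valkey.conf` along these lines (a hypothetical fragment; the directive names follow the proposal above and the values are placeholders, not recommendations):

```
# Hypothetical valkey.conf fragment; names follow the proposed limits,
# values are illustrative only.
maxmemory-dataset 4gb
maxmemory-clients 512mb
maxmemory-replication-buffer 1gb
maxmemory-aof-buffer 256mb
maxmemory-lua-caches 64mb
```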