server: memory pressure notification hooks #64965
Comments
The Go folks have always wanted people to experiment with the prototype linked in golang/go#29696 (https://go-review.googlesource.com/c/go/+/46751). Maybe this exploration would call for giving that thing a shot.
@ajwerner thanks for linking this - this definitely seems like something worth experimenting with, especially considering runtime experiments are happening elsewhere, such as @knz's task group resource accounting. In a hypothetical where we actually used the |
The idea is sound.
Might even be worth adding an explicit call to |
this is obs infra, not server |
Just a quick update on this - through experimentation I was unsuccessful in subscribing to the I've not given up entirely on this as it would be an exceptional heuristic for us to gain access to from inside each crdb node. Perhaps we can explore the possibility of subscribing to such notifications in the orchestration layer and then deliver notifications to the relevant crdb node over the network. More experimentation needs to be done to determine if this is a valid approach.
We have marked this issue as stale because it has been inactive for |
When CockroachDB runs out of memory, it usually manifests as a `SIGKILL` from the OOM killer, giving the program no time to respond with any crash dumps or other emergency actions.

It appears that cgroups can be configured to send notifications before this happens, at a configurable percent of used memory. See: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#memory-interface-files
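As a rough starting point, here is a minimal sketch of reading the cgroup v2 memory interface files from inside the process. It polls `memory.current` against `memory.max` rather than subscribing to events (the kernel docs say changes to `memory.events` generate a file-modified event, which would be the notification route). The mount point `/sys/fs/cgroup` and the helper names `parseLimit` and `usageFraction` are assumptions for illustration, not anything that exists in CockroachDB.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// cgroupRoot is an assumption: the usual cgroup v2 mount point as seen from
// inside a container. Real deployments may differ.
const cgroupRoot = "/sys/fs/cgroup"

// parseLimit parses the contents of memory.max (or memory.high). The kernel
// writes the literal string "max" when no limit is configured, in which case
// ok is false.
func parseLimit(s string) (limit uint64, ok bool) {
	s = strings.TrimSpace(s)
	if s == "max" {
		return 0, false
	}
	v, err := strconv.ParseUint(s, 10, 64)
	if err != nil {
		return 0, false
	}
	return v, true
}

// usageFraction returns current/limit, or -1 when the limit is unset or
// either value cannot be parsed.
func usageFraction(current, limit string) float64 {
	cur, err := strconv.ParseUint(strings.TrimSpace(current), 10, 64)
	lim, ok := parseLimit(limit)
	if err != nil || !ok || lim == 0 {
		return -1
	}
	return float64(cur) / float64(lim)
}

func main() {
	cur, err1 := os.ReadFile(cgroupRoot + "/memory.current")
	lim, err2 := os.ReadFile(cgroupRoot + "/memory.max")
	if err1 != nil || err2 != nil {
		fmt.Println("cgroup v2 memory files not available")
		return
	}
	fmt.Printf("memory usage fraction: %.2f\n", usageFraction(string(cur), string(lim)))
}
```

A result of -1 means "no usable limit", so callers would need a fallback such as a configured maximum.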
It would be useful to let CockroachDB detect a close-to-OOM situation and perform custom logic, such as performing a crashdump, dumping active goroutines and a memory profile, or even dumping unnecessary caches.
Even without cgroups, perhaps we could poll Go's memstats and compare against a configured maximum, and use that to trigger any hooks.
Jira issue: CRDB-7366