libmem: implement policy-agnostic memory allocation/accounting. #332
Starting review. I have not yet followed the whole path to see whether PreserveContainer requests can return zones different from the originals. But if they can, consider this case: a single 32 GB zone is the only zone allowed for two preserved-memory containers, each requesting 32 GB of RAM. I'd still assume that whoever set up this preservation knows what they are doing. If they prefer an OOM kill of their containers over letting them access memory outside the zone, then so be it; we won't touch pinning.
Use PrettyName()-compatible container name dumps together with container IDs in NRI event dumps. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Having loggers embedded in resmgr types as members was a bad idea. Replace them with module global logger instances. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
We recently discovered a problem with the stream of container lifecycle events generated by some runtime versions. As a side effect, we get Create/Stop events for multiple container instances with seemingly overlapping lifecycles: the latter instance gets created before the former one is stopped. When undetected, such a false overlap might cause overcommit of resources, with both instances temporarily using the full resource set of the container. As a workaround, we now also track containers by fully qualified name ($namespace/pod/ctr) and internally generate an event for releasing the resources of the old instance whenever we notice that a creation event would cause a duplicate instance for the same name. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Check grants, looking for grants with stale allocations or duplicate containers (detected using fully qualified names). Dump total memory and CPU granted. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Initial implementation of a policy agnostic memory accounting and allocation library. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Add a brief description of libmem, its basic ideas and core concepts, as package level documentation. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Cut out the original memory accounting and allocation code. Plug in a libmem-based memory allocator instead. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Update unit tests after libmem conversion. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Plug in libmem-based memory allocation (and accounting). Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Add support for per balloon type memory configuration and per container overrides using pod annotations. Pass configured or annotated memory types to libmem for allocation. TODO(klihub): per balloon configuration still missing (?) Co-authored-by: Krisztian Litkey <krisztian.litkey@intel.com> Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
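The precedence this commit message describes (an annotation overriding the configured memory types) might look roughly like the sketch below. Everything named here is an assumption made for illustration: the annotation key `memory-type.example.org`, its `/container.<name>` and `/pod` suffixes, the `MemType` bitmask, and both helper functions are hypothetical and are not the keys or the libmem API used by this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// MemType is a hypothetical bitmask of memory types.
type MemType uint8

const (
	DRAM MemType = 1 << iota
	PMEM
	HBM
)

// annotationKey is a made-up key used only for this sketch.
const annotationKey = "memory-type.example.org"

// parseMemTypes turns a comma-separated list such as "dram,pmem"
// into a bitmask, rejecting unknown names.
func parseMemTypes(s string) (MemType, error) {
	var mask MemType
	for _, f := range strings.Split(s, ",") {
		switch strings.ToLower(strings.TrimSpace(f)) {
		case "dram":
			mask |= DRAM
		case "pmem":
			mask |= PMEM
		case "hbm":
			mask |= HBM
		default:
			return 0, fmt.Errorf("unknown memory type %q", f)
		}
	}
	return mask, nil
}

// effectiveMemTypes picks the memory types for a container:
// a per-container annotation wins over a pod-wide annotation,
// which wins over the configured default.
func effectiveMemTypes(annotations map[string]string, container string, configured MemType) (MemType, error) {
	if v, ok := annotations[annotationKey+"/container."+container]; ok {
		return parseMemTypes(v)
	}
	if v, ok := annotations[annotationKey+"/pod"]; ok {
		return parseMemTypes(v)
	}
	return configured, nil
}

func main() {
	annotations := map[string]string{
		annotationKey + "/pod":              "dram",
		annotationKey + "/container.worker": "hbm,dram",
	}
	for _, name := range []string{"worker", "sidecar"} {
		types, err := effectiveMemTypes(annotations, name, DRAM|PMEM)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: mask %03b\n", name, types)
	}
}
```

The resulting mask is what a policy would then pass on to the allocator as the set of acceptable memory types for that container.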
Add first e2e tests for topology-aware policy memory allocation and type control. Co-authored-by: Krisztian Litkey <krisztian.litkey@intel.com> Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
"MEMTYPE=x create ..." in test scripts creates a guaranteed or burstable pod with "memory-type...: x" annotation effective for all containers in the pod. Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
Adds fuzz test generator script, model and runner for topology-aware policy reliability tests on HBM+DRAM+PMEM platform. Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
Note: This PR has been rebased on #358, which should be reviewed first.
This patch series
topology-aware policy with libmem
balloons policy
For a more detailed description of the main ideas, core concepts, and some high-level implementation details, see the included package documentation.
TODO items:
ensureNormalMemory() during initial zone selection
Remaining questions related to this initial implementation: