neonvm: Add support for virtio-mem #970
Conversation
This already was de facto the case, which made much of the logic around handling "but what if .max is nil?" somewhat hard to reason about. Extracted from #970 and fleshed out here.

All our staging & production clusters have VMs meeting this criterion, with the notable exception of a single, already-broken VM in dev-us-east-2-beta. More here: https://neondb.slack.com/archives/C03TN5G758R/p1718742308251569

---

Checked each cluster with:

kubectl get neonvm -o json \
  | jq -r '.items[] | select(.spec.guest | [.cpus.max,.cpus.min,.memorySlots.max,.memorySlots.min] | map(select(. == null)) | (length != 0)) | .metadata.name'
Just a couple thoughts
Fixes #967, extracted from #970. In short, the state machine becomes simpler if we only set the VM's pod name when it's in Pending. This gives us a couple things:

1. When the VM is Failed or Succeeded, we can un-set the pod name and it will be stable. This is what fixes #967.
2. It gives us a consistent place to set status fields relating to a particular pod *before* creating the pod. This is why it's helpful for #970.
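As a rough illustration of that ordering (a minimal sketch, not the controller's actual code; the type and phase names below are simplified assumptions):

```go
// Sketch: assign the pod name only while Pending, clear it in terminal phases.
// Types and phase names are simplified stand-ins, not the real neonvm API.
package main

import "fmt"

type Phase string

const (
	PhasePending   Phase = "Pending"
	PhaseSucceeded Phase = "Succeeded"
	PhaseFailed    Phase = "Failed"
)

type VMStatus struct {
	Phase   Phase
	PodName string
}

// reconcilePodName shows the intended ordering: the pod name (and any other
// pod-specific status fields) is decided while the VM is still Pending, before
// the pod is created, and is cleared once the VM is terminal so the status
// stays stable afterwards.
func reconcilePodName(st *VMStatus, desiredPodName string) {
	switch st.Phase {
	case PhasePending:
		if st.PodName == "" {
			st.PodName = desiredPodName // set before creating the pod
		}
	case PhaseSucceeded, PhaseFailed:
		st.PodName = "" // terminal: un-set so status no longer references a pod
	}
}

func main() {
	st := &VMStatus{Phase: PhasePending}
	reconcilePodName(st, "vm-foo-abcde")
	fmt.Println(st.PodName) // "vm-foo-abcde"

	st.Phase = PhaseSucceeded
	reconcilePodName(st, "")
	fmt.Println(st.PodName == "") // true
}
```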
Extracted from #970. Currently, when leaving the Scaling phase, we emit events like:

MemoryInfo  VirtualMachine vm-foo uses 4GiB memory
CpuInfo     VirtualMachine vm-foo uses 1 cpu cores

To provide more up-to-date information in the event log, we can emit these events every time .status.{cpus,memorySize} changes. This is relevant for #970 because with virtio-mem, changes to the memory are performed asynchronously, so the current state otherwise wouldn't be available in the logs.
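A minimal sketch of emitting such events on every status change, assuming the standard client-go EventRecorder (the vmStatus struct and emitUsageEvents helper are illustrative, not the actual controller code):

```go
// Sketch: emit CpuInfo / MemoryInfo events whenever .status.cpus or
// .status.memorySize changes, instead of only when leaving Scaling.
package controllers

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/tools/record"
)

// vmStatus is a simplified stand-in for the VM's status fields.
type vmStatus struct {
	CPUs       int
	MemorySize resource.Quantity
}

// emitUsageEvents compares old and new status and records an event for each
// field that changed, so asynchronous virtio-mem resizes show up in the log.
func emitUsageEvents(rec record.EventRecorder, vm runtime.Object, oldSt, newSt vmStatus) {
	if oldSt.CPUs != newSt.CPUs {
		rec.Eventf(vm, corev1.EventTypeNormal, "CpuInfo",
			"VirtualMachine uses %d cpu cores", newSt.CPUs)
	}
	if oldSt.MemorySize.Cmp(newSt.MemorySize) != 0 {
		rec.Eventf(vm, corev1.EventTypeNormal, "MemoryInfo",
			"VirtualMachine uses %s memory", newSt.MemorySize.String())
	}
}
```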
force-pushed from b58c8b5 to 8877ad8
force-pushed from 8877ad8 to ec73215
force-pushed from ec73215 to 072fe6a
force-pushed from 072fe6a to 8e28c09
Did backwards-compatibility testing locally (started a VM, upgraded the version, checked status was as expected, assuming that our existing e2e tests can take it from there). Found & fixed a bug.
Looks good overall
neonvm/controllers/vm_controller.go
Outdated
// Maybe we're already using the amount we want?
// Update the status to reflect the current size - and if it matches goalTotalSize, ram
// is done.
currentTotalSize, err := QmpGetMemorySize(QmpAddr(vm))
I think this is a redundant check; we could get the current memory size from where it is set.
But no worries, I'll clean it up in #943
I think in testing I actually saw this value change in between QMP queries in the same reconcile cycle. Not totally sure though...
if p := vm.Spec.Guest.MemoryProvider; p != nil {
	return *p
}
if p := vm.Status.MemoryProvider; p != nil {
Won't this prevent us from changing from one provider to the other by deleting the pod, because the old value will be preserved through Status?
The tl;dr on that is that we reset .status.memoryProvider in the same place we reset .status.podName (there's some method on VirtualMachineStatus that does it for us). I tested it locally and it works as expected :)
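A minimal sketch of that kind of reset helper (field and method names here are illustrative assumptions, not the real VirtualMachineStatus API):

```go
// Sketch: reset pod-specific status fields together. Names are illustrative.
package vmstatus

type MemoryProvider string

type VirtualMachineStatus struct {
	PodName        string
	MemoryProvider *MemoryProvider
	// ... other pod-specific fields omitted
}

// CleanupPodStatus clears every status field that only describes the current
// runner pod. Calling it when the pod goes away means the next pod can record
// a different memory provider (e.g. a newer controller default).
func (s *VirtualMachineStatus) CleanupPodStatus() {
	s.PodName = ""
	s.MemoryProvider = nil
}
```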
force-pushed from f766bc5 to 5b8aead
Change in preparation to reduce the diff from adding virtio-mem.
Adapted from #705. This implementation is focused on making virtio-mem a drop-in replacement for our existing DIMM hotplug-based memory scaling. As such, the externally visible differences are:

1. VirtualMachines have a new optional field: .spec.guest.memoryProvider
2. VirtualMachines have a new optional field: .status.memoryProvider
3. VirtualMachine .status.memorySize changes in smaller increments if using virtio-mem.

This change introduces the concept of a "memory provider", which is either "DIMMSlots" or "VirtioMem". If a VM is created without setting the memory provider in the spec, the controller will use the default one, which is specified by its --default-memory-provider flag. This will only persist for the lifetime of the runner pod. If the VM is restarted, then it may change to a new default.

Additionally, because our usage of virtio-mem has 8MiB block sizes, we actually *don't* default to virtio-mem if the VM was created with some odd memory slot size that isn't divisible by 8MiB.

The memory provider of the current runner pod is stored in the new status field. For old VMs created before this change, the status field is set to DIMMSlots iff the pod name is set without the status field.

Also, we allow mutating .spec.guest.memoryProvider to allow the flexibility to solidify a particular provider from future changes or to force it to change across restarts. There was more discussion on this in the PR comments, if you're curious.

Co-authored-by: Andrey Taranik <andrey@cicd.team>
Co-authored-by: Oleg Vasilev <oleg@neon.tech>
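A rough sketch of that selection rule (an illustration under the assumptions above; the function and constant names are not the controller's actual code):

```go
// Sketch: pick the memory provider for a new runner pod.
package main

import "fmt"

type MemoryProvider string

const (
	ProviderDIMMSlots MemoryProvider = "DIMMSlots"
	ProviderVirtioMem MemoryProvider = "VirtioMem"
)

const virtioMemBlockSizeBytes = 8 << 20 // 8MiB virtio-mem block size

// pickMemoryProvider resolves the provider in priority order: the explicit
// spec value, then the value recorded for the current runner pod, then the
// controller default -- falling back to DIMM slots if the VM's memory slot
// size isn't a multiple of the 8MiB virtio-mem block size.
func pickMemoryProvider(spec, status *MemoryProvider, controllerDefault MemoryProvider, slotSizeBytes int64) MemoryProvider {
	if spec != nil {
		return *spec
	}
	if status != nil {
		return *status
	}
	if controllerDefault == ProviderVirtioMem && slotSizeBytes%virtioMemBlockSizeBytes != 0 {
		return ProviderDIMMSlots
	}
	return controllerDefault
}

func main() {
	// 1GiB slots divide evenly into 8MiB blocks, so the VirtioMem default is kept.
	fmt.Println(pickMemoryProvider(nil, nil, ProviderVirtioMem, 1<<30)) // VirtioMem
	// An odd 100MB slot size isn't divisible by 8MiB, so we fall back to DIMM slots.
	fmt.Println(pickMemoryProvider(nil, nil, ProviderVirtioMem, 100_000_000)) // DIMMSlots
}
```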
force-pushed from d2b8d89 to e2aab65
In the original PR (#970), I'd intended to make this field mutable and added a comment about it, but forgot to actually remove it from the list of immutable fields. *Probably* doesn't matter much, but IMO it's worth cleaning up.
Adapted from #705.

This implementation is focused on making virtio-mem a drop-in replacement for our existing DIMM hotplug-based memory scaling.

As such, the externally visible differences are:

1. VirtualMachines have a new optional field: .spec.guest.memoryProvider
2. VirtualMachines have a new optional field: .status.memoryProvider
3. VirtualMachine .status.memorySize changes in smaller increments if using virtio-mem.

This change introduces the concept of a "memory provider", which is either "DIMMSlots" or "VirtioMem".

If a VM is created without setting the memory provider in the spec, the controller will use the default one, which is specified by its --default-memory-provider flag. This will only persist for the lifetime of the runner pod. If the VM is restarted, then it may change to a new default.

Additionally, because our usage of virtio-mem has 8MiB block sizes, we actually don't default to virtio-mem if the VM was created with some odd memory slot size that isn't divisible by 8MiB.
The memory provider of the current runner pod is stored in the new status field. For old VMs created before this change, the status field is set to DIMMSlots iff the pod name is set without the status field.
Also, we allow mutating .spec.guest.memoryProvider IFF it is being set equal to the current .status.memoryProvider, in case we want old VMs to keep their DIMM slots after switching to virtio-mem by default.
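A minimal sketch of that update rule (purely illustrative; the project enforces this in its own validation code, and the types below are simplified assumptions):

```go
// Sketch: allow changing .spec.guest.memoryProvider only when the new value
// matches the provider the current runner pod is already using.
package main

import (
	"errors"
	"fmt"
)

type MemoryProvider string

// validateMemoryProviderChange returns nil if the spec field is unchanged, or
// if it is being set to the value already recorded in the status; any other
// change is rejected, so the field stays effectively immutable otherwise.
func validateMemoryProviderChange(oldSpec, newSpec, status *MemoryProvider) error {
	if oldSpec == nil && newSpec == nil {
		return nil
	}
	if oldSpec != nil && newSpec != nil && *oldSpec == *newSpec {
		return nil // unchanged
	}
	if newSpec != nil && status != nil && *newSpec == *status {
		return nil // pinning the provider the current pod already uses
	}
	return errors.New(".spec.guest.memoryProvider may only be set to the current .status.memoryProvider")
}

func main() {
	dimm := MemoryProvider("DIMMSlots")
	// Pinning an old VM to the DIMM slots its pod already uses is allowed.
	fmt.Println(validateMemoryProviderChange(nil, &dimm, &dimm)) // <nil>
}
```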
Changes to split out from this PR:

On top of those, there's a couple places where code was indented that should probably be split into its own commit so it can be properly added to .git-blame-ignore-revs :)
Remaining testing for this PR before merging:

- Backwards-compatibility testing (i.e. upgrading an existing VM, still with --default-memory-provider=DIMMSlots).
- If we default to --default-memory-provider=VirtioMem when this is released, we cannot make it rollback-safe.
- Test that updating .spec.guest.memoryProvider works as expected.

Future/follow-up work: