Ideal future situation
Roll a new version of the agent without impact on any running container on the nodes.
Implementation options
Some random ideas that come to mind:
Configure the DaemonSet to create the new pod before destroying the old one. The new pod can then connect to the old one and receive the fds from it. This doesn't help with crashes, though.
Have a small binary that holds the fds and passes them to the agent when it starts. This handles agent crashes and restarts just fine. This binary should probably be another DaemonSet. We need to decide what the flow would look like. Does this binary receive from the socket and pass fds to the agent, so that it needs to communicate with the agent not only on startup? Or can the fds be received by both the agent and the small binary, so the agent only queries the small binary on startup?
Investigate if CRIU has any interesting idea we can use here: https://criu.org
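Both of the first two options come down to handing open fds from one process to another over a Unix socket, which the kernel supports through SCM_RIGHTS ancillary data. Below is a minimal sketch of that handover, not a proposal for the actual agent code. The function names are made up for illustration, a `socketpair` stands in for the socket between the old and new agent (or the holder binary and the agent), and a pipe stands in for whatever fd the agent is holding. It assumes Python 3.9+ for `socket.send_fds` / `socket.recv_fds`.

```python
import os
import socket


def hand_over_fds(conn: socket.socket, fds: list[int]) -> None:
    """Send open file descriptors to the peer as SCM_RIGHTS ancillary data."""
    # Ancillary data must travel with at least one byte of regular data.
    socket.send_fds(conn, [b"F"], fds)


def receive_fds(conn: socket.socket, max_fds: int) -> list[int]:
    """Receive up to max_fds descriptors; the kernel installs copies in this process."""
    _msg, fds, _flags, _addr = socket.recv_fds(conn, 1, max_fds)
    return fds


def demo() -> bytes:
    # Hypothetical stand-ins: a socketpair for the old/new agent connection,
    # a pipe for a descriptor that must survive the agent restart.
    old_agent, new_agent = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        hand_over_fds(old_agent, [read_end])
        (received,) = receive_fds(new_agent, 1)
        os.close(read_end)  # the old holder drops its copy
        os.write(write_end, b"still alive")
        data = os.read(received, 64)  # the received copy is still usable
        os.close(received)
        return data
    finally:
        os.close(write_end)
        old_agent.close()
        new_agent.close()
```

Calling `demo()` shows that the received descriptor keeps working after the original holder closes its copy, which is the property both options rely on. It does not cover the crash case from the first option: if the old process dies before the handover, its fds are gone.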
We could get some inspiration from systemd and its "FD Store" facility that stores file descriptors from services when they restart (systemctl restart). See FDSTORE=1 in
Oooor, just rely on the host systemd to save the fds for us, using that functionality. No "inspiration", just use it!
We should explore that option more (security concerns, etc.), but it seems worth pursuing. Also, the Kubernetes graceful shutdown KEP works only with systemd hosts, so most hosts really should have systemd. Maybe we can't use it if we want to run on GKE Autopilot, but all in good time :)
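To make the systemd idea a bit more concrete, here is a rough sketch of what using the FD store involves from the service side, assuming the agent runs as (or under) a systemd unit with `FileDescriptorStoreMax=` set above zero. The service sends `FDSTORE=1` plus the fds as SCM_RIGHTS to the socket named in `$NOTIFY_SOCKET`, and after a restart systemd hands them back as fds starting at 3, described by `$LISTEN_PID`, `$LISTEN_FDS` and `$LISTEN_FDNAMES`. The function names and the `agentfd` label are hypothetical, and whether this is reachable from inside a pod is exactly the open question above.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first fd number systemd passes to a service


def store_fd(fd: int, name: str, environ=os.environ) -> bool:
    """Ask systemd to keep fd in the unit's FD store (sd_notify protocol)."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not running under systemd, nothing to do
    # A leading "@" denotes a Linux abstract-namespace socket.
    addr = b"\0" + path[1:].encode() if path.startswith("@") else path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(addr)
        socket.send_fds(sock, [f"FDSTORE=1\nFDNAME={name}".encode()], [fd])
    return True


def restored_fds(environ=os.environ, pid=None) -> list[tuple[str, int]]:
    """Return (name, fd) pairs that systemd handed back after a restart."""
    pid = os.getpid() if pid is None else pid
    if environ.get("LISTEN_PID") != str(pid):
        return []  # the fds, if any, were meant for another process
    count = int(environ.get("LISTEN_FDS", "0"))
    names = [n for n in environ.get("LISTEN_FDNAMES", "").split(":") if n]
    # Placeholder label for any fd that has no name entry.
    names += ["unknown"] * (count - len(names))
    return [(names[i], SD_LISTEN_FDS_START + i) for i in range(count)]
```

On startup the agent would call `restored_fds()` first and only fall back to setting things up from scratch if the list is empty. Whenever it obtains a new fd, it would call something like `store_fd(fd, "agentfd")`.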