Add heartbeat support #406
Comments
Estimate at 5 because there's investigation to do.
github.com/vmware/open-vm-tools has the code to support heartbeat. This work needs to extract it from the rest of tools and consume it. If at all possible, let's keep it pure Go so that it's portable.
@dougm this is a direct follow-on from the push to tether work (i.e. it's the channel in the other direction).
Issue vmware#742 Issue vmware#407 Issue vmware#406
Initial version to support vmware-tools "lite" in pure Go. Towards: Issue vmware#742 Issue vmware#407 Issue vmware#406
Not sure we need to add anything to tether/toolbox for VM HA to work. Let's try the test on shared storage first, see: #3784 (comment)
Removing the high priority based on @dougm's assessment. Setting to low.
Note that Dockerfile now has a HEALTHCHECK option, which lets you specify a command to be run periodically to determine whether a container is healthy. Docker engine will generate an event if the container becomes unhealthy. This may not link directly to HA, but it's easy to imagine how you could get vic-machine to link the unhealthy event to a reboot.
I can confirm that HA works without the heartbeat (with shared storage). Should we close this issue and track support for health check in another?
HA works in a limited set of circumstances. It works in the event of host failure, but doesn't in the event of workload failure. Unless there's something I'm misunderstanding. That's what this is designed to address, right?
If by workload failure you mean tether failure (panic), then we currently reboot the VM, and we could modify that behavior to conform to whatever restart policy the user wants. I am not sure we need heartbeat for this, since vSphere will monitor network traffic and CPU (?) if no heartbeat support is present in the toolbox. @dougm could you confirm that is the case, since you stated above that heartbeat is not needed? If by workload failure you mean the container process crashes, then I believe we will restart the process, and again we can modify that behavior to support what the user wants.
@hmahmood my comment was related to VM HA when a host fails, and referenced a comment that the HA test was broken (using local storage). Our toolbox responding to the "ping" command is also enough for vSphere to set the VirtualMachine.guestHeartbeatStatus to "green". I'd still need to investigate what exactly the tools "heartbeat" message is used for. Also note that "App HA" has been EOL'd: App HA didn't use tools IIRC, there was a separate library/backdoor RPC that applications would call into.
vSphere HA depends on a heartbeat mechanism, which I believe is supplied by guest tools.
We wish to leverage HA to implement container restart policies, and for that we need heartbeat.
I would like to add this to the library that also provides the rpctool function, as I believe the low-level comms mechanisms are the same. If I'm incorrect, perhaps this can be a separate library.
My intent is that this library be used by tether: tether will check the liveness of the processes it's responsible for, and if those are present and responsive the heartbeat continues. 'Checking' the liveness may be as simple as watching for SIGCHLD, or a more involved check.