Add heartbeat support #406

Closed
hickeng opened this issue Apr 3, 2016 · 10 comments

@hickeng
Member

hickeng commented Apr 3, 2016

vSphere HA depends on a heartbeat mechanism, which I believe is supplied by guest tools.
We wish to leverage HA to implement container restart policies - for that we need heartbeat.

I would like to add this to the library that also provides the rpctool functionality, as I believe the low-level comms mechanisms are the same. If I'm incorrect, perhaps this can be a separate library.

My intent is that this library be used by tether: tether will check the liveness of the processes it's responsible for, and if those are present and responsive the heartbeat continues. Checking liveness may be as simple as watching for SIGCHLD, or a more involved check.

@hickeng hickeng added area/vsphere Integration and interoperation with vSphere component/tether component/utilities labels Apr 3, 2016
@hickeng
Member Author

hickeng commented Apr 3, 2016

Estimate at 5 because there's investigation to do.

@hickeng
Member Author

hickeng commented May 23, 2016

github.com/vmware/open-vm-tools has the code to support heartbeat. This issue needs to extract it from the rest of tools and consume it. If at all possible, let's keep it pure Go so that it's portable.

@stuclem stuclem added the impact/doc/user Requires changes to official user documentation label May 24, 2016
@stuclem stuclem removed the impact/doc/user Requires changes to official user documentation label May 31, 2016
@hickeng
Member Author

hickeng commented Jun 17, 2016

@dougm this is a direct follow-on from the push-to-tether work (i.e. it's the channel in the other direction).

dougm added a commit to dougm/vic that referenced this issue Jun 28, 2016
dougm added a commit to dougm/vic that referenced this issue Jun 29, 2016
Initial version to support vmware-tools "lite" in pure Go.

Towards:

Issue vmware#742
Issue vmware#407
Issue vmware#406
dougm added a commit to dougm/vic that referenced this issue Jun 30, 2016
Initial version to support vmware-tools "lite" in pure Go.

Towards:

Issue vmware#742
Issue vmware#407
Issue vmware#406
dougm added a commit to dougm/vic that referenced this issue Jul 1, 2016
Initial version to support vmware-tools "lite" in pure Go.

Towards:

Issue vmware#742
Issue vmware#407
Issue vmware#406
@dougm
Member

dougm commented Feb 1, 2017

Not sure we need to add anything to tether/toolbox for VM HA to work. Let's try the test on shared storage first, see: #3784 (comment)

@mdubya66
Contributor

Removing the high priority based on @dougm's assessment. Setting to low.

@corrieb
Contributor

corrieb commented Mar 8, 2017

Note that Dockerfile now has a HEALTHCHECK instruction, which lets you specify a command to be run periodically to determine whether a container is healthy. Docker engine will generate an event if the container becomes unhealthy. This may not link directly to HA, but it's easy to imagine how you could get vic-machine to link the unhealthy event to a reboot.

@hmahmood
Contributor

I can confirm that HA works without the heartbeat (and shared storage). Should we close this issue and track support for health check in another?

@corrieb
Contributor

corrieb commented Mar 16, 2017

HA works in a limited set of circumstances. It works in the event of host failure, but not in the event of workload failure, unless there's something I'm misunderstanding. That's what this is designed to address, right?

@hmahmood
Contributor

If by workload failure you mean tether failure (panic), then we currently reboot the VM, and we could modify that behavior to conform to whatever restart policy the user wants. I am not sure we need heartbeat for this, since vSphere will monitor network traffic and CPU (?) if no heartbeat support is present in the toolbox. @dougm could you confirm that is the case, since you stated above that heartbeat is not needed?

If by workload failure you mean the container process crashes, then I believe we will restart the process, and again we can modify that behavior to support what the user wants.

@dougm
Member

dougm commented Mar 20, 2017

@hmahmood my comment was related to VM HA when a host fails, and referenced a comment that the HA test was broken (using local storage).

Our toolbox responding to the "ping" command is also enough for vSphere to set the VirtualMachine.guestHeartbeatStatus to "green".

I'd still need to investigate what exactly the tools "heartbeat" message is used for.

Also note that "App HA" has been EOL'd:
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2108249

App HA didn't use tools IIRC; there was a separate library/backdoor RPC that applications would call into.


8 participants