Add mechanism for reporting instance health to cloud platforms #120

Closed
bgilbert opened this issue Oct 31, 2018 · 7 comments

@bgilbert
Contributor

Feature Request

Environment

Azure, Packet

Desired Feature

Consider adding a command-line option or mode which reports a successful boot to the cloud platform and then exits. On platforms that don't require health reporting, the command would silently succeed.
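A minimal sketch of the behavior described above. All names here are hypothetical (`report_boot_success`, the provider strings, and the stub reporters are illustrations, not existing code): platforms that need a check-in get a reporter, and every other platform silently succeeds.

```python
from typing import Callable, Dict


def report_packet() -> None:
    """Placeholder: a real reporter would phone home to Packet."""


def report_azure() -> None:
    """Placeholder: a real reporter would post a health report to Azure."""


# Hypothetical provider names; only platforms that need a check-in are listed.
REPORTERS: Dict[str, Callable[[], None]] = {
    "packet": report_packet,
    "azure": report_azure,
}


def report_boot_success(provider: str) -> bool:
    """Report a successful boot if the platform requires it.

    Returns True when a reporter ran and False when the platform needs no
    reporting. Both outcomes count as success for the caller.
    """
    reporter = REPORTERS.get(provider)
    if reporter is None:
        return False  # nothing to do: silently succeed
    reporter()
    return True
```

A command-line wrapper around this would exit 0 in both cases, so a single unit could be enabled on every platform without per-platform conditions.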

Other Information

The idea is that the command would run as a systemd oneshot after greenboot is happy.

See also coreos/fedora-coreos-tracker#69.

@lucab
Contributor

lucab commented Nov 5, 2018

Some more details on cloud reporting protocols.

Packet

A phone-home IP can be discovered from the metadata service, and a small JSON document needs to be POSTed to it. We are interested in reporting "state": "succeeded". This is documented at https://support.packet.com/kb/articles/user-state
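As a rough sketch of the Packet check-in, assuming the phone-home URL has already been read from the metadata service: the function name `build_phone_home_request` and the `Content-Type` header are assumptions for illustration, and only the "state": "succeeded" payload comes from the description above.

```python
import json
import urllib.request


def build_phone_home_request(phone_home_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST that reports a successful boot.

    `phone_home_url` is whatever address the metadata service advertised;
    discovering it is out of scope for this sketch.
    """
    body = json.dumps({"state": "succeeded"}).encode("utf-8")
    return urllib.request.Request(
        phone_home_url,
        data=body,
        # Assumed header: a JSON body is normally labeled as such.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` with a timeout and treating any non-2xx response as a failed report.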

Azure

The wire server address is provided via custom DHCP option 245 (a known fallback address is 168.63.129.16). The /machine?comp=health endpoint there expects an XML document to be POSTed with some specific custom headers. The format has been reverse-engineered: https://github.com/larsks/azure-tools/blob/master/docs/api.md. We are interested in reporting <Health><State>Ready</State></Health>. The corresponding WALinuxAgent logic is https://github.com/Azure/WALinuxAgent/blob/242278735745483ff63f5001d58ba0889c978936/azurelinuxagent/common/protocol/wire.py#L222
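A hedged sketch of what building that report could look like. The element layout around `<Health><State>Ready</State></Health>`, the `http` scheme, and the header values are assumptions based on the reverse-engineered format, not a confirmed specification; `incarnation`, `container_id`, and `instance_id` are assumed to come from a goal state fetched earlier, and the function names are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

FALLBACK_WIRESERVER = "168.63.129.16"


def build_health_xml(incarnation: str, container_id: str, instance_id: str) -> bytes:
    """Build the XML body reporting State=Ready (assumed element layout)."""
    root = ET.Element("Health")
    ET.SubElement(root, "GoalStateIncarnation").text = incarnation
    container = ET.SubElement(root, "Container")
    ET.SubElement(container, "ContainerId").text = container_id
    role_list = ET.SubElement(container, "RoleInstanceList")
    role = ET.SubElement(role_list, "Role")
    ET.SubElement(role, "InstanceId").text = instance_id
    health = ET.SubElement(role, "Health")
    ET.SubElement(health, "State").text = "Ready"
    return ET.tostring(root, encoding="utf-8")


def build_health_request(
    body: bytes, wireserver: str = FALLBACK_WIRESERVER
) -> urllib.request.Request:
    """Build (but do not send) the POST to the health endpoint."""
    return urllib.request.Request(
        f"http://{wireserver}/machine?comp=health",  # scheme is an assumption
        data=body,
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # Assumed values for the custom headers mentioned above.
            "x-ms-version": "2012-11-30",
            "x-ms-agent-name": "example-agent",
        },
        method="POST",
    )
```

Keeping request construction separate from sending makes the payload easy to check without a wire server to talk to.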

@bgilbert
Contributor Author

bgilbert commented Nov 6, 2018

The Azure DHCP option is needed for AzureStack and for ASM instances; ARM instances should use a consistent IP.
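To illustrate the two cases, here is a small sketch that prefers a DHCP-supplied address and otherwise falls back to the fixed IP. It assumes the option 245 value is available as four colon-separated hex octets, which is how DHCP clients commonly print unknown options in lease files; that encoding and the name `wireserver_address` are assumptions.

```python
import ipaddress
from typing import Optional

FALLBACK_WIRESERVER = "168.63.129.16"


def wireserver_address(option_245: Optional[str]) -> str:
    """Return the wire server IP from DHCP option 245, or the fixed fallback.

    `option_245` is the raw option value as colon-separated hex octets
    (for example "a8:3f:81:10"), or None when the lease carried no such option.
    """
    if not option_245:
        return FALLBACK_WIRESERVER
    octets = bytes(int(part, 16) for part in option_245.split(":"))
    # IPv4Address rejects anything that is not exactly four bytes.
    return str(ipaddress.IPv4Address(octets))
```

An implementation that skips the DHCP path entirely would amount to always returning the fallback address.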

@arithx
Contributor

arithx commented Nov 6, 2018

@lucab: I've posted some relevant info about the Azure check-in along with a simplified bash script + systemd service to perform it in coreos/fedora-coreos-tracker#65

@arithx
Contributor

arithx commented Nov 9, 2018

For Azure specifically we might also need to bounce networking.

Ubuntu cycles networking from within cloud-init on Azure so that the instance repeats its DHCP request with the correct hostname; this updates DDNS for private network communications (they note that this mimics what the agent does).

@dustymabe
Member

copying this from coreos/fedora-coreos-tracker#102 (comment)

@dustymabe
I would say so unless it's a much larger work item than I'm currently assuming. You know much more about it than I do so maybe you can clue me in.

@arithx
Overall an initial implementation that doesn't tackle the DHCP option parsing is very straightforward and still covers most use cases of Azure (only missing ASM [legacy] & AzureStack machines).

So we may be able to split the Azure work into two parts: one that does DHCP option parsing and one that does not.

@dustymabe
Member

Was this fixed in #147?

@lucab
Contributor

lucab commented May 22, 2019

@dustymabe indeed, #147 plus #201 fixed this. Closing.

@lucab lucab closed this as completed May 22, 2019