Race condition in Equinix Metal install process #1371
Comments
Please add the label platform/equinixmetal, thanks.
It would be worth reviewing #1143, which was a change in the phone-home setup.
Also for review: flatcar/scripts#1197 and flatcar/init#107.
Hello @vielmetti, thanks for the report. Please note that Flatcar is tested on Equinix Metal at each release (at least for AMD64) in a PXE-based environment.
Thanks @tormath1. The
@vielmetti just capturing what I wrote in Slack here: our test suite tests iPXE Flatcar on EM, and in that case one needs it to phone home from the iPXE env.
Ideally the internal provisioning would be done in a tinkerbell container that directly runs
@jepio Thanks for the suggestion. I'll test
@jepio it worked!!!
Thanks @turegano-equinix - is there something that we should add to the documentation or otherwise commit upstream? I'll hold this issue open to capture any of that.
Done to our satisfaction; if a doc issue comes up, it's an internal one here. Closing as completed.
Description
When provisioning instances on the Equinix Metal platform, instances are sometimes marked as "active" before
they are fully ready. There appears to be a race condition between the install script and the "phone-home"
operation.
Impact
When provisioning devices, the console sometimes reports that the server is provisioned and up and running, but the user does not yet have ssh access to it.
Environment and steps to reproduce
a. Provision a new Flatcar Linux server from the Equinix console
b. After some time, observe that the console says the device is "ready"
c. Attempt to log on with ssh; login sometimes fails
d. Attempt to log on using out of band management ("SOS"), login succeeds
Expected behavior
When the "phone home" script runs to signal to the
Equinix console that the device is ready, the device should actually be ready for ssh logins
and should have completed all of its provisioning tasks.
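One way to picture the expected ordering is a readiness gate in front of the phone-home call. The sketch below is only an illustration, not how Flatcar or Equinix Metal implement it: `wait_for_port` and `phone_home_when_ready` are hypothetical helper names, and the `notify` callable stands in for whatever actually contacts the phone-home URL from the metadata. Note that an open TCP port 22 only shows that sshd is listening; it does not prove that keys and other provisioning tasks are in place.

```python
import socket
import time
from typing import Callable


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds,
    or False if the timeout elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass  # refused or timed out; retry until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def phone_home_when_ready(is_ready: Callable[[], bool],
                          notify: Callable[[], None]) -> bool:
    """Call notify() only if is_ready() reports success.

    Both callables are assumptions of this sketch: is_ready would wrap a
    check such as wait_for_port("127.0.0.1", 22), and notify would perform
    the real phone-home request.
    """
    if is_ready():
        notify()
        return True
    return False
```

With this shape, the device is reported as ready only after the readiness check passes, which is the ordering the report asks for.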
Additional information
CoreOS has a service, coreos-metadata.service, that phones home using the URL retrieved from the metadata.
Flatcar Linux does test on the Equinix Metal platform.
Is it easy to run the phone-home conditioned on flatcar.first_boot=detected and not on flatcar.first_boot=1?
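To make the question concrete, here is a minimal sketch of reading the flatcar.first_boot value from a kernel command line string and gating on it. The helper names `first_boot_value` and `should_phone_home` are hypothetical and are not part of Flatcar; on a running system the string would typically come from /proc/cmdline, and the default of accepting only "detected" simply mirrors the condition proposed above.

```python
from typing import Iterable, Optional


def first_boot_value(cmdline: str,
                     key: str = "flatcar.first_boot") -> Optional[str]:
    """Return the value of key=value on the kernel command line, or None.

    If the key appears more than once, this sketch keeps the last occurrence.
    """
    value = None
    for token in cmdline.split():
        name, sep, val = token.partition("=")
        if name == key and sep:
            value = val
    return value


def should_phone_home(cmdline: str,
                      accepted: Iterable[str] = ("detected",)) -> bool:
    """Return True only when the first-boot value is one of the accepted ones."""
    return first_boot_value(cmdline) in tuple(accepted)
```

For example, `should_phone_home("root=LABEL=ROOT flatcar.first_boot=1")` evaluates to False under the default, while the same line with `flatcar.first_boot=detected` evaluates to True.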
cc @turegano-equinix