Adding a disk to an instance changed the boot order #5112
I was able to repro with a bare Propolis server. I have a hunch as to what's happening:
I suspect that what's happening is that when the new drive is added in slot 0, the descriptions are shifting around: before, slot 1 was labeled

The reason this matters is that once

I think this would explain what's being seen above: when the new disk gets added, the existing entries for the disks in slots 1 and 2 end up mismatching either on their file paths or descriptions, so they get trimmed; then all three disks get added back at the end of the boot order, where they happen to be after the UEFI shell entry, which makes the instance boot to the shell.

To see this in action, I added some debug prints to the bootrom to show how these comparisons proceed on a functional and non-functional boot sequence. Here's what I get when I boot a VM with a working boot disk attached at 0.17.0:
(Handler 4 is the NVMe device description handler.) If I now attach a blank disk at 0.16.0 I get the following:
Notice that the "blank" description gets applied by the NVMe handler twice (so the disambiguating I think is consistent with the following events:
The main thing I think I'm missing at this point is tracing that conclusively demonstrates that the NVMe boot options are getting added/described in PCI slot order--I think the logs above show a lot of smoke, but I'd really like to see the fire.
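The trimming hypothesis in this comment can be sketched as a toy model. This is not the bootrom's code: the labels (`disk`, `disk 2`), the path names (`slot1`), the `FIXED` set, and both functions are hypothetical, and the only behavior assumed is what the comment describes, namely that a saved entry whose description no longer matches is dropped and every unmatched disk is appended at the end.

```python
# Hypothetical non-disk entries that this toy model never trims.
FIXED = {"pxe", "shell"}


def describe(slots):
    """Label disks in slot order (assumed scheme: 'disk', 'disk 2', ...)."""
    labels = {}
    for position, slot in enumerate(sorted(slots), start=1):
        suffix = "" if position == 1 else f" {position}"
        labels[f"slot{slot}"] = "disk" + suffix
    return labels


def reconcile(boot_order, slots):
    """Keep saved entries that still match; append everything else at the end."""
    current = describe(slots)
    kept = [(path, desc) for path, desc in boot_order
            if path in FIXED or current.get(path) == desc]
    kept_paths = {path for path, _ in kept}
    added = [(path, desc) for path, desc in current.items()
             if path not in kept_paths]
    return kept + added


# Saved boot order from a boot with disks in slots 1 and 2 only.
saved = [("slot1", "disk"), ("slot2", "disk 2"), ("pxe", "pxe"), ("shell", "shell")]

unchanged = reconcile(saved, [1, 2])     # same hardware: nothing is trimmed
shifted = reconcile(saved, [0, 1, 2])    # new disk in slot 0: labels shift

print([path for path, _ in unchanged])
print([path for path, _ in shifted])
```

In this model the second call leaves every disk behind the `shell` entry, which matches the symptom in the report, though it says nothing about whether the real bootrom describes devices in PCI slot order.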
The disambiguating integers get added in This fits with the behavior described above provided
Reassigning per the discussion at the 22 Aug 2024 hypervisor huddle.
with #6585 and oxidecomputer/console#2464 landed there is at least a way to work around the problem that can occur here, and instance pages in the UI will heavily guide towards having a boot disk.

one important outstanding question here is why are we in a situation where boot options are so unstable? devices as i'm seeing them today in UEFI boot options end up named something like

reiterating what we know: if a low-PCI-number disk is detached, when disk names are influenced by there is no way to leave

so i think in an ideal world, we'd be seeing NVMe devices named according to their reported model and serial numbers. those names are much more stable than "wherever it happens to be in PCI device order", and adding/removing disks will stop renaming later devices, and stop invalidating their boot options. we're definitely providing a disk serial number currently, and as i can see in
i can't obviously see a reason we might have stopped seeing descriptions from
"UEFI Misc Device" rather than "UEFI " is because i was comparing against a VM i'd run locally with
where the name at some point after this description is constructed the string appears to be with a patched
changing with a default

so, the exciting discovery here is probably that with current OVMF builds, providing model numbers will render some instances unbootable. if we provide a model number, that will change boot option descriptions and in some cases probably kick a real boot disk after the EFI shell, same as in the original observation.
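to make the stability argument concrete, here is a minimal sketch comparing the two naming ideas. everything in it is hypothetical: the serials, the label formats, and both functions are made up for illustration and are not what OVMF produces.

```python
def label_by_position(disks):
    """Label each disk by where it falls in slot order (assumed scheme)."""
    ordered = sorted(disks.items())  # (slot, serial) pairs in slot order
    return {serial: f"disk {i}" for i, (_, serial) in enumerate(ordered, start=1)}


def label_by_serial(disks):
    """Label each disk by its serial number only (assumed scheme)."""
    return {serial: f"disk {serial}" for serial in disks.values()}


# slot -> serial; the serials are invented for this example.
before = {1: "SER-A", 2: "SER-B"}
after = {0: "SER-C", 1: "SER-A", 2: "SER-B"}  # new disk lands in slot 0


def changed(scheme):
    """Serials of pre-existing disks whose label differs after the attach."""
    old, new = scheme(before), scheme(after)
    return sorted(serial for serial in old if old[serial] != new[serial])


print(changed(label_by_position))
print(changed(label_by_serial))
```

position-based labels rename both existing disks when the new one is attached, while serial-based labels rename none. note that switching an existing instance from one scheme to the other still changes every label once, which is the hazard described above.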
now that we can specify boot disks it's possible to unwedge an instance that gets in this state: specify a boot disk, boot the instance to that disk, and its UEFI variables will reflect that disk as the boot option if the boot disk is unset again later. or leave the intended guest OS disk as the boot disk in perpetuity!

the above issues and test help make sure we don't unwittingly afflict VMs with this issue if they do not have a boot disk set. if we can get to them, it'll make it much more difficult to get to this wedged state, as well.

i think this is about as good of a place as i can get this right now.
I added a disk to an instance that has been running in the colo for a while, and it failed to boot afterwards, dropping to the UEFI shell. I've replicated this with a fresh instance and the rest of this note is from that replication case.
One notable thing about the VM that I originally saw the problem with is that its two disks were in slots 1 and 2, with nothing present in slot 0. This is likely because it was created before the fix for #5067 was merged.
To replicate the failure, I created a new disk from an image, and then two additional blank ones. By attaching them to a new instance in the right order, then detaching a blank disk again, I was able to end up with an instance in the same configuration, with the boot disk in slot 1 and slot 0 being empty.
I then booted this instance, which was successful, and mounted the EFI System Partition (ESP) to fish out the NvVars file, which is where the UEFI bootrom stores its persistent variables. Decoding this shows that the bootrom has enumerated all of the possible boot devices, assigned them numbers and configured an initial boot order:

So far so good. I rebooted the instance a couple of times to confirm that it booted normally, and that these variables didn't change.
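For readers unfamiliar with these variables: in UEFI, the BootOrder variable is an array of little-endian 16-bit integers, each naming a Boot#### load option by its four-digit hexadecimal number. A minimal sketch of decoding such a payload follows. The helper name and the example bytes are hypothetical, and the sketch assumes the BootOrder payload has already been extracted; it does not parse the NvVars file itself.

```python
import struct


def parse_boot_order(payload: bytes) -> list[str]:
    """Turn a raw BootOrder payload into Boot#### variable names."""
    if len(payload) % 2:
        raise ValueError("BootOrder payload must be an even number of bytes")
    count = len(payload) // 2
    # Each entry is an unsigned 16-bit little-endian option number.
    return [f"Boot{n:04X}" for n in struct.unpack(f"<{count}H", payload)]


# Made-up payload for illustration, not taken from the instance above.
example = bytes.fromhex("0000010003000600")
print(parse_boot_order(example))
```

The position of an option in this list is what determines when it is tried, so a disk's Boot#### number can stay the same while its place in the order changes.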
I then shut down the instance and attached a new blank disk to it. This disk was 128G in size and used a 4096 sector size. After this, the database showed that the new disk had been placed in slot 0. This mirrors what happened with the previously failed instance.
On booting the instance back up, it dropped to the EFI shell after failing to boot from Boot0003 and via PXE:

Using the EFI shell to look at the persistent variables now showed something interesting:
The new disk has been enumerated and added as Boot0006, which is not a surprise, but the boot order has been changed so that all three NVMe disks are now at the end. This explains why the instance attempted to boot from Boot0003, which is the cidata volume, and failed, then tried PXE boot and finally dropped to the EFI shell.

The bootrom's debug output from this boot also shows this same strange boot order:
To replicate this I faithfully reproduced what happened in the colo -- not all of the steps here may be necessary to trigger it, more experimentation is necessary.