Mender delta updates fail to work due to systemd-machine-id-setup service #214

Closed
vinotion opened this issue Jul 14, 2022 · 3 comments

@vinotion

As promised in this discussion, here's a suggestion for making Mender delta updates work.

At the beginning of the boot process, systemd-machine-id-setup will change the machine ID in /etc/machine-id. If the root filesystem is writable at that point, this modifies the root filesystem and thereby breaks Mender delta updates (which assume an unmodified root filesystem as the base for incremental updates).

Having the kernel mount the root filesystem read-only from the start prevents this change to /etc/machine-id. This can be done as follows:

diff --git a/layers/meta-tegrademo/conf/distro/tegrademo-mender.conf b/layers/meta-tegrademo/conf/distro/tegrademo-mender.conf
index 0acf1d7..a842607 100644
--- a/layers/meta-tegrademo/conf/distro/tegrademo-mender.conf
+++ b/layers/meta-tegrademo/conf/distro/tegrademo-mender.conf
@@ -41,3 +41,6 @@ PREFERRED_PROVIDER_virtual/bootloader:tegra186 = "cboot-t18x"
 
 # Use u-boot by default on the TX2 COT when using the FIT image
 PREFERRED_PROVIDER_virtual/bootloader:jetson-tx2-devkit-cot = "u-boot-tegra"
+
+# Force root FS to be read-only at early boot, to make sure Mender delta-updates will work.
+KERNEL_ARGS += "ro"
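
To confirm on a running target that the argument actually reached the kernel, one could check the command line for a bare ro word. This is a minimal sketch, not part of the change above; the helper name cmdline_has_ro is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "yes" if the given kernel command line contains
# the bare word "ro", otherwise "no". Only exact matches count, so arguments
# such as "root=..." or "rootwait" do not match.
cmdline_has_ro() (
    set -f                      # no glob expansion while word-splitting
    found=no
    for arg in $1; do
        if [ "$arg" = "ro" ]; then
            found=yes
        fi
    done
    echo "$found"
)

# On a target, pass the live command line:
#   cmdline_has_ro "$(cat /proc/cmdline)"
```

If this prints no on an image built with the change, the argument is being dropped somewhere between the distro configuration and the bootloader.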

Because we also want various runtime-configurable persistent system configuration changes, we have an overlay mount for /etc.

Aside from this minor issue, there seems to be a general compatibility issue of systemd with read-only root filesystems and overlayed /etc mounts. Some unconfigurable aspects of systemd simply assume that the root filesystem is writable at early boot time (i.e. before the /etc overlay is mounted). This leads to interesting problems like:

  • The hostname is not configurable. Systemd sets the hostname from what is in /etc at very early boot time, before any overlay is mounted (which, of course, is where the customized hostname is). This cannot be configured; systemd just assumes /etc is correct and writable at this point.
  • Services that are disabled by default in the (read-only) root filesystem cannot be enabled reliably. systemctl enable ... creates symlinks in the overlaid /etc filesystem, but at very early boot time the read-only root filesystem version of /etc is used.
  • ...

We did not find a suitable solution for working with systemd and (partly) persistent changes in /etc.

@dwalkes
Member

dwalkes commented Jul 14, 2022

Thanks @vinotion!

This relates to #198 as well as OE4T/meta-mender-community#8 (comment) and OE4T/meta-tegra#527. It's also related to systemd/systemd#14131. Cross-referencing to set up links to these.

At the beginning of the boot process, systemd-machine-id-setup will change the machine ID in /etc/machine-id.

The change at https://github.com/OE4T/meta-tegra/pull/527/files#diff-6952bcad754469ed729bf94101a17d36ff760011bb19601d5e3b0f50d74a546f puts systemd.machine_id on the kernel command line. See this example from my booted xavier-nx-devkit-emmc image running kirkstone at f8b9c37, built with source setup-env --machine jetson-xavier-nx-devkit-emmc --distro tegrademo-mender and bitbake demo-image-base:

root@jetson-xavier-nx-devkit-emmc:~# cat /proc/cmdline
console=ttyTCU0,115200 console=tty0 fbcon=map:0 video=tegrafb earlycon=tegra_comb_uart,mmio32,0x0c168000 gpt rootfs.slot_suffix= usbcore.old_scheme_first=1 tegraid=19.1.2.0.0 maxcpus=6 boot.slot_suffix=_b boot.ratchetvalues=0.4.2 vpr_resize sdhci_tegra.en_boot_part_access=1 systemd.machine_id=4e2dca41797e4f92af897d88f06d9409

However, systemd still writes /etc/machine-id at boot, even when the ID is specified on the command line. This detail is not mentioned in https://www.freedesktop.org/software/systemd/man/machine-id.html, but you can see the logic at https://github.com/systemd/systemd/blob/b33c2757d84d4f14f6c31da1c79dc343c43682e2/src/shared/machine-id-setup.c#L135-L161. The file in /etc is populated with the same content:

root@jetson-xavier-nx-devkit-emmc:~# cat /etc/machine-id
4e2dca41797e4f92af897d88f06d9409
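
The same comparison can be scripted. The sketch below is only an illustration: the helper name cmdline_value is hypothetical, and it assumes that when a key appears more than once the last occurrence should be used.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY=VALUE from a kernel command
# line (last occurrence), or an empty line if the key is absent.
#   $1 = key, $2 = full command line
cmdline_value() (
    set -f                      # no glob expansion while word-splitting
    key=$1
    value=
    for arg in $2; do
        case "$arg" in
            "$key"=*) value=${arg#*=} ;;
        esac
    done
    printf '%s\n' "$value"
)

# On a target, compare the command-line ID with the file systemd wrote:
#   id=$(cmdline_value systemd.machine_id "$(cat /proc/cmdline)")
#   [ "$id" = "$(cat /etc/machine-id)" ] && echo "machine-id matches"
```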

This thread provides some systemd philosophy regarding read only rootfs: https://lists.freedesktop.org/archives/systemd-devel/2021-February/046149.html. The summary is they expect /etc to be writeable and don't appear to be contemplating delta updates, at least in that thread. The thread at systemd/systemd#14131 is also interesting, and also doesn't appear to contemplate delta updates.

The part I don't understand yet is why other platforms using Mender delta updates don't have this problem, and where we differ in this regard. My guess is that the comment at #198 (reply in thread) is related: something about cboot builds means we don't get the ro kernel argument when IMAGE_FEATURES includes read-only-rootfs. This probably relates to https://github.com/OE4T/tegra-demo-distro/blob/402f67f3e9ad7bd4c92f4688d75e1a0fbe97224a/layers/meta-tegrademo/classes/rootfs-postcommands-overrides.bbclass in a way I don't understand yet.

@madisongh
Member

The way I've dealt with this in my projects is to use systemd's volatile root feature, setting systemd.volatile=overlay on the kernel command line, documented here, then handling things like host name and timezone settings with additional programs/scripts to stash the settings in a persistent location elsewhere and restore them at boot time.

Getting this to work right with Yocto builds involves a bit of additional work, essentially turning off the volatile-binds stuff (/var/volatile is unnecessary in this setup) and adjusting files/fs-perms.txt. I have a subset of the changes in my test distro.

Overlayfs support in the 4.9 kernel has some problems, so I also wound up importing the back-port of the 4.19 overlayfs support to 4.9 that one of the overlayfs developers maintains.
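
One way to sanity-check that systemd.volatile=overlay took effect is to look at the filesystem type mounted on /. The sketch below assumes the volatile root shows up with type overlay in /proc/mounts; the helper name root_fstype is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mounts-style text on stdin
# (fields: device mountpoint fstype options ...) and print the filesystem
# type of the last entry whose mount point is "/".
root_fstype() {
    awk '$2 == "/" { t = $3 } END { if (t != "") print t }'
}

# On a target:
#   root_fstype < /proc/mounts
```

Anything other than overlay here would suggest the kernel argument is not being honored, for example because the running systemd or kernel lacks the required support.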

@krisvanrens

Ugh, apologies for not getting back to this! The suggested links to discussions and solutions are great. Thanks @dwalkes @madisongh 👍🏻

We ended up adding some systemd scripts to deal with things like setting the host name from the mounted overlay settings etc. Not the prettiest solution, but simple and very reliable. Perhaps in a future incarnation of our Yocto project we will investigate a more thorough approach.
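
For readers wondering what such a script might look like, here is a minimal sketch of the hostname case. It is not the script used in this project: the helper name read_saved_hostname is made up, and the sketch assumes it runs from a unit ordered after the /etc overlay mount, so that /etc/hostname already reflects the persisted setting.

```shell
#!/bin/sh
# Hypothetical helper: print the first non-empty line of FILE with
# surrounding whitespace removed. Prints nothing if the file is
# missing, unreadable, or empty.
read_saved_hostname() {
    [ -r "$1" ] || return 0
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1" | head -n 1
}

# Assumed boot-time use, after the /etc overlay is mounted:
#   name=$(read_saved_hostname /etc/hostname)
#   [ -n "$name" ] && hostname "$name"
```

The same pattern (read the value from the overlay, then apply it at runtime) would presumably extend to other settings that systemd reads too early, such as the timezone.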
