ds-identify: fails to recognize NoCloud datasource on boot cause it does not have /sbin in $PATH and thus does not find blkid #3182

Closed
ubuntu-server-builder opened this issue May 11, 2023 · 33 comments
Labels
launchpad Migrated from Launchpad

Comments

@ubuntu-server-builder

This bug was originally filed in Launchpad as LP: #1771382

Launchpad details
affected_projects = ['cloud-init (openSUSE)']
assignee = None
assignee_name = None
date_closed = 2018-06-20T18:06:00.396760+00:00
date_created = 2018-05-15T15:37:38.152796+00:00
date_fix_committed = 2018-05-22T14:11:10.843529+00:00
date_fix_released = 2018-06-20T18:06:00.396760+00:00
id = 1771382
importance = low
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1771382
milestone = None
owner = ms-proact
owner_name = Martin Steigerwald
private = False
status = fix_released
submitter = ms-proact
submitter_name = Martin Steigerwald
tags = []
duplicates = []

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-15T15:37:38.152796+00:00

cloud-init 18.2 from http://download.opensuse.org/repositories/Cloud:/Tools/SLE_12_SP3/ on SLES 12 SP 3 with NoCloud data source via Cloud Init drive made by Proxmox.

On SLES 12 SP3 NoCloud data source was not working, despite

slestemplate:~ # blkid -c /dev/null -o export
[…]
DEVNAME=/dev/sr0
UUID=2018-05-15-16-34-27-00
LABEL=cidata
TYPE=iso9660
[…]

with the necessary files on it. blkid gives 0 as its return code.

Why?

I only kept parts of the output:

slestemplate:/etc/cloud # cat /run/cloud-init/ds-identify.log
[up 8.63s] ds-identify
policy loaded: mode=search report=false found=all maybe=all notfound=disabled
no datasource_list found, using default: MAAS ConfigDrive NoCloud AltCloud Azure Bigstep CloudSigma CloudStack DigitalOcean AliYun Ec2 GCE OpenNebula OpenStack OVF SmartOS Scaleway Hetzner IBMCloud
ERROR: failed running [127]: blkid -c /dev/null -o export
[…]
FS_LABELS=unavailable:error
ISO9660_DEVS=unavailable:error

It might have been that I had not yet added the Cloud Init drive in Proxmox.

A subsequent call to

slestemplate:~ # /usr/lib/cloud-init/ds-identify

did not yet yield a different result.

Only by analysing the source did I find that it caches results and that I can use the --force option to override this. I did this and the NoCloud datasource got detected properly. Apparently this is cached now.

The tool would only inform of the caching as a DEBUG message. However, I set logging to INFO for all parts of Cloud Init, as the FileHandler clutters the log with tons of messages about how many bytes it read from each file. Sure, I could use INFO only for FileHandler.

Several issues reduce the ease of administration here:

  1. Don't cache errors. Really… just… don't.

  2. Don't cache errors almost silently (just as a debug message).

  3. Decide wisely what is a debug message and what is not.

  4. A search for ds-identify in the documentation available at https://cloudinit.readthedocs.io/en/latest/ did not yield any result.

  5. And in general: Keep it short and simple.

IMHO the first is the most important: Don't cache errors. If the resource is there now, recognize it, without further discussion.
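To illustrate the "don't cache errors" point, here is a minimal sketch, not ds-identify's actual implementation: a wrapper that stores a result only when the detection command succeeds, and reports a failure on stderr instead of hiding it at debug level. The function name `run_cached` and the cache file argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: cache only successful results, never errors.
# Usage: run_cached CACHE_FILE CMD [ARGS...]
run_cached() {
    cache_file=$1
    shift
    if [ -f "$cache_file" ]; then
        # A previous run succeeded; reuse its output.
        cat "$cache_file"
        return 0
    fi
    out=$("$@") || {
        ret=$?
        # Report the failure visibly and do NOT write a cache entry,
        # so the next call runs the command again.
        echo "WARN: '$*' failed with status $ret; result not cached" >&2
        return "$ret"
    }
    printf '%s\n' "$out" > "$cache_file"
    printf '%s\n' "$out"
}
```

With this shape, a first run that fails (for example because the config drive was not attached yet) leaves no cache behind, and a later run picks up the drive without needing a force option.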

Related bugs:

  • bug 1791691: [systemd] PATH broken in systemd units
@ubuntu-server-builder ubuntu-server-builder added the launchpad Migrated from Launchpad label May 11, 2023
@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-16T10:20:37.418939+00:00

I now cloned a VM from the Cloud Init image that I "fixed" by running ds-identify --force, but I am running into the same thing again:

Cloud Init is disabled again, since ds-identify fails to run blkid command:

slestemplate:~ # cat /run/cloud-init/ds-identify.log
[… Hostname should have been changed to sles1 by Cloud Init …]
[up 11.01s] ds-identify
policy loaded: mode=search report=false found=all maybe=all notfound=disabled
no datasource_list found, using default: MAAS ConfigDrive NoCloud AltCloud Azure Bigstep CloudSigma CloudStack DigitalOcean AliYun Ec2 GCE OpenNebula OpenStack OVF SmartOS Scaleway Hetzner IBMCloud
ERROR: failed running [127]: blkid -c /dev/null -o export
DMI_PRODUCT_NAME=Standard PC (i440FX + PIIX, 1996)
DMI_SYS_VENDOR=QEMU
DMI_PRODUCT_SERIAL=
DMI_PRODUCT_UUID=0CE53BB3-A48E-44F7-9EC7-3C339E9C80D3
PID_1_PRODUCT_NAME=unavailable
DMI_CHASSIS_ASSET_TAG=
FS_LABELS=unavailable:error
ISO9660_DEVS=unavailable:error
KERNEL_CMDLINE=BOOT_IMAGE=/boot/vmlinuz-4.4.73-5-default root=/dev/mapper/sys0-rootfs splash=silent quiet showopts console=tty0 console=ttyS0,115200
VIRT=kvm
UNAME_KERNEL_NAME=Linux
UNAME_KERNEL_RELEASE=4.4.73-5-default
UNAME_KERNEL_VERSION=#1 SMP Tue Jul 4 15:33:39 UTC 2017 (b7ce4e4)
UNAME_MACHINE=x86_64
UNAME_NODENAME=slestemplate
UNAME_OPERATING_SYSTEM=GNU/Linux
DSNAME=
DSLIST=MAAS ConfigDrive NoCloud AltCloud Azure Bigstep CloudSigma CloudStack DigitalOcean AliYun Ec2 GCE OpenNebula OpenStack OVF SmartOS Scaleway Hetzner IBMCloud
MODE=search
ON_FOUND=all
ON_MAYBE=all
ON_NOTFOUND=disabled
pid=1138 ppid=1133
is_container=false
ec2 platform is 'Unknown'.
No ds found [mode=search, notfound=disabled]. Disabled cloud-init [1]
[up 11.05s] returning 1

Which leads to:
slestemplate:~ # cat /run/cloud-init/cloud-init-generator.log
/usr/lib/systemd/system-generators/cloud-init-generator normal=/run/systemd/generator early=/run/systemd/generator.early late=/run/systemd/generator.late
kernel command line (/proc/cmdline): BOOT_IMAGE=/boot/vmlinuz-4.4.73-5-default root=/dev/mapper/sys0-rootfs splash=silent quiet showopts console=tty0 console=ttyS0,115200
kernel_cmdline found unset
etc_file found unset
default found enabled
checking for datasource
ds-identify rc=1
ds-identify _RET=notfound
cloud-init is enabled but no datasource found, disabling
already disabled: no change needed [no /run/systemd/generator.early/multi-user.target.wants/cloud-init.target

However just using it again with the force option fixes the issue:

slestemplate:~ # /usr/lib/cloud-init/ds-identify --force
slestemplate:~ # cat /run/cloud-init/ds-identify.log
[up 11.01s] ds-identify
policy loaded: mode=search report=false found=all maybe=all notfound=disabled
no datasource_list found, using default: MAAS ConfigDrive NoCloud AltCloud Azure Bigstep CloudSigma CloudStack DigitalOcean AliYun Ec2 GCE OpenNebula OpenStack OVF SmartOS Scaleway Hetzner IBMCloud
ERROR: failed running [127]: blkid -c /dev/null -o export
DMI_PRODUCT_NAME=Standard PC (i440FX + PIIX, 1996)
DMI_SYS_VENDOR=QEMU
DMI_PRODUCT_SERIAL=
DMI_PRODUCT_UUID=0CE53BB3-A48E-44F7-9EC7-3C339E9C80D3
PID_1_PRODUCT_NAME=unavailable
DMI_CHASSIS_ASSET_TAG=
FS_LABELS=unavailable:error
ISO9660_DEVS=unavailable:error
KERNEL_CMDLINE=BOOT_IMAGE=/boot/vmlinuz-4.4.73-5-default root=/dev/mapper/sys0-rootfs splash=silent quiet showopts console=tty0 console=ttyS0,115200
VIRT=kvm
UNAME_KERNEL_NAME=Linux
UNAME_KERNEL_RELEASE=4.4.73-5-default
UNAME_KERNEL_VERSION=#1 SMP Tue Jul 4 15:33:39 UTC 2017 (b7ce4e4)
UNAME_MACHINE=x86_64
UNAME_NODENAME=slestemplate
UNAME_OPERATING_SYSTEM=GNU/Linux
DSNAME=
DSLIST=MAAS ConfigDrive NoCloud AltCloud Azure Bigstep CloudSigma CloudStack DigitalOcean AliYun Ec2 GCE OpenNebula OpenStack OVF SmartOS Scaleway Hetzner IBMCloud
MODE=search
ON_FOUND=all
ON_MAYBE=all
ON_NOTFOUND=disabled
pid=1138 ppid=1133
is_container=false
ec2 platform is 'Unknown'.
No ds found [mode=search, notfound=disabled]. Disabled cloud-init [1]
[up 11.05s] returning 1
[up 544.63s] ds-identify --force
policy loaded: mode=search report=false found=all maybe=all notfound=disabled
no datasource_list found, using default: MAAS ConfigDrive NoCloud AltCloud Azure Bigstep CloudSigma CloudStack DigitalOcean AliYun Ec2 GCE OpenNebula OpenStack OVF SmartOS Scaleway Hetzner IBMCloud
DMI_PRODUCT_NAME=Standard PC (i440FX + PIIX, 1996)
DMI_SYS_VENDOR=QEMU
DMI_PRODUCT_SERIAL=
DMI_PRODUCT_UUID=0CE53BB3-A48E-44F7-9EC7-3C339E9C80D3
PID_1_PRODUCT_NAME=unavailable
DMI_CHASSIS_ASSET_TAG=
FS_LABELS=cidata
ISO9660_DEVS=/dev/sr0=cidata
KERNEL_CMDLINE=BOOT_IMAGE=/boot/vmlinuz-4.4.73-5-default root=/dev/mapper/sys0-rootfs splash=silent quiet showopts console=tty0 console=ttyS0,115200
VIRT=kvm
UNAME_KERNEL_NAME=Linux
UNAME_KERNEL_RELEASE=4.4.73-5-default
UNAME_KERNEL_VERSION=#1 SMP Tue Jul 4 15:33:39 UTC 2017 (b7ce4e4)
UNAME_MACHINE=x86_64
UNAME_NODENAME=slestemplate
UNAME_OPERATING_SYSTEM=GNU/Linux
DSNAME=
DSLIST=MAAS ConfigDrive NoCloud AltCloud Azure Bigstep CloudSigma CloudStack DigitalOcean AliYun Ec2 GCE OpenNebula OpenStack OVF SmartOS Scaleway Hetzner IBMCloud
MODE=search
ON_FOUND=all
ON_MAYBE=all
ON_NOTFOUND=disabled
pid=2762 ppid=2314
is_container=false
check for 'NoCloud' returned found
ec2 platform is 'Unknown'.
Found single datasource: NoCloud
[up 544.70s] returning 0

I can also reproduce the behavior on the template VM just by rebooting it.

It appears that the initial check for NoCloud data source during boot fails. So the caching may not be the issue here. Retitling.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-16T10:29:36.729631+00:00

The command reliably works here on the fully booted VM:

slestemplate:~ # blkid -c /dev/null -o export >/dev/null ; echo $?
0

DEVNAME=/dev/sda1
UUID=[…]
TYPE=LVM2_member
PARTUUID=[…]

DEVNAME=/dev/sr0
UUID=2018-05-16-12-18-22-00
LABEL=cidata
TYPE=iso9660

DEVNAME=/dev/mapper/0QEMU_QEMU_HARDDISK_drive-scsi0
PTUUID=[…]
PTTYPE=dos

DEVNAME=/dev/mapper/0QEMU_QEMU_HARDDISK_drive-scsi0-part1
UUID=Tlko7f-sFXz-tgEn-3Pks-m652-XxVK-tl5dIB
TYPE=LVM2_member
PARTUUID=[…]

DEVNAME=/dev/mapper/sys0-rootfs
UUID=[…]
UUID_SUB=[…]
TYPE=btrfs

I think I am just going to hardcode the datasource now. Either via configuration file if possible or by hacking the shell script.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-16T11:39:48.127955+00:00

I ran the blkid command 10 times in a row and always got return code 0. So I don't get why:

out=$(blkid -c /dev/null -o export) || {
    ret=$?
    error "failed running [$ret]: blkid -c /dev/null -o export"
    DI_FS_LABELS="$UNAVAILABLE:error"
    DI_ISO9660_DEVS="$UNAVAILABLE:error"
    return $ret
}

is not working on boot. It looks like correct shell code. The only idea I have:

It is run too early for the ISO device to become available. Okay, testing for this with this change:

slestemplate:~ # git diff /usr/lib/cloud-init/ds-identify.orig /usr/lib/cloud-init/ds-identify
diff --git a/usr/lib/cloud-init/ds-identify.orig b/usr/lib/cloud-init/ds-identify
index 9a2db5c..2083734 100755
--- a/usr/lib/cloud-init/ds-identify.orig
+++ b/usr/lib/cloud-init/ds-identify
@@ -199,14 +199,24 @@ read_fs_info() {
         return
     fi
     local oifs="$IFS" line="" delim=","
-    local ret=0 out="" labels="" dev="" label="" ftype="" isodevs="" uuids=""
-    out=$(blkid -c /dev/null -o export) || {
+    local ret=1 out="" labels="" dev="" label="" ftype="" isodevs="" uuids=""
+    local attempt=1
+    while [ $ret -ne 0 -a $attempt -le 10 ]; do
+        out=$( blkid -c /dev/null -o export 2>&1 )
+        ret=$?
+        if [ $ret -ne 0 ]; then
+            error "failed running [$ret]: blkid -c /dev/null -o export, attempt: $attempt, output: $out"
+            sleep 2
+        fi
+        let attempt++;
+    done
+    if [ $ret -ne 0 ]; then
         ret=$?
         error "failed running [$ret]: blkid -c /dev/null -o export"
         DI_FS_LABELS="$UNAVAILABLE:error"
         DI_ISO9660_DEVS="$UNAVAILABLE:error"
         return $ret
-    }
+    fi
     # 'set --' will collapse multiple consecutive entries in IFS for
     # whitespace characters (\n, tab, " ") so we cannot rely on getting
     # empty lines in "$@" below.

Which gets me:

slestemplate:~ # cat /run/cloud-init/ds-identify.log | head -7
[up 8.80s] ds-identify
policy loaded: mode=search report=false found=all maybe=all notfound=disabled
no datasource_list found, using default: MAAS ConfigDrive NoCloud AltCloud Azure Bigstep CloudSigma CloudStack DigitalOcean AliYun Ec2 GCE OpenNebula OpenStack OVF SmartOS Scaleway Hetzner IBMCloud
ERROR: failed running [127]: blkid -c /dev/null -o export, attempt: 1, output: /usr/lib/cloud-init/ds-identify: line 205: blkid: command not found
ERROR: failed running [127]: blkid -c /dev/null -o export, attempt: 2, output: /usr/lib/cloud-init/ds-identify: line 205: blkid: command not found
ERROR: failed running [127]: blkid -c /dev/null -o export, attempt: 3, output: /usr/lib/cloud-init/ds-identify: line 205: blkid: command not found
ERROR: failed running [127]: blkid -c /dev/null -o export, attempt: 4, output: /usr/lib/cloud-init/ds-identify: line 205: blkid: command not found

Which may just mean that during startup via Systemd

slestemplate:~ # type blkid
blkid is /sbin/blkid

is not in path.

And, well, I have now learned from the bash man page that this is exactly what the exit status tells me (bash(1)):

   If a command is not found, the child process created
   to execute it returns a status of 127.  If a command
   is found but is not executable, the return status is
   126.
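The two exit statuses quoted above can be reproduced with a short, self-contained script. This is only an illustration; the command name and the temporary file are made up for the demonstration.

```shell
#!/bin/sh
# Demonstrate the shell exit statuses for "command not found" and
# "found but not executable".
tmpdir=$(mktemp -d)
printf '#!/bin/sh\necho hi\n' > "$tmpdir/not-executable"
chmod 644 "$tmpdir/not-executable"

# A command name that should not exist anywhere (hypothetical name).
sh -c 'definitely-not-a-real-command-xyz' 2>/dev/null \
    && status_missing=0 || status_missing=$?

# A file that exists but lacks the execute bit.
sh -c '"$1"' sh "$tmpdir/not-executable" 2>/dev/null \
    && status_noexec=0 || status_noexec=$?

echo "missing command: $status_missing"
echo "not executable:  $status_noexec"
rm -rf "$tmpdir"
```

Seeing 127 in ds-identify.log therefore points at a lookup problem (PATH), not at blkid itself failing or the device being unavailable.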

But it does not seem that the systemd generator is being run on reboot, cause I added:

slestemplate:~ # diff -u cloud-init-generator.orig /usr/lib/systemd/system-generators/cloud-init-generator
--- cloud-init-generator.orig 2018-05-16 13:08:41.302467498 +0200
+++ /usr/lib/systemd/system-generators/cloud-init-generator 2018-05-16 13:22:26.661939261 +0200
@@ -1,6 +1,8 @@
#!/bin/sh
set -f

+echo "PATH: $PATH" > /root/path
+
LOG=""
DEBUG_LEVEL=1
LOG_D="/run/cloud-init"

at the beginning of it, yet got no output in /root/path after reboot, while when running it manually I do get the output. So it appears that on reboot something else is calling it, and this does not have /sbin in its path.

I have no clue what else might be calling it:

slestemplate:/etc # grep -ir "ds-identify" .

slestemplate:/usr/lib/systemd # grep -ir "ds-identify"

only reports that system-generators/cloud-init-generator.

Also nothing in

slestemplate:/var # LANG=en grep -ir "ds-identify" .
Binary file ./lib/rpm/Packages matches
Binary file ./lib/rpm/Basenames matches
Binary file ./lib/mlocate/mlocate.db matches

So I am done with it for now and will just hardcode the path in ds-identify to /sbin/blkid.

And voilà, this finally works. After a few dozen attempts and reboots I have finally found the root cause and a work-around. I think that, to be really portable, ds-identify needs to try harder to find blkid, because hard-coding it to the UsrMerge location /usr/sbin/blkid is going to break on Debian and Ubuntu as long as UsrMerge is not done there. Or one might use /sbin/blkid, as this is hard-linked to /usr/sbin on SLES 12 and RHEL 7, and I bet these hard links will be kept around for decades.
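One way to "try harder" without hard-coding a single location is to probe a list of candidate directories. The following is a sketch under that assumption only; `find_in_dirs` is a hypothetical helper and is not part of ds-identify.

```shell
#!/bin/sh
# Hypothetical helper: print the first executable NAME found in the given
# directories, independent of what PATH currently contains.
# Usage: find_in_dirs NAME DIR [DIR...]
find_in_dirs() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Example: look for blkid in both merged and unmerged /usr locations.
if blkid_bin=$(find_in_dirs blkid /sbin /usr/sbin /bin /usr/bin); then
    echo "blkid found at $blkid_bin"
else
    echo "blkid not found in the candidate directories" >&2
fi
```

This keeps working whether or not a distribution has done UsrMerge, at the cost of maintaining the directory list.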

Gosh, this works. This finally works. Retitling again and adding patch.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-16T11:46:47.836829+00:00

I had 18.2-1.1.x86_64 but

http://download.opensuse.org/repositories/Cloud:/Tools/SLE_12_SP3/x86_64/cloud-init-18.2-2.1.x86_64.rpm

is also affected.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(martin-steigerwald) wrote on 2018-05-16T11:54:15+00:00

Please see upstream bug report for all the details on this:

ds-identify: fails to recognize NoCloud datasource on boot cause it does not have /sbin in $PATH and thus does not find blkid
https://bugs.launchpad.net/cloud-init/+bug/1771382

Minimal patch to fix the issue:

slestemplate:~ # diff -u ds-identify.orig /usr/lib/cloud-init/ds-identify
--- ds-identify.orig 2018-05-16 13:34:06.376646777 +0200
+++ /usr/lib/cloud-init/ds-identify 2018-05-16 13:47:59.215541889 +0200
@@ -200,7 +200,7 @@
     fi
     local oifs="$IFS" line="" delim=","
     local ret=0 out="" labels="" dev="" label="" ftype="" isodevs="" uuids=""
-    out=$(blkid -c /dev/null -o export) || {
+    out=$(/sbin/blkid -c /dev/null -o export) || {
         ret=$?
         error "failed running [$ret]: blkid -c /dev/null -o export"
         DI_FS_LABELS="$UNAVAILABLE:error"

Of course with UsrMerge you could also use /usr/sbin/blkid.

As stated in the upstream bug report, I have not the slightest idea what is calling ds-identify during boot. I thought it would be the systemd cloud-init generator, but I added debug output to it and it apparently is not called. For all the gory details see the upstream bug report.

Proper fix might be to make sure blkid is in $PATH.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(martin-steigerwald) wrote on 2018-05-16T11:58:12+00:00

Created attachment 770432
minimal patch to add path to ds-identify

@ubuntu-server-builder

Launchpad user Martin Steigerwald(martin-steigerwald) wrote on 2018-05-16T12:01:53+00:00

Created attachment 770433
patch with additional debug output + reattempting

I first thought blkid might be called too early during boot, but that is not true. 127 is bash's return code for command not found. Still attaching it as a reference.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-16T12:15:26.470274+00:00

I have no idea how to attach patches as files here, but I attached them in downstream bug report:

https://bugzilla.opensuse.org/show_bug.cgi?id=1093501

@ubuntu-server-builder

Launchpad user Robert Schweikert(rjschwei) wrote on 2018-05-16T12:30:29+00:00

We will not take this patch.

At present it is not understood why PATH is not always part of the environment when the generator runs. This is being investigated.

For example the generator that spawns ds-identify also runs in the SUSE published images in AWS and there is no problem with finding blkid.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(martin-steigerwald) wrote on 2018-05-16T13:28:21+00:00

(In reply to Robert Schweikert from comment #3)

> We will not take this patch.

Fair enough, I will use it nonetheless, cause with it I have it working now.

> At present it is not understood why PATH is not always part of the
> environment when the generator runs. This is being investigated.

Interesting. That SLES 12 image has some service pack migrations behind it already. It's a minimal image I installed and adapted myself for training purposes. It has only minimal adaptations in configuration:

These are the files I checked in to git repo – my changes are limited to these files (I did not adapt os-release and so on of course):

etc/SuSE-release
etc/cloud
etc/cloud/cloud.cfg
etc/cloud/cloud.cfg.d
etc/cloud/cloud.cfg.d/05_logging.cfg
etc/default
etc/default/grub
etc/dracut.conf.d
etc/dracut.conf.d/90-hostonly.conf
etc/fstab
etc/group
etc/hostname
etc/hosts
etc/os-release
etc/passwd
etc/resolv.conf
etc/screenrc
etc/snapper
etc/snapper/configs
etc/snapper/configs/root
etc/sysconfig
etc/sysconfig/network
etc/sysconfig/network/ifcfg-eth0
etc/sysconfig/network/routes
etc/zypp
etc/zypp/repos.d
etc/zypp/repos.d/Cloud_Tools.repo
etc/zypp/repos.d/SLES12-SP3_12.3-0.repo

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-16T13:43:21.469875+00:00

Downstream is investigating why PATH is not always in the environment of the cloud-init systemd generator. I never clearly noted: the openSUSE Build Service cloud-init 18.2 packages are still experimental.

@ubuntu-server-builder

Launchpad user Robert Schweikert(rjschwei) wrote on 2018-05-16T15:40:11+00:00

After receiving additional input and based on what was already known I have decided to drop the generator from our package.

The generator, which in turn runs ds-identify, where the problem is created by using blkid, speeds up the boot process in cases where cloud-init should not be running in the first place. The chain of events is as follows:

generator runs ds-identify
if ds-identify cannot find a data source the cloud-init services are disabled

With cloud-init disabled the boot is sped up as no Python code gets executed.

A reasonable assumption is that the person installing and enabling cloud-init knows they run in an environment where cloud-init is needed. Thus looking for the data source twice, once in ds-identify and then again in the cloud-init Python code, is not really an advantage.

Dropping the generator avoids the problem with blkid and it avoids looking for the data source twice, once in shell code and once in Python code.
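The chain of events described above can be sketched roughly as follows. This is an illustration only and not the real cloud-init-generator: `decide_state` is a hypothetical function, and `true`/`false` stand in for a ds-identify run that does or does not find a data source.

```shell
#!/bin/sh
# Hypothetical sketch of the generator's decision: a zero exit status from
# the detection step keeps cloud-init enabled, anything else disables it.
# Usage: decide_state CMD [ARGS...]   (prints "enabled" or "disabled")
decide_state() {
    if "$@" >/dev/null 2>&1; then
        echo enabled
    else
        echo disabled
    fi
}

# Stand-ins for ds-identify: "true" simulates a found data source,
# "false" simulates "No ds found".
decide_state true
decide_state false
```

Note that in such a scheme any failure of the detection step, including a helper like blkid not being found, ends in the same "disabled" outcome as a genuinely absent data source.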

Change is on the way to Factory

@ubuntu-server-builder

Launchpad user Swamp-a(swamp-a) wrote on 2018-05-16T16:00:39+00:00

This is an autogenerated message for OBS integration:
This bug (1093501) was mentioned in
https://build.opensuse.org/request/show/609843 Factory / cloud-init

@ubuntu-server-builder

Launchpad user Scott Moser(smoser) wrote on 2018-05-17T17:32:24.294552+00:00

For what it's worth, the caching of the ds-identify result is due to cloud-init-generator being called multiple times in a boot and thus ds-identify being called multiple times. We wanted to avoid 'blkid' calls re-searching disks during a high-I/O phase such as boot.

We do intend to make ds-identify more useful stand-alone. In doing that it would make sense to have the systemd generator use a "--respect-previous-run" option or something similar and only then cache the result.

@ubuntu-server-builder

Launchpad user Scott Moser(smoser) wrote on 2018-05-17T20:14:24.185007+00:00

I've put a merge proposal up.
https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/345786

that will ensure that PATH is set to include common locations.

I really am interested in cloud-init doing the right thing in all cases, and thus I would like to have ds-identify enabled in SUSE and am willing to carry the change there so that we can assume a sane PATH, even though I think a sane PATH should be set by the system rather than by any program that expects to execute other programs.
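The idea of making PATH include common locations can be sketched as below. This is a minimal illustration of the approach, not a copy of the merge proposal; the function name `ensure_common_path` and the directory list are assumptions.

```shell
#!/bin/sh
# Sketch: append common system directories to PATH when they are missing.
# Function name and directory list are assumptions for illustration.
ensure_common_path() {
    for dir in /sbin /usr/sbin /bin /usr/bin; do
        case ":$PATH:" in
            *:"$dir":*) continue ;;   # already present, keep existing order
        esac
        PATH="${PATH:+$PATH:}$dir"
    done
    export PATH
}

# Demonstration in a subshell, starting from a PATH without any sbin entry.
(
    PATH=/usr/local/bin:/usr/bin:/bin
    ensure_common_path
    echo "$PATH"
)
```

Appending rather than prepending keeps whatever order the system already set, and calling the function twice leaves PATH unchanged.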

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-22T10:56:43+00:00

Just removing the generator leads to cloud-init services not being run at startup on my SLES 12 SP 3 VM with

slestemplate:~ # find /etc/systemd | grep cloud
/etc/systemd/system/cloud-init.target.wants
/etc/systemd/system/cloud-init.target.wants/cloud-final.service
/etc/systemd/system/cloud-init.target.wants/cloud-config.service
/etc/systemd/system/cloud-init.target.wants/cloud-init-local.service
/etc/systemd/system/cloud-init.target.wants/cloud-init.service

All services are enabled according to systemctl status SERVICE

slestemplate:~ # rpm -qa | grep cloud-init
cloud-init-config-suse-18.2-3.1.x86_64
cloud-init-18.2-3.1.x86_64

slestemplate:~ # systemctl status cloud-init.target
● cloud-init.target - Cloud-init target
Loaded: loaded (/usr/lib/systemd/system/cloud-init.target; static; vendor preset: disabled)
Active: inactive (dead)
slestemplate:~ # systemctl status cloud-config.target
● cloud-config.target - Cloud-config availability
Loaded: loaded (/usr/lib/systemd/system/cloud-config.target; static; vendor preset: disabled)
Active: inactive (dead)

The output I get is:

slestemplate:~ # systemctl |grep cloud

When I do:

slestemplate:~ # systemctl start cloud-init.target
slestemplate:~ # systemctl | grep cloud
cloud-config.service loaded active exited Apply the settings specified in cloud-config
cloud-final.service loaded active exited Execute cloud user/final scripts
cloud-init-local.service loaded active exited Initial cloud-init job (pre-networking)
cloud-init.service loaded active exited Initial cloud-init job (metadata service crawler)
cloud-config.target loaded active active Cloud-config availability
cloud-init.target loaded active active Cloud-init target

So it is still not working out of the box. I enabled all services and the targets with systemctl enable. It is not obvious to me how to enable cloud-init in case the systemd generator does not do it. I thought I did, but apparently I did not.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-22T10:59:43+00:00

Also please note that according to

https://bugs.launchpad.net/cloud-init/+bug/1771382/comments/15

upstream developer Scott Moser would like to see cloud-init doing the sane thing in all cases and added a merge proposal for setting the PATH to common locations. He would like to see ds-identify enabled in SUSE.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-22T11:06:13.419848+00:00

Scott, I mentioned your comment about having a sane PATH everywhere and ds-identify enabled in SUSE at the SUSE bugtracker. Their change to just remove the generator did not yield the expected result on my SLES 12 SP 3 VM. Cloud Init is simply not started at all then. Reported there.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-22T11:15:33+00:00

(In reply to Martin Steigerwald from comment #8)

> Just removing the generator leads to cloud-init services not being run at
> startup on my SLES 12 SP 3 VM with
>
> slestemplate:~ # find /etc/systemd | grep cloud
> /etc/systemd/system/cloud-init.target.wants
> /etc/systemd/system/cloud-init.target.wants/cloud-final.service
> /etc/systemd/system/cloud-init.target.wants/cloud-config.service
> /etc/systemd/system/cloud-init.target.wants/cloud-init-local.service
> /etc/systemd/system/cloud-init.target.wants/cloud-init.service
> […]
> slestemplate:~ # systemctl status cloud-init.target
> ● cloud-init.target - Cloud-init target
> Loaded: loaded (/usr/lib/systemd/system/cloud-init.target; static; vendor
> preset: disabled)
> Active: inactive (dead)
> […]
> So it is still not working out of the box. I enabled all services and the
> targets with systemctl enable. It is not obvious for me to enable cloud init
> in case the Systemd generator does not do it. I thought I did, but
> apparently I did not.

With the work-around

slestemplate:/etc/systemd/system # mv cloud-init.target.wants/* multi-user.target.wants/

cloud-init is started on boot.

Seems cloud-init.target is never triggered. I am not really experienced with targets in Systemd.

@ubuntu-server-builder

Launchpad user Robert Schweikert(rjschwei) wrote on 2018-05-22T11:45:36+00:00

  • Please set the datasource in your cloud.cfg file
  • Disable the target
  • Enable the services

cloud.cfg should contain something along these lines:

datasource_list: [ NoCloud,......, None ]

systemctl disable cloud-init.target
systemctl enable cloud-init-local
systemctl enable cloud-init
systemctl enable cloud-config
systemctl enable cloud-final

or if you are building images with kiwi add the following to config.sh

suseInsertService cloud-init-local
suseInsertService cloud-init
suseInsertService cloud-config
suseInsertService cloud-final

For the next version 18.3 I expect a better solution as upstream has a pending patch for the PATH issue

@ubuntu-server-builder

Launchpad user Scott Moser(smoser) wrote on 2018-05-22T14:11:09.305026+00:00

An upstream commit landed for this bug.

To view that commit see the following URL:
https://git.launchpad.net/cloud-init/commit/?id=b4ae0e1f

@ubuntu-server-builder

Launchpad user Robert Schweikert(rjschwei) wrote on 2018-05-22T19:46:56+00:00

OK, this is clearly putting the burden on the user and not really what we want to do. I've pulled the upstream patch to address the PATH issue. New cloud-init is on its way to Factory and available in Cloud:Tools

@ubuntu-server-builder

Launchpad user Swamp-a(swamp-a) wrote on 2018-05-22T20:20:12+00:00

This is an autogenerated message for OBS integration:
This bug (1093501) was mentioned in
https://build.opensuse.org/request/show/611409 Factory / cloud-init

@ubuntu-server-builder

Launchpad user Swamp-a(swamp-a) wrote on 2018-05-23T22:14:54+00:00

openSUSE-RU-2018:1407-1: An update that has two recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1089824,1093501
CVE References:
Sources used:
openSUSE Leap 15.0 (src): cloud-init-18.2-lp150.2.3.1

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-24T10:22:06+00:00

(In reply to Robert Schweikert from comment #12)

> OK, this is clearly putting the burden on the user and not really what we
> want to do. I've pulled the upstream patch to address the PATH issue. New
> cloud-init on its way to Factory and available in Cloud:Tools

Working out of the box with:

slestemplate:~ # rpm -qa | grep cloud
cloud-init-18.2-4.1.x86_64
cloud-init-config-suse-18.2-4.1.x86_64

Thank you, Robert.

@ubuntu-server-builder

Launchpad user Martin Steigerwald(ms-proact) wrote on 2018-05-24T10:49:26.856738+00:00

Fix confirmed to work with:

slestemplate:~ # rpm -qa | grep cloud
cloud-init-18.2-4.1.x86_64
cloud-init-config-suse-18.2-4.1.x86_64

Thank you, Scott.

@ubuntu-server-builder

Launchpad user Swamp-a(swamp-a) wrote on 2018-06-05T13:50:37+00:00

This is an autogenerated message for OBS integration:
This bug (1093501) was mentioned in
https://build.opensuse.org/request/show/614273 15.0 / cloud-init

@ubuntu-server-builder

Launchpad user Swamp-a(swamp-a) wrote on 2018-06-07T16:19:37+00:00

SUSE-RU-2018:1575-1: An update that has 10 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1069635,1072811,1080595,1084509,1084749,1085787,1089824,1092637,1093501,997614
CVE References:
Sources used:
SUSE Linux Enterprise Module for Public Cloud 12 (src): cloud-init-18.2-37.14.1
SUSE CaaS Platform ALL (src): cloud-init-18.2-37.14.1
OpenStack Cloud Magnum Orchestration 7 (src): cloud-init-18.2-37.14.1

@ubuntu-server-builder

Launchpad user Swamp-a(swamp-a) wrote on 2018-06-08T19:12:49+00:00

openSUSE-RU-2018:1609-1: An update that has 10 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1069635,1072811,1080595,1084509,1084749,1085787,1089824,1092637,1093501,997614
CVE References:
Sources used:
openSUSE Leap 42.3 (src): cloud-init-18.2-37.1

@ubuntu-server-builder

Launchpad user Swamp-a(swamp-a) wrote on 2018-06-08T19:15:54+00:00

openSUSE-RU-2018:1613-1: An update that has two recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1089824,1093501
CVE References:
Sources used:
openSUSE Leap 15.0 (src): cloud-init-18.2-lp150.2.6.1

@ubuntu-server-builder

Launchpad user Scott Moser(smoser) wrote on 2018-06-20T18:06:01.770440+00:00

This bug is believed to be fixed in cloud-init in version 18.3. If this is still a problem for you, please make a comment and set the state back to New

Thank you.

@ubuntu-server-builder

Launchpad user Dimitri John Ledkov(xnox) wrote on 2018-09-11T18:53:20.401668+00:00

Systemd by default executes things with execv, not execve. Hence the default environment is not available. However, the cloud-init generator is executed by /bin/sh, which does have a built-in default path

$ lxc launch images:opensuse/15.0 test-sh-built-in-path

$ lxc exec test-sh-built-in-path -- env -u PATH /bin/sh -c 'echo $PATH'
/usr/local/bin:/usr/bin:/bin:.

On ubuntu, it is instead:

$ env -u PATH /bin/dash -c 'echo $PATH'
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

$ env -u PATH /bin/bash -c 'echo $PATH'
/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.

Maybe you want to report a bug against SUSE's default /bin/sh about this....

Also, the /bin/dash and /bin/bash differences are awkward....

@ubuntu-server-builder

Launchpad user Dimitri John Ledkov(xnox) wrote on 2018-09-11T19:03:37.665716+00:00

Systemd by default executes things with execv, not execve. Hence the default environment is not available. However, the cloud-init generator is executed by /bin/sh, which does have a built-in default path

$ lxc launch images:opensuse/15.0 test-sh-built-in-path

$ lxc exec test-sh-built-in-path -- env -u PATH /bin/sh -c 'echo $PATH'
/usr/local/bin:/usr/bin:/bin:.

No idea if it is intentional, or not, that "sbin" is excluded there.
