"failed stage init-local" while testing autoinstall on Noble #4630

Closed
p-gentili opened this issue Nov 27, 2023 · 2 comments
Labels
bug Something isn't working correctly priority Fix soon

Comments

@p-gentili

Bug report

I can't make autoinstall work on the latest Noble image. I can see tracebacks in cloud-init.log and, as a result, subiquity doesn't recognize that an autoinstall configuration was provided.

I tested both the USB and HTTP methods for provisioning autoinstall configuration files, with no luck in either case. The logs below refer to the HTTP test.

Steps to reproduce the problem

  • Flash the latest amd64 Noble ISO
  • Host this user-data file over HTTP
#cloud-config
autoinstall:
  version: 1
  identity:
    hostname: ai-test
    password: "$6$exDY1mhS4KUYCE/2$zmn9ToZwTKLhCw.b4/b.ZRTIZM30JZ4QrOQ2aOXJ8yk96xpcCof0kxKwuX1kqLG/ygbJ1f8wxED22bTL4F46P0"
    username: ubuntu
  ssh:
    install-server: true
  storage:
    layout:
      name: hybrid

Apart from this test, I've also tested both of the basic examples here, without success.
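For anyone reproducing the HTTP step, here is a minimal sketch of one way to host the user-data file. It is not the reporter's actual setup: the helper names `make_seed_dir` and `serve_seed`, the port 3003, and the empty `meta-data` file are all assumptions made for illustration.

```python
import functools
import http.server
import pathlib
import threading


def make_seed_dir(directory, user_data):
    """Create a NoCloud-style seed directory (hypothetical helper)."""
    seed = pathlib.Path(directory)
    seed.mkdir(parents=True, exist_ok=True)
    (seed / "user-data").write_text(user_data)
    # Assumption: an empty meta-data file is enough for this test setup.
    (seed / "meta-data").touch()
    return seed


def serve_seed(seed, port=3003):
    """Serve the seed directory over HTTP from a background thread."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(seed)
    )
    server = http.server.ThreadingHTTPServer(("", port), handler)
    # Daemon thread: the server stops when the calling process exits.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The calling process has to stay alive for as long as the installer needs the files. The installer would then typically be pointed at the seed with a kernel command line along the lines of `autoinstall ds=nocloud-net;s=http://<host>:3003/`; that parameter is also an assumption here, since the report does not show the one that was used.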

Environment details

  • Cloud-init version: 23.4~3g0cb0b80f-0ubuntu1
  • Operating System Distribution: Ubuntu Noble
  • Cloud provider, platform or installer type: amd64 ISO

cloud-init logs

2023-11-27 16:00:45,678 - util.py[DEBUG]: failed stage init-local
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 386, in main_init
    init.fetch(existing=existing)
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 469, in fetch
    return self._get_data_source(existing=existing)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 360, in _get_data_source
    (ds, dsname) = sources.find_source(
                   ^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 1028, in find_source
    raise DataSourceNotFoundException(msg)
cloudinit.sources.DataSourceNotFoundException: Did not find any data source, searched classes: (DataSourceNoCloud)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 772, in status_wrapper
    ret = functor(name, args)
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 407, in main_init
    init.apply_network_config(bring_up=bring_up_interfaces)
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 1026, in apply_network_config
    self._apply_netcfg_names(netcfg)
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 948, in _apply_netcfg_names
    atomic_helper.write_json(
  File "/usr/lib/python3/dist-packages/cloudinit/atomic_helper.py", line 89, in write_json
    return write_file(
           ^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/atomic_helper.py", line 65, in write_file
    raise e
  File "/usr/lib/python3/dist-packages/cloudinit/atomic_helper.py", line 46, in write_file
    tf = tempfile.NamedTemporaryFile(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/tempfile.py", line 702, in NamedTemporaryFile
    file = _io.open(dir, mode, buffering=buffering,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/tempfile.py", line 699, in opener
    fd, name = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/tempfile.py", line 395, in _mkstemp_inner
    fd = _os.open(file, flags, 0o600)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/cloud/instance/tmpmxvzqpqp'
@p-gentili p-gentili added bug Something isn't working correctly new An issue that still needs triage labels Nov 27, 2023
@blackboxsw blackboxsw removed the new An issue that still needs triage label Nov 28, 2023
@blackboxsw
Collaborator

I can reproduce this on Noble server-live images in the manual install case as well. Marking this as needed for 23.4. I'm investigating the source of the issue, and we want to ensure it is understood and resolved before we release cloud-init 23.4.

@blackboxsw blackboxsw added the priority Fix soon label Nov 28, 2023
@blackboxsw
Collaborator

OK, the issue was introduced in commit ee86a37 and affects any datasource that is detected only in the Network (init-network) boot stage. When no datasource is detected during the Local boot stage, cloud-init still calls apply_network_config to emit a basic "DHCP on the primary NIC" configuration before system networking is brought up. Since that commit, cloud-init also attempts to write the generated network configuration dictionary to /var/lib/cloud/instance/network-config.json, a path that goes through the /var/lib/cloud/instance symlink. In the Local boot stage, when no datasource has been detected yet, that symlink does not exist, so the write fails with the FileNotFoundError shown in the traceback above. We need to ensure that cloud-init doesn't try to write network-config.json at that point; it should wait until a proper datasource has been detected.
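The failure mode can be illustrated outside cloud-init with a minimal sketch. `write_json_atomically` below is a hypothetical stand-in for an atomic "temp file plus rename" writer, not cloud-init's actual `atomic_helper` code, and the temporary directory stands in for /var/lib/cloud before the `instance` symlink has been created.

```python
import json
import os
import tempfile


def write_json_atomically(path, data):
    """Write JSON via a temp file in the target directory, then rename it."""
    directory = os.path.dirname(path)
    # Raises FileNotFoundError when `directory` does not exist yet.
    tf = tempfile.NamedTemporaryFile(dir=directory, delete=False, mode="w")
    try:
        with tf:
            json.dump(data, tf)
        os.replace(tf.name, path)
    except Exception:
        os.unlink(tf.name)
        raise


with tempfile.TemporaryDirectory() as cloud_dir:
    # Stand-in for /var/lib/cloud/instance before a datasource is detected:
    # the path is never created, so the parent directory is missing.
    instance_link = os.path.join(cloud_dir, "instance")
    target = os.path.join(instance_link, "network-config.json")
    try:
        write_json_atomically(target, {"version": 2})
    except FileNotFoundError as err:
        print("write failed:", err.strerror)
```

The temp file has to live in the same directory as the final file so that the rename stays atomic, which is why a missing parent directory surfaces as an error from `tempfile` rather than from the final write.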

blackboxsw added a commit to blackboxsw/cloud-init that referenced this issue Nov 28, 2023
In Local boot stage (pre-systemnetworking) only write
/var/lib/cloud/instance/network-config.json if the symlink
/v/l/c/instance already exists. This instance symlink is created only
once a datasource is discovered in either Local or Network boot
stage.

This bug affects any environment where the detected datasource is
only discovered in the Network boot stage.
If no viable datasource is detected during the Local boot stage,
cloud-init calls apply_network_config to minimally set up a basic
fallback network config with DHCP on the primary NIC.

In this case, cloud-init will no longer attempt to write the
fallback network-config.json during Local boot. Defer writing the
network-config.json to Network boot stage once
/var/lib/cloud/instance exists for a detected datasource.

Fixes canonicalGH-4630
blackboxsw added a commit to blackboxsw/cloud-init that referenced this issue Nov 29, 2023
…ical#4635)

Only write /var/lib/cloud/instance/network-config.json once
datasource is detected. The /var/lib/cloud/instance symlink is
created by Init.instancify after datasource detection is complete.

Migrate creation of /var/lib/cloud/instance/network-config.json out
of apply_network_config and into instancify, after the datasource
is detected.

This fixes a bug by ensuring /var/lib/cloud/instance exists before
persisting /v/l/cloud/instance/network-config.json.

Fixes canonicalGH-4630
blackboxsw added a commit to blackboxsw/cloud-init that referenced this issue Nov 29, 2023
…4635)

Only write /var/lib/cloud/instance/network-config.json once
datasource is detected. The /var/lib/cloud/instance symlink is
created by Init.instancify after datasource detection is complete.

Move creation of /var/lib/cloud/instance/network-config.json into
its own method _write_network_config_json. It will be called by
any call to apply_network_config.

apply_network_config is called in both Local and Network stages.

In Local stage, apply_network_config is used to either:
 - render the final network config of datasource detected in Local
 - in absence of Local datasource, render basic fallback DHCP on
   primary NIC to allow network to come up before detecting a
   Network datasource

For Network datasources, they will not have been discovered or
instancify'd in Local boot stage, so apply_network_config cannot
yet persist network-config.json.

Defer creation of network-config.json for Network datasources
until the link /var/lib/cloud/instance exists and
apply_network_config is called in Network stage to render final
network config.

Fixes canonicalGH-4630
blackboxsw added a commit to blackboxsw/cloud-init that referenced this issue Nov 30, 2023
…4635)

Only write /var/lib/cloud/instance/network-config.json once
datasource is detected. The /var/lib/cloud/instance symlink is
created by Init.instancify after datasource detection is complete.

Move creation of /var/lib/cloud/instance/network-config.json into
a separate method _write_network_config_json. It will be called by
any call to apply_network_config.

apply_network_config is called in both Local and Network stages.

In Local stage, apply_network_config is used to either:
 - render the final network config of datasource detected in Local
 - in absence of Local datasource, render basic fallback DHCP on
   primary NIC to allow network to come up before detecting a
   Network datasource

For Network datasources, they will not have been discovered or
instancify'd in Local boot stage, so apply_network_config cannot
yet persist network-config.json.

Defer creation of network-config.json for Network datasources
until the link /var/lib/cloud/instance exists and
apply_network_config is called in Network stage to render final
network config.

Fixes canonicalGH-4630
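
The deferral described in this commit message can be sketched roughly as follows. This is a toy model, not the code from the referenced pull request: the class `InitSketch`, its simplified method signatures, the instance id `iid-sketch`, and the plain (non-atomic) file write are all assumptions made to keep the example short. Only the method names `apply_network_config`, `_write_network_config_json`, and `instancify` come from the commit message.

```python
import json
import os
import tempfile


class InitSketch:
    """Toy stand-in for cloud-init's Init object (hypothetical)."""

    def __init__(self, cloud_dir):
        self.cloud_dir = cloud_dir
        self.instance_link = os.path.join(cloud_dir, "instance")

    def instancify(self, instance_id):
        """Create instances/<id> and point the `instance` symlink at it."""
        target = os.path.join(self.cloud_dir, "instances", instance_id)
        os.makedirs(target, exist_ok=True)
        if os.path.islink(self.instance_link):
            os.unlink(self.instance_link)
        os.symlink(target, self.instance_link)

    def _write_network_config_json(self, netcfg):
        """Persist netcfg only once the instance link exists."""
        if not os.path.exists(self.instance_link):
            return False  # no datasource detected yet: defer the write
        path = os.path.join(self.instance_link, "network-config.json")
        with open(path, "w") as f:
            json.dump(netcfg, f)
        return True

    def apply_network_config(self, netcfg):
        # Rendering the network config itself is omitted in this sketch.
        return self._write_network_config_json(netcfg)


with tempfile.TemporaryDirectory() as cloud_dir:
    init = InitSketch(cloud_dir)
    fallback = {"version": 2, "ethernets": {"eth0": {"dhcp4": True}}}
    # Local stage, no datasource detected: the write is skipped.
    print("Local stage wrote file:", init.apply_network_config(fallback))
    # A Network datasource is detected and instancify'd.
    init.instancify("iid-sketch")
    # Network stage: the link now exists, so the file is persisted.
    print("Network stage wrote file:", init.apply_network_config(fallback))
```

Keeping the existence check inside the write helper means `apply_network_config` can be called unchanged in both stages, which matches the behaviour the commit message describes.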