Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fall back to cached local ds if no valid ds found #4997

Merged

Conversation

PengpengSun
Copy link
Contributor

@PengpengSun PengpengSun commented Mar 4, 2024

Proposed Commit Message

Fall back to cached local ds if no valid ds found

Rebooting an instance which has finished VMware guest
customization with DataSourceVMware will load
DataSourceNone due to metadata is NOT available.

This is mostly a re-post of PR#229, few differences are:
1. Let ds decide if fallback is allowed, not always fall back
   to previous cached LOCAL ds.
2. No comparing instance-id of cached ds with previous instance-id
   due to I think they are always identical.

Fixes GH-3402 (GitHub Issue number. Remove line if irrelevant)
LP: #1835205 (Launchpad bug number. Remove line if irrelevant)

Additional Context

Fixes #3402

Test Steps

Checklist

Merge type

  • Squash merge using "Proposed Commit Message"
  • Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)

Copy link
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this happens, does ds-identify still correctly identify the platform?

@@ -925,6 +925,10 @@ def check_instance_id(self, sys_cfg):
# quickly (local check only) if self.instance_id is still
return False

def check_fallback(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this method name more descriptive please? Also we should add a docstring that will communicate to other datasource authors what the purpose of this method is. There are many fallbacks in cloud-init, so this isn't very meaningful on its own.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.
Yes, I've renamed this method(please suggest a better name if needed), and added a docstring.

@PengpengSun
Copy link
Contributor Author

When this happens, does ds-identify still correctly identify the platform?

Not sure if I get your question correctly, with my understanding, ds-identify identifies the platform before init --local, so this shall not have impact on ds identify process. The purpose of this change is to add an opportunity for the cached ds to be restored after no data can be gotten from ds in the ds list.

cloudinit/stages.py Outdated Show resolved Hide resolved
cloudinit/stages.py Outdated Show resolved Hide resolved
@holmanb holmanb self-assigned this Mar 9, 2024
@PengpengSun PengpengSun requested a review from holmanb March 15, 2024 05:40
@PengpengSun
Copy link
Contributor Author

Hi @holmanb , I'm removing WIP from title, please kindly review this again and let me know if any updates needed. Thanks!

@PengpengSun PengpengSun changed the title WIP: Fall back to cached local ds if no valid ds found Fall back to cached local ds if no valid ds found Mar 19, 2024
ds = self._restore_from_cache()
if (
ds
and hasattr(ds, "check_if_fallback_is_allowed")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is undesirable for two reasons:

  • old pickles will never acquire this attribute
  • encapsulation: we shift pickle handling outside of the class definition which should handle pickle issues directly

How about the following modification?

diff --git a/cloudinit/distros/__init__.py b/cloudinit/distros/__init__.py
index 2c4d54871..e225e4ee8 100644
--- a/cloudinit/distros/__init__.py
+++ b/cloudinit/distros/__init__.py
@@ -185,6 +185,8 @@ class Distro(persistence.CloudInitPickleMixin, metaclass=abc.ABCMeta):
             self._dhcp_client = None
         if not hasattr(self, "_fallback_interface"):
             self._fallback_interface = None
+        if not hasattr(self, "check_if_fallback_is_allowed"):
+            self.check_if_fallback_is_allowed = lambda: False
 
     def _validate_entry(self, entry):
         if isinstance(entry, str):
diff --git a/cloudinit/stages.py b/cloudinit/stages.py
index 13bc26d26..c228805d1 100644
--- a/cloudinit/stages.py
+++ b/cloudinit/stages.py
@@ -378,11 +378,7 @@ class Init:
                 if existing != "check":
                     raise e
                 ds = self._restore_from_cache()
-                if (
-                    ds
-                    and hasattr(ds, "check_if_fallback_is_allowed")
-                    and ds.check_if_fallback_is_allowed()
-                ):
+                if ds and ds.check_if_fallback_is_allowed():
                     LOG.info(
                         "Restored fallback datasource from checked cache: %s",
                         ds,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @holmanb,
Thanks for the review! I've updated the code accordingly(I think it'd be sources/init.py), PTAL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tox is not happy with the last commit, need to fix it.

Copy link
Contributor Author

@PengpengSun PengpengSun Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @holmanb , I've updated code again which fixed the test error, PTAL, thanks!

@PengpengSun PengpengSun requested a review from holmanb March 29, 2024 06:30
Copy link
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @PengpengSun!

@holmanb holmanb merged commit 9929a00 into canonical:main Mar 29, 2024
28 checks passed
@PengpengSun PengpengSun deleted the fall-back-to-cached-ds-if-no-valid-ds-found branch April 2, 2024 06:45
holmanb pushed a commit that referenced this pull request Apr 3, 2024
Rebooting an instance which has finished VMware guest
customization with DataSourceVMware will load
DataSourceNone due to metadata is NOT available.

This is mostly a re-post of PR#229, few differences are:
1. Let ds decide if fallback is allowed, not always fall back
   to previous cached LOCAL ds.
2. No comparing instance-id of cached ds with previous instance-id
   due to I think they are always identical.

Fixes GH-3402
holmanb pushed a commit that referenced this pull request Apr 3, 2024
Rebooting an instance which has finished VMware guest
customization with DataSourceVMware will load
DataSourceNone due to metadata is NOT available.

This is mostly a re-post of PR#229, few differences are:
1. Let ds decide if fallback is allowed, not always fall back
   to previous cached LOCAL ds.
2. No comparing instance-id of cached ds with previous instance-id
   due to I think they are always identical.

Fixes GH-3402
holmanb pushed a commit that referenced this pull request Apr 3, 2024
Rebooting an instance which has finished VMware guest
customization with DataSourceVMware will load
DataSourceNone due to metadata is NOT available.

This is mostly a re-post of PR#229, few differences are:
1. Let ds decide if fallback is allowed, not always fall back
   to previous cached LOCAL ds.
2. No comparing instance-id of cached ds with previous instance-id
   due to I think they are always identical.

Fixes GH-3402
@ani-sinha
Copy link
Contributor

@holmanb is it possible to backport this patch to 24.1 train maybe in 24.1.5?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OVF datasource should check if instant id is still on VMware Platform
3 participants