Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move GCE metadata fetch to init-local (SC-502) #1122

Merged
merged 4 commits into from
Dec 2, 2021

Conversation

TheRealFalcon
Copy link
Member

Proposed Commit Message

Move GCE metadata fetch to init-local

GCE currently fetches metadata after network has come up. There's no
reason we can't fetch at init-local time, so update GCE to fetch at
init-local time to be more consistent with other datasources.

Additional Context

Test Steps

Upload dpkg to GCE instance and install
cloud-init clean --logs --reboot
Verify metadata is fetched at init-local time
Verify metadata is NOT fetched at init time

cloud-init.log can be seen at:
https://paste.ubuntu.com/p/tY3RkS9rKY/

Additionally, I ran integration tests on GCE

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

GCE currently fetches metadata after network has come up. There's no
reason we can't fetch at init-local time, so update GCE to fetch at
init-local time to be more consistent with other datasources.
@TheRealFalcon TheRealFalcon changed the title Move GCE metadata fetch to init-local Move GCE metadata fetch to init-local (SC-502) Nov 29, 2021
Copy link
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Content looks good, I think we need at least some unittest and integration test coverage.

For integration tests can we add a @pytest.mark.gce test case to tests.integration_tests.test_combined.TestCombined for test_instance_json_gce somewhat like what we have for the other clouds.

I'd also like to extend those combined tests with a datasource detection test. something like

diff --git a/tests/integration_tests/modules/test_combined.py b/tests/integration_tests/modules/test_combined.py
index bc19c2a2..4bee0f20 100644
--- a/tests/integration_tests/modules/test_combined.py
+++ b/tests/integration_tests/modules/test_combined.py
@@ -195,6 +195,32 @@ class TestCombined:
             'date "+%Z" --date="Thu, 03 Nov 2016 00:47:00 -0400"')
         assert timezone_output.strip() == "HDT"
 
+    def test_correct_datasource_detected(self, class_client: IntegrationInstance):
+        """Test datasource is detected at the proper boot stage."""
+        client = class_client
+        status_file = client.read_from_file('/run/cloud-init/status.json')
+
+         if instance.settings.PLATFORM == 'azure':
+             datasource = 'DataSourceAzure [seed=/dev/sr0]'
+         elif instance.settings.PLATFORM == 'ec2':
+             datasource = 'DataSourceEc2Local'
+         elif instance.settings.PLATFORM == 'gce':
+             datasource = 'DataSourceGCELocal'
+         elif instance.settings.PLATFORM == 'oci':
+             datasource = 'DataSourceOracle'
+         elif instance.settings.PLATFORM == 'openstack':
+             datasource = 'DataSourceOpenStackLocal [net,ver=2]'
+         elif instance.settings.PLATFORM in 'lxd_container':
+             datasource = (
+                 'DataSourceNoCloud'
+                 ' [seed=/var/lib/cloud/seed/nocloud-net][dsmode=net]'
+             )
+         elif instance.settings.PLATFORM == 'lxd_vm':
+             datasource = 'DataSourceNoCloud [seed=/dev/sr0][dsmode=net]'
+         else:
+             datasource = "UNKNOWN_PLATFORM"
+         assert datasrouce == json.loads(status_file)['v1']['datasource']
+
     def test_no_problems(self, class_client: IntegrationInstance):
         """Test no errors, warnings, or tracebacks"""
         client = class_client

What do you think?

cloudinit/sources/DataSourceGCE.py Show resolved Hide resolved
@TheRealFalcon
Copy link
Member Author

@blackboxsw Added some tests. Please re-review.

@blackboxsw
Copy link
Collaborator

Cool BTW on the cloud-init analyze show on your boot logs looking around a 30% "performance" gain in the boot time for that GCP instance. We'll see if that performance improvement holds in general over a broad set of runs, detecting in init-local instead of init-network stage.

Copy link
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much cleaner approach on your integration test updates than my suggestion +1!

Looks like some mock issues in CI which prevent package builds and result in failures on integration tests.
Could this work?

diff --git a/tests/unittests/test_datasource/test_gce.py b/tests/unittests/test_datasource/test_gce.py
index c9360da4..7a6848bc 100644
--- a/tests/unittests/test_datasource/test_gce.py
+++ b/tests/unittests/test_datasource/test_gce.py
@@ -371,8 +371,9 @@ class TestDataSourceGCE(test_helpers.HttprettyTestCase):
         ds = DataSourceGCE.DataSourceGCELocal(
             sys_cfg={}, distro=None, paths=None
         )
+        _set_mock_metadata()
         ds._get_data()
-        m_dhcp.assert_called_once()
+        self.assertEqual(1, m_dhcp.call_count)
 
     @mock.patch(
         "cloudinit.sources.DataSourceGCE.EphemeralDHCPv4",
@@ -380,6 +381,7 @@ class TestDataSourceGCE(test_helpers.HttprettyTestCase):
     )
     def test_datasource_doesnt_use_ephemeral_dhcp(self, m_dhcp):
         ds = DataSourceGCE.DataSourceGCE(sys_cfg={}, distro=None, paths=None)
+        _set_mock_metadata()
         ds._get_data()
         m_dhcp.assert_not_called()

Approved once CI is fixed.

sys_cfg={}, distro=None, paths=None
)
ds._get_data()
m_dhcp.assert_called_once()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DNE, you probably meant assert_called_once_with. But we can instead just check call_count

Suggested change
m_dhcp.assert_called_once()
self.assertEqual(1, m_dhcp.call_count)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh...so since I knew the death of 3.5 is imminent for cloud-init, and Brett showed me a really cool package for black formatting of only changed code that is 3.6+, I figured I can upgrade my local dev environment because am I really going to run into any python 3.5-only stuff in the next 3 weeks?

Apparently the answer is yes 🤦‍♂️

@TheRealFalcon
Copy link
Member Author

It's interesting that these mocks are only necessary for 3.5. I wonder what sort of magic is or isn't happening in 3.5.

@TheRealFalcon TheRealFalcon merged commit cf38c2c into canonical:main Dec 2, 2021
@TheRealFalcon TheRealFalcon deleted the gce-local branch December 2, 2021 14:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants