Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report useful message when extension processing is disabled #2895

Merged
merged 69 commits into from
Aug 14, 2023
Merged
Show file tree
Hide file tree
Changes from 52 commits
Commits
Show all changes
69 commits
Select commit Hold shift + click to select a range
da72c37
Update version to dummy 1.0.0.0'
maddieford Nov 8, 2022
59dbd22
Revert version change
maddieford Nov 8, 2022
633a826
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Nov 21, 2022
14a743f
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Dec 8, 2022
54ea0f3
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Jan 10, 2023
e79c4c5
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Feb 8, 2023
498b612
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Feb 14, 2023
1e269f4
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Mar 13, 2023
7b49e76
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Mar 24, 2023
0a426cc
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Apr 6, 2023
17fbf6a
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Apr 7, 2023
995cbb9
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Apr 13, 2023
eaadc83
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Apr 24, 2023
fb03e07
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Apr 27, 2023
6a8e0d6
Merge remote-tracking branch 'upstream/develop' into develop
maddieford May 19, 2023
b4951c8
Merge branch 'develop' of github.com:Azure/WALinuxAgent into develop
maddieford Jun 6, 2023
c6d9300
Merge branch 'develop' of github.com:maddieford/WALinuxAgent into dev…
maddieford Jun 23, 2023
f650fe4
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Jul 10, 2023
a10bdfa
Merge branch 'develop' of github.com:maddieford/WALinuxAgent into dev…
maddieford Jul 10, 2023
50dcec5
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Jul 18, 2023
b87db37
merge changes
maddieford Jul 20, 2023
56e84c1
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Jul 25, 2023
fd5d0f5
Merge remote-tracking branch 'upstream/develop' into develop
maddieford Aug 2, 2023
2561746
Fail GS fast in case of extensions disabled
maddieford Aug 7, 2023
4448a30
Update extensions_disabled scenario to look for GS failed instead of …
maddieford Aug 7, 2023
3cf195d
Update to separate onHold and extensions enabled
maddieford Aug 7, 2023
8b2a2d5
Report ext disabled error in handler status
maddieford Aug 8, 2023
e529556
Try using GoalStateUnknownFailure
maddieford Aug 8, 2023
0131ced
Fix indentation error
maddieford Aug 8, 2023
531e738
Try failing ext handler and checking logs
maddieford Aug 8, 2023
400ae62
Report ext processing error
maddieford Aug 8, 2023
f8cb01c
Attempt to fail fast
maddieford Aug 8, 2023
e0ccd1b
Fix param name
maddieford Aug 8, 2023
556bd77
Init error
maddieford Aug 8, 2023
bcbcbb0
Try to reuse current code
maddieford Aug 8, 2023
5e67eb7
Try to reuse current code
maddieford Aug 8, 2023
3d49953
Clean code
maddieford Aug 9, 2023
d323f96
Update scenario tests
maddieford Aug 9, 2023
5156336
Add ext status file to fail fast
maddieford Aug 9, 2023
90d66fd
Fail fast test
maddieford Aug 9, 2023
52bf5e5
Report error when ext disabled
maddieford Aug 9, 2023
422dcf3
Update timeout to 20 mins
maddieford Aug 9, 2023
4b17b7a
Re enable ext for debugging
maddieford Aug 9, 2023
e7961d9
Re enable ext for debugging
maddieford Aug 10, 2023
dd5b5bc
Log agent status update
maddieford Aug 10, 2023
989de90
Create ext status file with error code
maddieford Aug 10, 2023
2a9413b
Create ext status file with error code
maddieford Aug 10, 2023
35036d3
We should report handler status even if not installed in case of exte…
maddieford Aug 10, 2023
f996473
Clean up code change
maddieford Aug 10, 2023
0bf2b33
Update tests for extensions disabled
maddieford Aug 10, 2023
3109a49
Update test comment
maddieford Aug 10, 2023
bd3682b
Update test
maddieford Aug 10, 2023
5b5f774
Remove unused line
maddieford Aug 10, 2023
977d47b
Remove ununsed timeout
maddieford Aug 10, 2023
ec78d64
Test failing case
maddieford Aug 11, 2023
c6b0a39
Remove old case
maddieford Aug 11, 2023
3d74737
Remove unused import
maddieford Aug 11, 2023
d50b64f
Test multiconfig ext
maddieford Aug 11, 2023
ebe68d4
Add multi-config test case
maddieford Aug 11, 2023
a2df75d
Clean up test
maddieford Aug 11, 2023
b766ff7
Improve logging
maddieford Aug 11, 2023
f6696e6
Fix dir for testfile
maddieford Aug 11, 2023
6aeaecb
Merge branch 'develop' into extensions_enabled_fix
maddieford Aug 11, 2023
4a03253
Merge branch 'develop' into extensions_enabled_fix
maddieford Aug 11, 2023
589113d
Remove ignore error rules
maddieford Aug 11, 2023
17060a3
Merge branch 'extensions_enabled_fix' of github.com:maddieford/WALinu…
maddieford Aug 11, 2023
aad90c3
Remove ununsed imports
maddieford Aug 11, 2023
25255af
Set handler status to not ready explicitly
maddieford Aug 11, 2023
f7d9a30
Use OS Util to get agent conf path
maddieford Aug 14, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 26 additions & 10 deletions azurelinuxagent/ga/exthandlers.py
Original file line number Diff line number Diff line change
Expand Up @@ -299,7 +299,7 @@ def run(self):
# we make a deep copy of the extensions, since changes are made to self.ext_handlers while processing the extensions
self.ext_handlers = copy.deepcopy(egs.extensions)

if not self._extension_processing_allowed():
if self._extensions_on_hold():
return

utc_start = datetime.datetime.utcnow()
Expand Down Expand Up @@ -433,17 +433,15 @@ def _cleanup_outdated_handlers(self):
except OSError as e:
logger.warn("Failed to remove extension package {0}: {1}".format(pkg, e.strerror))

def _extension_processing_allowed(self):
if not conf.get_extensions_enabled():
logger.verbose("Extension handling is disabled")
return False

def _extensions_on_hold(self):
if conf.get_enable_overprovisioning():
if self.protocol.get_goal_state().extensions_goal_state.on_hold:
logger.info("Extension handling is on hold")
return False
msg = "Extension handling is on hold"
logger.info(msg)
add_event(op=WALAEventOperation.ExtensionProcessing, message=msg)
return True

return True
return False

@staticmethod
def __get_dependency_level(tup):
Expand Down Expand Up @@ -478,10 +476,25 @@ def handle_ext_handlers(self, goal_state_id):
max_dep_level = self.__get_dependency_level(all_extensions[-1]) if any(all_extensions) else 0

depends_on_err_msg = None
extensions_enabled = conf.get_extensions_enabled()
for extension, ext_handler in all_extensions:

handler_i = ExtHandlerInstance(ext_handler, self.protocol, extension=extension)

# In case of extensions disabled, we skip processing extensions. But CRP is still waiting for some status
# back for the skipped extensions. In order to propagate the status back to CRP, we will report status back
# here with an error message.
if not extensions_enabled:
msg = "Extension will not be processed since extension processing is disabled. To enable extension " \
"processing, set Extensions.Enabled=y in '/etc/waagent.conf'"
handler_i.set_handler_status(message=msg, code=-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you pass status=ExtHandlerStatusValue.not_ready explicitly? this parameter shouldn't have a default value in the first place, the status is very important and having a default makes the code harder to read (especially because 'code' defaults to 0, success, while status defaults to notready, failure)

we should remove those defaults, but on a different PR

handler_i.create_status_file_if_not_exist(extension,
status=ExtensionStatusValue.error,
code=-1,
operation=handler_i.operation,
message=msg)
continue

# In case of depends-on errors, we skip processing extensions if there was an error processing dependent extensions.
# But CRP is still waiting for some status back for the skipped extensions. In order to propagate the status back to CRP,
# we will report status back here with the relevant error message for each of the dependent extension.
Expand Down Expand Up @@ -991,7 +1004,9 @@ def report_ext_handler_status(self, vm_status, ext_handler, goal_state_changed):
# For MultiConfig, we need to report status per extension even for Handler level failures.
# If we have HandlerStatus for a MultiConfig handler and GS is requesting for it, we would report status per
# extension even if HandlerState == NotInstalled (Sample scenario: ExtensionsGoalStateError, DecideVersionError, etc)
if handler_state != ExtHandlerState.NotInstalled or ext_handler.supports_multi_config:
# We also need to report extension status for an uninstalled handler if extensions are disabled because CRP
# waits for extension runtime status before failing the extension operation.
if handler_state != ExtHandlerState.NotInstalled or ext_handler.supports_multi_config or not conf.get_extensions_enabled():

# Since we require reading the Manifest for reading the heartbeat, this would fail if HandlerManifest not found.
# Only try to read heartbeat if HandlerState != NotInstalled.
Expand Down Expand Up @@ -1668,6 +1683,7 @@ def collect_ext_status(self, ext):
# We did not encounter InvalidJsonFile/CouldNotReadStatusFile and thus the status file was correctly written
# and has valid json.
try:
msg = "data in status file is: {0}".format(data)
parse_ext_status(ext_status, data)
if len(data_str) > _MAX_STATUS_FILE_SIZE_IN_BYTES:
raise ExtensionStatusError(msg="For Extension Handler {0} for the sequence number {1}, the status "
Expand Down
14 changes: 13 additions & 1 deletion tests/ga/test_extension.py
Original file line number Diff line number Diff line change
Expand Up @@ -1649,7 +1649,19 @@ def test_extensions_disabled(self, _, *args):
exthandlers_handler.run()
exthandlers_handler.report_ext_handlers_status()

self._assert_no_handler_status(protocol.report_vm_status)
report_vm_status = protocol.report_vm_status
self.assertTrue(report_vm_status.called)
args, kw = report_vm_status.call_args # pylint: disable=unused-variable
vm_status = args[0]
self.assertEqual(1, len(vm_status.vmAgent.extensionHandlers))
exthandler = vm_status.vmAgent.extensionHandlers[0]
self.assertEqual(-1, exthandler.code)
self.assertEqual('NotReady', exthandler.status)
self.assertEqual("Extension will not be processed since extension processing is disabled. To enable extension processing, set Extensions.Enabled=y in '/etc/waagent.conf'", exthandler.message)
ext_status = exthandler.extension_status
self.assertEqual(-1, ext_status.code)
self.assertEqual('error', ext_status.status)
self.assertEqual("Extension will not be processed since extension processing is disabled. To enable extension processing, set Extensions.Enabled=y in '/etc/waagent.conf'", ext_status.message)

def test_extensions_deleted(self, *args):
# Ensure initial enable is successful
Expand Down
48 changes: 39 additions & 9 deletions tests_e2e/tests/extensions_disabled/extensions_disabled.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
from tests_e2e.tests.lib.agent_test import AgentTest
from tests_e2e.tests.lib.identifiers import VmExtensionIds
from tests_e2e.tests.lib.logging import log
from tests_e2e.tests.lib.shell import CommandError
from tests_e2e.tests.lib.ssh_client import SshClient
from tests_e2e.tests.lib.virtual_machine_client import VirtualMachineClient
from tests_e2e.tests.lib.virtual_machine_extension_client import VirtualMachineExtensionClient
Expand All @@ -47,27 +48,43 @@ def run(self):
output = ssh_client.run_command("update-waagent-conf Extensions.Enabled=n", use_sudo=True)
log.info("Disable completed:\n%s", output)

# From now on, extensions will time out; set the timeout to the minimum allowed(15 minutes)
# Set the timeout to the minimum allowed(15 minutes) so we can check that the extension fails before timeout
narrieta marked this conversation as resolved.
Show resolved Hide resolved
log.info("Setting the extension timeout to 15 minutes")
vm: VirtualMachineClient = VirtualMachineClient(self._context.vm)

vm.update({"extensionsTimeBudget": "PT15M"})

disabled_timestamp: datetime.datetime = datetime.datetime.utcnow() - datetime.timedelta(minutes=60)

#
# Validate that the agent is not processing extensions by attempting to run CustomScript
# Validate that the agent is not processing extensions by attempting to run CSE & checking that provisioning fails fast
#
log.info("Executing CustomScript; it should time out after 15 min or so.")
log.info("Executing CustomScript; the agent should report a VMExtensionProvisioningError without processing the extension")
custom_script = VirtualMachineExtensionClient(self._context.vm, VmExtensionIds.CustomScript, resource_name="CustomScript")
try:
custom_script.enable(settings={'commandToExecute': "date"}, force_update=True, timeout=20 * 60)
fail("CustomScript should have timed out")
# Use mkdir in script settings so we can check if the command was executed
custom_script.enable(settings={'commandToExecute': "mkdir /var/lib/waagent/testdir"}, force_update=True, timeout=6 * 60)
fail("The agent should have reported an error processing the goal state")
except Exception as error:
assert_that("VMExtensionProvisioningTimeout" in str(error)) \
.described_as(f"Expected a VMExtensionProvisioningTimeout: {error}") \
assert_that("VMExtensionProvisioningError" in str(error)) \
narrieta marked this conversation as resolved.
Show resolved Hide resolved
.described_as(f"Expected a VMExtensionProvisioningError error, but actual error was: {error}") \
.is_true()
log.info("CustomScript timed out as expected")
assert_that("Extension will not be processed since extension processing is disabled" in str(error)) \
.described_as(f"Error message should communicate that extension will not be processed, but actual error "
f"was: {error}").is_true()
log.info("Goal state processing failed as expected")

#
# Validate the agent is not processing extensions by checking it did not execute the extension settings
#
try:
ssh_client.run_command("dir /var/lib/waagent/testdir", use_sudo=True)
fail("The extension settings should not have been executed")
except CommandError as error:
assert_that(error.stderr)\
.described_as("Extension should not be processed, but extension settings were executed")\
.contains('No such file or directory')
log.info("The agent did not execute the extension settings as expected")
narrieta marked this conversation as resolved.
Show resolved Hide resolved


#
# Validate that the agent continued reporting status even if it is not processing extensions
Expand All @@ -82,6 +99,19 @@ def run(self):
.is_greater_than(pytz.utc.localize(disabled_timestamp))
log.info("The VM Agent reported status after extensions were disabled, as expected.")

#
# Validate that the agent processes extensions after re-enabling extension processing
#
log.info("Enabling extension processing on the test VM [%s]", self._context.vm.name)
output = ssh_client.run_command("update-waagent-conf Extensions.Enabled=y", use_sudo=True)
log.info("Enable completed:\n%s", output)

try:
custom_script.enable(settings={'commandToExecute': "date"}, force_update=True, timeout=15 * 60)
except Exception as error:
log.info("Goal state processing unexpectedly failed after re-enabling extension processing")


maddieford marked this conversation as resolved.
Show resolved Hide resolved
def get_ignore_error_rules(self) -> List[Dict[str, Any]]:
return [
{'message': 'No handler status found for Microsoft.Azure.Extensions.CustomScript'},
Expand Down
Loading