[config reload] Call systemctl reset-failed for snmp,telemetry,mgmt-framework services (#1773)

#### What I did

When `config reload -y` or `config load_minigraph -y` is issued, most SONiC services are reset with `systemctl reset-failed <service_name>`. The purpose is to keep services from reaching their start retry limit, after which they can no longer be started. However, `systemctl reset-failed` is only called for services that belong to sonic.target, and snmp, telemetry and mgmt-framework are not among them. So if `config reload -y` or `config load_minigraph -y` is run repeatedly, the snmp, telemetry and mgmt-framework services might enter the failed state. This PR fixes the issue.

I would like to cherry-pick this fix to the 202012 branch, but it depends on PR #7846. So if we decide to cherry-pick this PR to 202012, we need to cherry-pick #7846 first.

#### How I did it

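Also call `systemctl reset-failed` for the services listed under sonic-delayed.target, such as snmp, telemetry and mgmt-framework. These are listed as `.timer` units, so the `.timer` suffix is stripped to get the service name.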

#### How to verify it

Manual test.
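
The manual steps are not spelled out above. As a rough sketch of one way to check the result, assuming the three service names from the description and using `systemctl is-failed`, a small helper could report which of them systemd considers failed. `failed_services` and `DELAYED_SERVICES` are hypothetical names for illustration, not part of this commit.

```python
import subprocess

# Services this PR adds to the reset list (names taken from the PR description).
DELAYED_SERVICES = ["snmp", "telemetry", "mgmt-framework"]


def failed_services(services, run=subprocess.run):
    """Return the subset of `services` that systemd reports as failed.

    `systemctl is-failed --quiet <unit>` exits with 0 when the unit is in
    the failed state and non-zero otherwise. The `run` parameter exists
    only so the command runner can be swapped out (an assumption made for
    this sketch).
    """
    failed = []
    for name in services:
        result = run(["systemctl", "is-failed", "--quiet", name])
        if result.returncode == 0:
            failed.append(name)
    return failed
```

After running `config reload -y` several times in a row on a device with this fix, `failed_services(DELAYED_SERVICES)` would be expected to return an empty list.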
Junchao-Mellanox authored Aug 26, 2021
1 parent 6fd0675 commit f5ce87a
Showing 2 changed files with 19 additions and 4 deletions.
9 changes: 7 additions & 2 deletions config/main.py
@@ -691,11 +691,16 @@ def _stop_services():

 def _get_sonic_services():
     out = clicommon.run_command("systemctl list-dependencies --plain sonic.target | sed '1d'", return_cmd=True)
-    return [unit.strip() for unit in out.splitlines()]
+    return (unit.strip() for unit in out.splitlines())


+def _get_delayed_sonic_services():
+    out = clicommon.run_command("systemctl list-dependencies --plain sonic-delayed.target | sed '1d'", return_cmd=True)
+    return (unit.strip().rstrip('.timer') for unit in out.splitlines())
+
+
 def _reset_failed_services():
-    for service in _get_sonic_services():
+    for service in itertools.chain(_get_sonic_services(), _get_delayed_sonic_services()):
         clicommon.run_command("systemctl reset-failed {}".format(service))
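
A note on `rstrip('.timer')` in `_get_delayed_sonic_services()`: `str.rstrip` treats its argument as a set of characters, not as a suffix. That gives the intended result for snmp, telemetry and mgmt-framework, but a unit whose base name ends in one of `.`, `t`, `i`, `m`, `e` or `r` would be trimmed too far. A minimal sketch of a suffix-based alternative follows; `strip_timer_suffix` is a hypothetical helper and `router.timer` is a made-up unit name, neither is part of this commit.

```python
def strip_timer_suffix(unit, suffix=".timer"):
    """Return `unit` without a trailing '.timer' (hypothetical helper)."""
    unit = unit.strip()
    if unit.endswith(suffix):
        return unit[:-len(suffix)]
    return unit


# The unit from the test in this commit gives the same result either way.
print(strip_timer_suffix("snmp.timer"))    # snmp
print("snmp.timer".rstrip(".timer"))       # snmp

# A made-up unit name shows where the two approaches differ.
print(strip_timer_suffix("router.timer"))  # router
print("router.timer".rstrip(".timer"))     # rou
```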


14 changes: 12 additions & 2 deletions tests/config_test.py
@@ -35,7 +35,13 @@ def mock_run_command_side_effect(*args, **kwargs):
     click.echo(click.style("Running command: ", fg='cyan') + click.style(command, fg='green'))

     if kwargs.get('return_cmd'):
-        return ''
+        if command == "systemctl list-dependencies --plain sonic-delayed.target | sed '1d'":
+            return 'snmp.timer'
+        elif command == "systemctl list-dependencies --plain sonic.target | sed '1d'":
+            return 'swss'
+        else:
+            return ''
+

 class TestLoadMinigraph(object):
     @classmethod
@@ -55,7 +61,11 @@ def test_load_minigraph(self, get_cmd_module, setup_single_broadcom_asic):
             traceback.print_tb(result.exc_info[2])
             assert result.exit_code == 0
             assert "\n".join([l.rstrip() for l in result.output.split('\n')]) == load_minigraph_command_output
-            assert mock_run_command.call_count == 7
+            # Verify "systemctl reset-failed" is called for services under sonic.target
+            mock_run_command.assert_any_call('systemctl reset-failed swss')
+            # Verify "systemctl reset-failed" is called for services under sonic-delayed.target
+            mock_run_command.assert_any_call('systemctl reset-failed snmp')
+            assert mock_run_command.call_count == 10

     def test_load_minigraph_with_port_config_bad_format(self, get_cmd_module, setup_single_broadcom_asic):
         with mock.patch(
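
The test change above relies on a mock whose side effect returns canned `systemctl list-dependencies` output, then checks the resulting `reset-failed` calls. The pattern can be sketched in isolation as follows. `fake_run_command` and `reset_failed_services` are simplified, hypothetical stand-ins for `clicommon.run_command` and `_reset_failed_services()`, written only so the example runs on its own.

```python
import itertools
from unittest import mock

LIST_CMD = "systemctl list-dependencies --plain {} | sed '1d'"


def fake_run_command(command, return_cmd=False):
    """Stand-in for clicommon.run_command that returns canned unit lists."""
    if return_cmd:
        if command == LIST_CMD.format("sonic-delayed.target"):
            return "snmp.timer"
        if command == LIST_CMD.format("sonic.target"):
            return "swss"
        return ""
    return None


def reset_failed_services(run_command):
    """Simplified version of the commit's logic, with the runner injected."""
    core = run_command(LIST_CMD.format("sonic.target"), return_cmd=True)
    delayed = run_command(LIST_CMD.format("sonic-delayed.target"), return_cmd=True)
    core_units = (u.strip() for u in core.splitlines())
    # Trim the '.timer' suffix to get the service name behind each timer unit.
    delayed_units = (u.strip()[:-len(".timer")] for u in delayed.splitlines()
                     if u.strip().endswith(".timer"))
    for service in itertools.chain(core_units, delayed_units):
        run_command("systemctl reset-failed {}".format(service))


runner = mock.MagicMock(side_effect=fake_run_command)
reset_failed_services(runner)

runner.assert_any_call("systemctl reset-failed swss")
runner.assert_any_call("systemctl reset-failed snmp")
print(runner.call_count)  # 4: two list commands plus two reset-failed commands
```

In the real test the expected `call_count` is 10 rather than 4 because `load_minigraph` issues other commands through the same mocked function.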