Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix] Fix multi-asic flaky lldp not ready issue #9453

Merged
merged 2 commits into from
Aug 16, 2023

Conversation

yejianquan
Copy link
Collaborator

Description of PR

In the recover of sanity check, for config_reload, we simply sleep 120s.
In some topology, like multi-asic scenario, it's not enough.
lldp services get a big chance that not ready after 120s, then sanity failed.

Use config_reload with safe_reload to better handle this flaky issue.

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205

Approach

What is the motivation for this PR?

Fix multi-asic lldp flaky issue.

How did you do it?

Use config_reload with safe_reload to better handle this flaky issue.

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

wangxin
wangxin previously approved these changes Aug 15, 2023
@mssonicbld
Copy link
Collaborator

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/common/plugins/sanity_check/recover.py:8:1: F401 'tests.common.config_reload.config_force_option_supported' imported but unused
tests/common/reboot.py:195:82: E203 whitespace before ','

flake8...............................................(no files to check)Skipped
check conditional mark sort..........................(no files to check)Skipped

To run the pre-commit checks locally, you can follow below steps:

  1. Ensure that default python is python3. In sonic-mgmt docker container, default python is python2. You can run
    the check by activating the python3 virtual environment in sonic-mgmt docker container or outside of sonic-mgmt
    docker container.
  2. Ensure that the pre-commit package is installed:
sudo pip install pre-commit
  1. Go to repository root folder
  2. Install the pre-commit hooks:
pre-commit install
  1. Use pre-commit to check staged file:
pre-commit
  1. Alternatively, you can check committed files using:
pre-commit run --from-ref <commit_id> --to-ref <commit_id>

@yejianquan
Copy link
Collaborator Author

20/20 verification passed:
image

Copy link
Collaborator

@wangxin wangxin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yejianquan yejianquan merged commit 572793b into sonic-net:master Aug 16, 2023
14 checks passed
@mssonicbld
Copy link
Collaborator

@yejianquan PR conflicts with 202012 branch

@mssonicbld
Copy link
Collaborator

@yejianquan PR conflicts with 202205 branch

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Aug 16, 2023
Description of PR
In the recover of sanity check, for config_reload, we simply sleep 120s.
In some topology, like multi-asic scenario, it's not enough.
lldp services get a big chance that not ready after 120s, then sanity failed.

Use config_reload with safe_reload to better handle this flaky issue.

Approach
What is the motivation for this PR?
Fix multi-asic lldp flaky issue.

How did you do it?
Use config_reload with safe_reload to better handle this flaky issue.

co-authorized by: jianquanye@microsoft.com
@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202305: #9476

yejianquan added a commit to yejianquan/sonic-mgmt that referenced this pull request Aug 16, 2023
Description of PR
In the recover of sanity check, for config_reload, we simply sleep 120s.
In some topology, like multi-asic scenario, it's not enough.
lldp services get a big chance that not ready after 120s, then sanity failed.

Use config_reload with safe_reload to better handle this flaky issue.

Approach
What is the motivation for this PR?
Fix multi-asic lldp flaky issue.

How did you do it?
Use config_reload with safe_reload to better handle this flaky issue.

co-authorized by: jianquanye@microsoft.com
yejianquan added a commit to yejianquan/sonic-mgmt that referenced this pull request Aug 16, 2023
Description of PR
In the recover of sanity check, for config_reload, we simply sleep 120s.
In some topology, like multi-asic scenario, it's not enough.
lldp services get a big chance that not ready after 120s, then sanity failed.

Use config_reload with safe_reload to better handle this flaky issue.

Approach
What is the motivation for this PR?
Fix multi-asic lldp flaky issue.

How did you do it?
Use config_reload with safe_reload to better handle this flaky issue.

co-authorized by: jianquanye@microsoft.com
@yejianquan
Copy link
Collaborator Author

@yejianquan PR conflicts with 202012 branch

#9479

@yejianquan
Copy link
Collaborator Author

@yejianquan PR conflicts with 202205 branch

#9478

yejianquan added a commit that referenced this pull request Aug 16, 2023
cherry-pick #9453
Description of PR
In the recover of sanity check, for config_reload, we simply sleep 120s.
In some topology, like multi-asic scenario, it's not enough.
lldp services get a big chance that not ready after 120s, then sanity failed.

Use config_reload with safe_reload to better handle this flaky issue.

Approach
What is the motivation for this PR?
Fix multi-asic lldp flaky issue.

How did you do it?
Use config_reload with safe_reload to better handle this flaky issue.

co-authorized by: jianquanye@microsoft.com
yejianquan added a commit that referenced this pull request Aug 16, 2023
cherry-pick #9453
Description of PR
In the recover of sanity check, for config_reload, we simply sleep 120s.
In some topology, like multi-asic scenario, it's not enough.
lldp services get a big chance that not ready after 120s, then sanity failed.

Use config_reload with safe_reload to better handle this flaky issue.

Approach
What is the motivation for this PR?
Fix multi-asic lldp flaky issue.

How did you do it?
Use config_reload with safe_reload to better handle this flaky issue.

co-authorized by: jianquanye@microsoft.com

Co-authored-by: Ye Jianquan <jianquanye@microsoft.com>
yejianquan added a commit that referenced this pull request Aug 16, 2023
Cherry-pick: #9453
Description of PR
In the recover of sanity check, for config_reload, we simply sleep 120s.
In some topology, like multi-asic scenario, it's not enough.
lldp services get a big chance that not ready after 120s, then sanity failed.

Use config_reload with safe_reload to better handle this flaky issue.

Approach
What is the motivation for this PR?
Fix multi-asic lldp flaky issue.

How did you do it?
Use config_reload with safe_reload to better handle this flaky issue.

co-authorized by: jianquanye@microsoft.com
AharonMalkin pushed a commit to AharonMalkin/sonic-mgmt that referenced this pull request Jan 25, 2024
Description of PR
In the recover of sanity check, for config_reload, we simply sleep 120s.
In some topology, like multi-asic scenario, it's not enough.
lldp services get a big chance that not ready after 120s, then sanity failed.

Use config_reload with safe_reload to better handle this flaky issue.

Approach
What is the motivation for this PR?
Fix multi-asic lldp flaky issue.

How did you do it?
Use config_reload with safe_reload to better handle this flaky issue.

co-authorized by: jianquanye@microsoft.com
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants