[Mellanox] Support DSCP remapping in dual ToR topo on T0 switch #12605

Merged
merged 19 commits into sonic-net:master from dual-tor-dscp-remapping
Feb 7, 2023


stephenxs
Collaborator

@stephenxs stephenxs commented Nov 4, 2022

Why I did it

Support DSCP remapping in the dual-ToR topology on T0 switches for the following SKUs: Mellanox-SN4600c-C64, Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8.

How I did it

Regarding buffer settings: originally there are two lossless PGs and queues, 3 and 4. In a dual-ToR scenario, lossless traffic sent from the leaf switch to an uplink of the ToR switch can be bounced back.
To avoid PFC deadlock, the bounced-back lossless traffic must be mapped to different PGs and queues. Therefore, 2 additional lossless PGs and queues are allocated on the uplink ports of ToR switches.
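
The allocation can be illustrated with a minimal Python sketch. It is not taken from the PR's templates: the description only says that 2 extra lossless PGs and queues exist on uplinks, so the IDs 2 and 6 used here are an assumption inferred from the TC 2/6 mapping in this description, and the names `BASE_LOSSLESS_IDS`, `ASSUMED_EXTRA_UPLINK_IDS` and `lossless_ids` are hypothetical.

```python
# Lossless PG/queue IDs that exist on every port before this change
# (stated in the PR description).
BASE_LOSSLESS_IDS = frozenset({3, 4})

# ASSUMPTION: the PR text only says "2 additional" lossless PGs/queues on
# uplinks. IDs 2 and 6 are inferred from the TC 2/6 mapping, not confirmed.
ASSUMED_EXTRA_UPLINK_IDS = frozenset({2, 6})


def lossless_ids(is_uplink: bool) -> frozenset:
    """Return the lossless PG/queue IDs for a port on a dual-ToR T0 switch."""
    if is_uplink:
        # Uplinks carry bounced-back traffic, so they get the extra IDs.
        return BASE_LOSSLESS_IDS | ASSUMED_EXTRA_UPLINK_IDS
    # Downlinks keep only the original lossless IDs.
    return BASE_LOSSLESS_IDS


print(sorted(lossless_ids(is_uplink=True)))   # [2, 3, 4, 6]
print(sorted(lossless_ids(is_uplink=False)))  # [3, 4]
```

Keeping the bounced-back traffic on IDs that never carry the original lossless traffic is what breaks the PFC dependency loop described above.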

  • On uplink ports, DSCP 2 and 6 are mapped to TC 2 and 6, respectively
  • On downlink ports, DSCP 2 and 6 are both still mapped to TC 1
  • Buffers are adjusted according to the port layout:
    • Mellanox-SN4600c-C64:
      • 56 downlinks 50G + 8 uplinks 100G
    • Mellanox-SN4600c-D48C40, Mellanox-SN2700, Mellanox-SN2700-D48C8:
      • 24 downlinks 50G + 8 uplinks 100G
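
The per-role DSCP handling listed above can be sketched as follows. This is a hedged illustration only, covering just the two DSCP values this PR names; `PortRole` and `dscp_to_tc` are hypothetical names and do not come from the SONiC code base or its config templates.

```python
from enum import Enum


class PortRole(Enum):
    """Hypothetical port classification on a dual-ToR T0 switch."""
    UPLINK = "uplink"
    DOWNLINK = "downlink"


# Only the DSCP values named in the PR description are modelled.
REMAPPED_DSCPS = (2, 6)


def dscp_to_tc(dscp: int, role: PortRole) -> int:
    """Map DSCP 2 or 6 to a traffic class according to the port role."""
    if dscp not in REMAPPED_DSCPS:
        raise ValueError(f"DSCP {dscp} is outside the scope of this sketch")
    if role is PortRole.UPLINK:
        # Uplink: DSCP 2 -> TC 2 and DSCP 6 -> TC 6.
        return dscp
    # Downlink: both DSCP values remain on TC 1.
    return 1


for role in PortRole:
    for dscp in REMAPPED_DSCPS:
        print(role.value, dscp, dscp_to_tc(dscp, role))
```

On a real switch this mapping lives in the QoS configuration rather than in code; the function form is used here only to make the uplink/downlink difference explicit.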

How to verify it

Unit test.

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211


@stephenxs
Collaborator Author

Building the vs image failed with the following error. Retrying.

fatal: No names found, cannot describe anything.
fatal: No names found, cannot describe anything.
fatal: No names found, cannot describe anything.
fatal: No names found, cannot describe anything.
fatal: No names found, cannot describe anything.

@stephenxs
Collaborator Author

/azpw run azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@stephenxs stephenxs force-pushed the dual-tor-dscp-remapping branch from 0bb04f8 to 4dc91f3 Compare November 23, 2022 06:02
@stephenxs
Collaborator Author

The build failed due to an environmental issue. Retriggering.

Traceback (most recent call last):
  File "/usr/lib/python3.9/shutil.py", line 806, in move
    os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: '/var/AzDevOps/.local/lib/python3.9/site-packages/tests/__init__.py' -> '/tmp/pip-uninstall-ypwbu40b/__init__.py'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/pip/_internal/cli/base_command.py", line 160, in exc_logging_wrapper
    status = run_func(*args)
  File "/usr/local/lib/python3.9/dist-packages/pip/_internal/commands/uninstall.py", line 98, in run
    uninstall_pathset = req.uninstall(
  File "/usr/local/lib/python3.9/dist-packages/pip/_internal/req/req_install.py", line 660, in uninstall
    uninstalled_pathset.remove(auto_confirm, verbose)
  File "/usr/local/lib/python3.9/dist-packages/pip/_internal/req/req_uninstall.py", line 373, in remove
    moved.stash(path)
  File "/usr/local/lib/python3.9/dist-packages/pip/_internal/req/req_uninstall.py", line 271, in stash
    renames(path, new_path)
  File "/usr/local/lib/python3.9/dist-packages/pip/_internal/utils/misc.py", line 311, in renames
    shutil.move(old, new)
  File "/usr/lib/python3.9/shutil.py", line 820, in move
    copy_function(src, real_dst)
  File "/usr/lib/python3.9/shutil.py", line 435, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/usr/lib/python3.9/shutil.py", line 264, in copyfile
    with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: '/var/AzDevOps/.local/lib/python3.9/site-packages/tests/__init__.py'
make: *** [slave.mk:817: target/python-wheels/bullseye/sonic_utilities-1.2-py3-none-any.whl] Error 2
make: *** Waiting for unfinished jobs....
[ finished ] [ target/python-wheels/bullseye/sonic_thermalctld-1.0-py3-none-any.whl ] 
[ finished ] [ target/python-wheels/bullseye/sonic_psud-1.0-py3-none-any.whl ] 
e9ac64c622531679b1968c428d1c98ed7ee618bd97cc59d7c711b7ebf3f57ef0
docker-config-engine-bullseye
[ finished ] [ target/docker-config-engine-bullseye.gz ] 
WARNING: Ignoring invalid distribution -onic-utilities (/var/AzDevOps/.local/lib/python3.9/site-packages)
make[1]: *** [Makefile.work:535: target/sonic-aboot-broadcom.swi] Error 2
make[1]: Leaving directory '/agent/_work/2/s'
make: *** [Makefile:41: target/sonic-aboot-broadcom.swi] Error 2

@stephenxs
Collaborator Author

/azpw run azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Mellanox-SN4600c-C64:
- 56 downlinks 50G + 8 uplinks 100G
Mellanox-SN4600c-D48C40/Mellanox-SN2700/Mellanox-SN2700-D48C8:
- 24 downlinks 50G + 8 uplinks 100G

Signed-off-by: Stephen Sun <stephens@nvidia.com>
…l ToR t0 test cases

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@liat-grozovik liat-grozovik merged commit e3ff088 into sonic-net:master Feb 7, 2023
@stephenxs stephenxs deleted the dual-tor-dscp-remapping branch February 7, 2023 14:50
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Feb 9, 2023
…c-net#12605)

@mssonicbld
Collaborator

Cherry-pick PR to 202205: #13745

yxieca pushed a commit that referenced this pull request Feb 10, 2023
…) (#13745)

Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
@stephenxs
Collaborator Author

@StormLiangMS Can you help cherry-pick this PR to 202211? Thanks

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Feb 13, 2023
…c-net#12605)

@mssonicbld
Collaborator

Cherry-pick PR to 202211: #13787
