Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sonic-swss changes for MPLS #1686

Merged
merged 2 commits into from
Jun 28, 2021
Merged

sonic-swss changes for MPLS #1686

merged 2 commits into from
Jun 28, 2021

Conversation

qbdwlr
Copy link
Contributor

@qbdwlr qbdwlr commented Mar 31, 2021

What I did
SONiC swss support for MPLS:

  1. RouteOrch support for SAI MPLS inseg
  2. NeighOrch support for MPLS NHs.
  3. CrmOrch support for MPLS inseg and NH accounting.
  4. New sonic-swss/tests/test_mpls.py for verification.

Why I did it
SONiC swss support for MPLS

How I verified it
Unit-tests in sonic-swss/tests/test_mpls.py and sonic-utilities/tests
System tests in sonic-mgmt

Details if related
Refer to HLD: sonic-net/SONiC#706

orchagent/crmorch.cpp Outdated Show resolved Hide resolved
fpmsyncd/routesync.cpp Outdated Show resolved Hide resolved
fpmsyncd/routesync.cpp Outdated Show resolved Hide resolved
@@ -1042,6 +1076,10 @@ bool IntfsOrch::addRouterIntfs(sai_object_id_t vrf_id, Port &port)
attr.value.u32 = port.m_mtu;
attrs.push_back(attr);

attr.id = SAI_ROUTER_INTERFACE_ATTR_ADMIN_MPLS_STATE;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check if mpls is supported similar to NAT below?

Copy link
Contributor Author

@qbdwlr qbdwlr Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny IntfsOrch::addRouterIntfs() has been updated to only include SAI_ROUTER_INTERFACE_ATTR_ADMIN_MPLS_STATE if non-default (ie, enable) is configured. Explicit configuration to enable MPLS presumes platform that supports MPLS. MPLS attr is no longer set on platforms unless it is explicitly part of the configuration.

Copy link

@FLYFLYFL FLYFLYFL Dec 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm puzzled that what the rif type SAI_ROUTER_INTERFACE_TYPE_MPLS_ROUTER is used for ? I didn't find it in current SWSS code. @qbdwlr

orchagent/neighorch.cpp Outdated Show resolved Hide resolved
orchagent/neighorch.cpp Outdated Show resolved Hide resolved

if (status != SAI_STATUS_ITEM_NOT_FOUND)
/* If we can't remove the next hop, return false. */
if (!removeNextHop(nexthop))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flow is slightly changed here. Previously we remove this nexthop from the map only after neighbor is successfully removed. What is the motivation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TACappleman Tom made these changes and can address your question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main part of this change is to move the actual deletion of the next hop over the SAI into the removeNextHop (so that when e.g. routeorch calls this method it does what you'd expect it to). It then makes sense for the deletion of the next hop from the map to happen at the point that it no longer exists on the SAI, and so the small functional change as a result of this set of changes seems reasonable. (Otherwise you could delete the next hop and fail to delete the neighbor, and so not record that the next hop has been deleted).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change it to original flow and have only the changes for mpls addition. Else its difficult to understand what is optimized here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny You need to re-examine the original code. You will see there is NextHop clean-up code in ::removeNeighbor instead of in ::removeNextHop. This only worked because for IP there is a 1-1 mapping between NextHop and Neighbor, so the NextHop was only removed when the Neighbor was also being removed. With MPLS NHs, this is no longer true. Now there is n-1 mapping between NH and Neighbor, so NextHop cleanup must occur within the context of ::removeNextHop to ascertain that NH is truly cleaned up. ::removeNeighbor can (and does) call :;removeNextHop to achieve its previous functionality.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previous flow was

  1. SAI Remove nexthop 2. SAI remove neighbor 3. Neighbor map remove 4. Nexthop Map remove

new flow:

  1. SAI Remove nexthop 2 Nexthop Map remove 3. SAI remove neighbor 4. Neighbor map remove

How does mpls n-1:1 mapping between nh and neighbor cover new flow ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abdosi With MPLS NHs, you can no longer assume that Neighbor is removed when NextHop is removed. I am puzzled as to why this flow does not remove Neighbor/Nexthop in the reverse order that Neighbor/Nexthop is created. Can you explain why addNextHop() has SAI API calls, but removeNextHop() does not? There seems to be some (possible unintentional) asymmetry here that needs to be explained.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are assumptions made on the existence of NH map if Neighbor is failed to remove. I'm unable to understand what is the n:1 relation here. If there is mpls specific nh that is not related to neighbor, it should be in a different function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny Per the discussion, I have reverted NeighOrch::removeNextHop and added separate NeighOrch::removeMplsNextHop.

@@ -1951,3 +2005,767 @@ bool RouteOrch::removeOverlayNextHops(sai_object_id_t vrf_id, const NextHopGroup
return true;
}

void RouteOrch::doLabelTask(Consumer& consumer)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is considerable code to routeorch. Suggest moving to a different file mplsrouteorch.cpp, @abdosi ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functionality to process LABEL_ROUTE_TABLE and ROUTE_TABLE is almost identical in nature. Both process NH/NHG association with label or prefix. Moving to a separate file file implies that the functionality is inherently different, which isn't true.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree but at the same time why do you need a separate task for label route? Other than SAI API what is different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved the MPLS related functions into a separate file: routeorch_mpls.cpp

Copy link
Collaborator

@prsunny prsunny May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets not have _ in filename. Suggest follow existing naming for orch files and rename to mplsrouteorch.cpp

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still it doesn't mean we need to have a different new naming convention.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still see this open. Please address this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have checked the file lengths in sonic-swss/orchagent and found that there are three other Orch components with file sizes larger than routeorch.cpp. Subdividing RouteOrch for file size reasons will not be done within the context of this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I dont prefer mpls changes to be present as part of routeorch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny Done

orchagent/routeorch.cpp Outdated Show resolved Hide resolved
@lguohan
Copy link
Contributor

lguohan commented Apr 9, 2021

missing pytest

fpmsyncd/routesync.cpp Outdated Show resolved Hide resolved
fpmsyncd/routesync.cpp Outdated Show resolved Hide resolved
@smaheshm
Copy link
Contributor

Can you update in the description your code changes at high level.

Copy link
Contributor

@lguohan lguohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

placeholder, we need to sonic-mgmt plan, otherwise, the feature cannot be tested end-to-end.

next_hop_attr.value.s32 = SAI_NEXT_HOP_TYPE_MPLS;
next_hop_attrs.push_back(next_hop_attr);

next_hop_attr.id = SAI_NEXT_HOP_ATTR_LABELSTACK;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add SAI_NEXT_HOP_ATTR_OUTSEG_TYPE with push/swap type. Default operation in SAI is swap

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kperumalbfn: If I program the following two routes in FRR: "ip route 20.20.1.1/32 10.0.0.3 label 2021" and "mpls lsp 2020 10.0.0.3 2021", how many SAI NHs are required? It seems like this OUTSEG_TYPE necessitates the NH to have knowledge of the routes reference it and the NH attr content can change depending on the type of route.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes @qbdwlr...in this case, there will be 2 MPLS nexthops in SAI. For IP route, MPLS outseg_nexthop with push operation and label 2021. For label 2020, MPLS outseg_nexthop with swap operation and same label 2021. Based on the lookup, select push/swap mpls nexthop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qbdwlr Can you also add an option to specify uniform/pipe mode for TTL and EXP in the DB which translates to MPLS nexthop's attribute SAI_NEXT_HOP_ATTR_OUTSEG_TTL_MODE? Default mode is uniform, but in some cases pipe mode is useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OUTSEG_TYPE has been added, but not OUTSEG_TTL_MODE since default of uniform is only used for current support.

Copy link
Collaborator

@prsunny prsunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As comments

*
* Return concatenation of NHs: nh0 + "," + nh1 + .... + "," + nhN
*/
string RouteSync::getNextHopList(struct rtnl_route *route_obj)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests for this API. For example, provide different values of "route_obj" and compare with the expected string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refer to the tests provided in sonic-swss/tests/test_route.py:TestMplsRoute

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are now moved to new location: sonic-swss/tests/test_route.py:TestMplsRouteFpm

fpmsyncd/routesync.cpp Outdated Show resolved Hide resolved
cfgmgr/intfmgr.cpp Outdated Show resolved Hide resolved
fpmsyncd/routesync.cpp Outdated Show resolved Hide resolved
fpmsyncd/routesync.cpp Outdated Show resolved Hide resolved

daddr = rtnl_route_get_dst(route_obj);
nl_addr2str(daddr, destaddr, MAX_ADDR_SIZE);
SWSS_LOG_INFO("Receive new route message dest addr: %s", destaddr);
Copy link
Contributor

@smaheshm smaheshm Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"new route" -> "new label route". Would be easier for debugging. Also include the action.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onRouteMsg is used for IP routes, not label routes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API is for Label routes only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smahesh You are correct. I will fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smahesh This is fixed.

fpmsyncd/routesync.cpp Outdated Show resolved Hide resolved
Comment on lines 814 to 816
string nexthops = getNextHopList(route_obj);
string ifnames = getNextHopIf(route_obj);
Copy link
Contributor

@smaheshm smaheshm Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's optional, but I think these two should be in one API that returns a list of Object, which can have "getNexthopList().str()" and "getNextHopIf().str()". May be a list of 'NextHopKey' objects.

fpmsyncd/routesync.cpp Outdated Show resolved Hide resolved
*
* Return concatenation of NHs: nh0 + "," + nh1 + .... + "," + nhN
*/
string RouteSync::getNextHopList(struct rtnl_route *route_obj)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explore if you can resuse "getNextHopGw()" to reduce code duplication?

@@ -13,22 +15,40 @@ struct NextHopKey
{
IpAddress ip_address; // neighbor IP address
string alias; // incoming interface alias
sai_next_hop_type_t nh_type; // NH type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole issue with force-push is we lost the commit history and I'm not sure on which commit introduced this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny Are you saying that the nh_type is an issue now? This is something that @smaheshm specifically asked for. I am not sure why having the commit history would be helpful here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a big change; we cannot review the entire code for each commit. So individual commits helps in reviewing the incremental code. Not sure why you've to do force-push multiple times

@qbdwlr qbdwlr requested review from smaheshm and prsunny June 23, 2021 20:44
@smaheshm
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@smaheshm
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@lguohan lguohan dismissed their stale review June 25, 2021 16:44

sonic-mgmt tests added

@@ -131,45 +118,30 @@ struct NextHopKey

bool isMplsNextHop() const
{
return (nh_type == SAI_NEXT_HOP_TYPE_MPLS);
return (!label_stack.empty());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@smaheshm
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@smaheshm smaheshm merged commit e86b900 into sonic-net:master Jun 28, 2021
liat-grozovik pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Jun 29, 2021
Advance submodule head for sonic-swss

3226163 [BufferOrch] Don't call SAI API for BUFFER_POOL/PROFILE handling in case the op is DEL and the SAI OID is NULL (sonic-net/sonic-swss#1786)
6c88e47 [Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking (sonic-net/sonic-swss#1781)
e86b900 [MPLS] sonic-swss changes for MPLS (sonic-net/sonic-swss#1686)
4c8e2b5 [Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization (sonic-net/sonic-swss#1776)
3602124 [VS test stability] Skip flaky test for DPB (sonic-net/sonic-swss#1807)
c37cc1c Support for in-band-mgmt via management VRF (sonic-net/sonic-swss#1726)
1e3a532 Fix config prompt question issue (sonic-net/sonic-swss#1799)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
abdosi pushed a commit that referenced this pull request Aug 5, 2021
VOQ nexthop for remote neighbors should be created on local inband port only for the kernel purpose. SAI should use actual RIF of the remote system port interface. #1686 seems to be break this condition and this change address it.
carl-nokia pushed a commit to carl-nokia/sonic-buildimage that referenced this pull request Aug 7, 2021
Advance submodule head for sonic-swss

3226163 [BufferOrch] Don't call SAI API for BUFFER_POOL/PROFILE handling in case the op is DEL and the SAI OID is NULL (sonic-net/sonic-swss#1786)
6c88e47 [Dynamic Buffer Calc][Mellanox] Bug fixes and enhancements for the lua plugins for buffer pool calculation and headroom checking (sonic-net/sonic-swss#1781)
e86b900 [MPLS] sonic-swss changes for MPLS (sonic-net/sonic-swss#1686)
4c8e2b5 [Dynamic Buffer Calc] Avoid creating lossy PG for admin down ports during initialization (sonic-net/sonic-swss#1776)
3602124 [VS test stability] Skip flaky test for DPB (sonic-net/sonic-swss#1807)
c37cc1c Support for in-band-mgmt via management VRF (sonic-net/sonic-swss#1726)
1e3a532 Fix config prompt question issue (sonic-net/sonic-swss#1799)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
judyjoseph pushed a commit that referenced this pull request Aug 18, 2021
VOQ nexthop for remote neighbors should be created on local inband port only for the kernel purpose. SAI should use actual RIF of the remote system port interface. #1686 seems to be break this condition and this change address it.
raphaelt-nvidia pushed a commit to raphaelt-nvidia/sonic-swss that referenced this pull request Oct 5, 2021
SONiC swss support for MPLS:

* RouteOrch support for SAI MPLS inseg
* NeighOrch support for MPLS NHs.
* CrmOrch support for MPLS inseg and NH accounting.
* New sonic-swss/tests/test_mpls.py for verification.

Unit-tests in sonic-swss/tests/test_mpls.py and sonic-utilities/tests
raphaelt-nvidia pushed a commit to raphaelt-nvidia/sonic-swss that referenced this pull request Oct 5, 2021
…c-net#1823)

VOQ nexthop for remote neighbors should be created on local inband port only for the kernel purpose. SAI should use actual RIF of the remote system port interface. sonic-net#1686 seems to be break this condition and this change address it.
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
#### What I did
Fix `NameError` from `show version`.
```
admin@str2-7050cx3-acs-02:~$ show version
Traceback (most recent call last):
  File "/usr/local/bin/show", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/show/main.py", line 960, in version
    chassis_info = platform.get_chassis_info()
  File "/usr/local/lib/python3.7/dist-packages/show/platform.py", line 27, in get_chassis_info
    platform_chassis = sonic_platform.platform.Platform().get_chassis()
NameError: name 'sonic_platform' is not defined
```

#### How I did it
Import `sonic_platform` before using `sonic_platform`.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants