Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fabricportsorch] Collect counters for fabric links #1944

Merged
merged 1 commit into from
Oct 5, 2022

Conversation

mlorrillere
Copy link
Contributor

@mlorrillere mlorrillere commented Oct 6, 2021

This change is enabling collections of fabric link counters as port and queue stats.

Currently only fabric asics can be collected, so collection is disabled for switch asics.
The reasons are that BCM SAI supports up to 256 logical ports, which is not enough
for ports + fabric lanes, and fabric queue counters are not supported by SAI at
this moment.

The following sonic-swss-common PR is required: PR #551
sonic-utilities PR #1860 adds the
corresponding show fabric counters CLI command.

@rlhui
Copy link
Contributor

rlhui commented Oct 20, 2021

@skeesara-nokia , @vganesan-nokia - please help review, thanks.

@rlhui rlhui requested review from abdosi and arlakshm October 20, 2021 17:31
orchagent/fabricportsorch.cpp Outdated Show resolved Hide resolved
orchagent/fabricportsorch.cpp Outdated Show resolved Hide resolved
}
int num_queues = attr.value.u32;

if (num_queues > 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will there be any chance we'll get num_queues <= 0? If yes, please add logic for error logging. If not, we do not need this check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely >= 0 as the return value of the SAI call is u32. I think we should expect a value >= 1. I'll check if that is correct and update the code accordingly.

@@ -636,6 +636,9 @@ int main(int argc, char **argv)
if (gMySwitchType == "voq")
{
orchDaemon->setFabricEnabled(true);
// SAI doesn't fully support counters for non fabric asics
orchDaemon->setFabricPortStatEnabled(false);
orchDaemon->setFabricQueueStatEnabled(false);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are initialized as "true" in declaration (line#s 77 and 78 below). For regular switch (i.e, gMySwitchType is not "fabric" and not "voq") are we going to have these m_fabricPortStatEnabled, m_fabricQueueStatEnabled unchanged?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FabricPortsOrch is only created when gMySwitchType is "voq" or "fabric". These settings are set here to be used when FabricPortsOrch is instanciated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check the build failure?

We will need sonic-net/sonic-swss-common#551 to merge for build to pass.

@@ -42,7 +45,8 @@ static const vector<sai_queue_stat_t> queue_stat_ids =
SAI_QUEUE_STAT_CURR_OCCUPANCY_LEVEL,
};

FabricPortsOrch::FabricPortsOrch(DBConnector *appl_db, vector<table_name_with_pri_t> &tableNames) :
FabricPortsOrch::FabricPortsOrch(DBConnector *appl_db, vector<table_name_with_pri_t> &tableNames,
bool fabricPortStatEnabled, bool fabricQueueStatEnabled) :
Orch(appl_db, tableNames),
port_stat_manager(FABRIC_PORT_STAT_COUNTER_FLEX_COUNTER_GROUP, StatsMode::READ,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have unit tests for fabric port flex counter groups similar to test_flex_counter.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't. Let me see if I can add FABRIC_PORT_STAT_COUNTER for test_flex_counter.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The VirtualChassisTopology doesn't have any fabric asics defined, so it's not a simple change to test anything for switch_type=fabric in sonic-swss.

I'll work separately from this PR into updating virtual chassis topology and adding a test_fabric_counters.py test.

@mlorrillere mlorrillere force-pushed the pr-fabric-counters branch 2 times, most recently from f6a6770 to 3f08b7e Compare November 8, 2021 22:38
vganesan-nokia
vganesan-nokia previously approved these changes Nov 10, 2021
@arlakshm
Copy link
Contributor

Can you check the build failure?

}
port_stat_manager.setCounterIdList(port, CounterType::PORT, counter_stats);
}
m_portNamePortCounterTable->set("", portNamePortCounterMap);
}

void FabricPortsOrch::generateQueueStats()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is function generateQueueStats used anywhere ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generateQueueStats is called by FlexCounterOrch. This code was already in place.

@arlakshm
Copy link
Contributor

/Azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mlorrillere
Copy link
Contributor Author

Can you check the build failure?

We will need sonic-net/sonic-swss-common#551 to merge to get a passing build.

@prsunny
Copy link
Collaborator

prsunny commented May 4, 2022

Please add VS tests

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Aug 2, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: mlorrillere / name: Maxime Lorrillere (dde035b)

@mlorrillere
Copy link
Contributor Author

/easycla

@abdosi
Copy link
Contributor

abdosi commented Aug 17, 2022

@mlorrillere please check on code coverage failures.

@rlhui
Copy link
Contributor

rlhui commented Sep 20, 2022

@mlorrillere - any update?

@mlorrillere
Copy link
Contributor Author

@mlorrillere - any update?

Hi @rlhui
I was waiting for sonic-buildimage PR #11997 to merge before updating this review.

It looks like there was a duplicate sonic-buildimage PR #12098 merged a few days ago. I'll update this PR shortly with test coverage.

@mlorrillere mlorrillere reopened this Sep 20, 2022
@mlorrillere
Copy link
Contributor Author

Tagging @jfeng-arista

@mlorrillere
Copy link
Contributor Author

@rlhui - all checks have passed.

Counters are port stats and queue stats. Currently only fabric asics
could be collected. J2 fabric counter collection doesn't work yet.
J2 fabric port counters fail to be collected due to logical port id
for fabric links is set up to 512 while SAI supports at most 256 ports.
J2 fabric queue counters are not supported by SAI at this moment (BCM
confirmed).

Signed-off-by: Maxime Lorrillere <mlorrillere@arista.com>
@kenneth-arista
Copy link
Contributor

@arlakshm @rlhui Can we merge this?

@arlakshm arlakshm merged commit 31c9321 into sonic-net:master Oct 5, 2022
yxieca pushed a commit that referenced this pull request Oct 25, 2022
Counters are port stats and queue stats. Currently only fabric asics
could be collected. J2 fabric counter collection doesn't work yet.
J2 fabric port counters fail to be collected due to logical port id
for fabric links is set up to 512 while SAI supports at most 256 ports.
J2 fabric queue counters are not supported by SAI at this moment (BCM
confirmed).

Signed-off-by: Maxime Lorrillere <mlorrillere@arista.com>

Signed-off-by: Maxime Lorrillere <mlorrillere@arista.com>
liat-grozovik pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Nov 10, 2022
Advance sonic-swss submodule to pick up new commits:

dbdf31c [counters] Improve performance by polling only configured ports buffer queue/pg counters sonic-net/sonic-swss#2473
ab4f804 [portsorch] remove port OID from saiOidToAlias map on port deletion sonic-net/sonic-swss#2483
ab29920 [QoS] Support dynamic headroom calculation for Barefoot platforms sonic-net/sonic-swss#2412
15beee4 Add support for voq counters in portsorch. sonic-net/sonic-swss#2467
c8d4905 [vlanmgr] Disable arp_evict_nocarrier for vlan host intf sonic-net/sonic-swss#2469
31c9321 [chassis][voq]Collect counters for fabric links sonic-net/sonic-swss#1944

Signed-off-by: Kebo Liu <kebol@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

8 participants