-
Notifications
You must be signed in to change notification settings - Fork 547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AclOrch] aclOrch enhancement to handle LAG port not configured case #494
Conversation
orchagent/main.cpp
Outdated
@@ -249,8 +249,9 @@ int main(int argc, char **argv) | |||
/* Initialize orchestration components */ | |||
DBConnector *appl_db = new DBConnector(APPL_DB, DBConnector::DEFAULT_UNIXSOCKET, 0); | |||
DBConnector *config_db = new DBConnector(CONFIG_DB, DBConnector::DEFAULT_UNIXSOCKET, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new [](start = 29, length = 3)
There are memory leak risk at extreme memory situation. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qiluo-msft Hi Qi, could you elaborate the concern here? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several 'new' statements here. If memory is used up in the middle, there will be memory leak issue.
The bug should be here for a long time, and your added code make it worse a little bit.
In reply to: 186248077 [](ancestors = 186248077)
orchagent/aclorch.h
Outdated
@@ -248,6 +248,10 @@ class AclTable { | |||
std::map<sai_object_id_t, sai_object_id_t> ports; | |||
// Map rule name to rule data | |||
map<string, shared_ptr<AclRule>> rules; | |||
// Set to store the ACL table port alias | |||
set<string> portListsSet; | |||
// Set tje store the not cofigured ACL table port alias |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo #Closed
orchagent/aclorch.cpp
Outdated
} | ||
else | ||
{ | ||
it = consumer.m_toSync.erase(it); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it = consumer.m_toSync.erase(it); [](start = 12, length = 33)
erase is common for all cases, so you can do it outside the if-else. #Closed
orchagent/aclorch.cpp
Outdated
@@ -1740,6 +1812,7 @@ bool AclOrch::processPorts(string portsList, std::function<void (sai_object_id_t | |||
split(portsList, strList, ','); | |||
|
|||
set<string> strSet(strList.begin(), strList.end()); | |||
aclTable.portListsSet = strSet; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
portListsSet [](start = 13, length = 12)
Call it portSet? #Closed
orchagent/aclorch.cpp
Outdated
|
||
return true; | ||
} | ||
|
||
bool AclOrch::processAclTableType(string type, acl_table_type_t &table_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
processAclTableType [](start = 14, length = 19)
Lots of repeated code, you can call one from the other. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I mean processPorts() and processPendingPort() share some code, it is possible to call one from the other?
In reply to: 186252087 [](ancestors = 186252087)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qiluo-msft They do share some same code, I also have considered calling processPendingPort(previously processSinglePort) in processPorts. But if we look into the whole procedure, processPorts only "link" ports to tables and the action to "bind" table to ports are not included.
For processPendingPort, it's dedicated to receiving the notification and handle the pending port, in this one function "link" and "bind" all performed. To keep the code more straightforward and readable I decide to have these two functions separated through will have some slight redundant code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can move aclTable.bind out of the processPendingPort() function and make the code reuse.
you can do aclTable.bind in doAclTablePortUpdateTask().
In reply to: 186911317 [](ancestors = 186911317)
orchagent/aclorch.cpp
Outdated
{ | ||
auto table = itmap.second; | ||
|
||
if (table.pendingPortListSet.find(port_alias) != table.pendingPortListSet.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pendingPortListSet [](start = 26, length = 18)
Call it pendingPortSet? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As comments.
@qiluo-msft all comments handled, please check the latest commit. #Closed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As comments.
|
||
OrchDaemon *orchDaemon = new OrchDaemon(appl_db, config_db); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we would need to have orchDaemon allocate from stack. This could still be retained as it is (allocate via new
) so we wouldn't have to worry about the class holding more data in future just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revised.
orchagent/aclorch.cpp
Outdated
|
||
if (table.pendingPortSet.find(port_alias) != table.pendingPortSet.end()) | ||
{ | ||
SWSS_LOG_NOTICE("found the port: %s in ACL table: %s pending port list, bind it to ACL table.", port_alias.c_str(), table.description.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to put this as INFO instead of NOTICEs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revised.
orchagent/aclorch.cpp
Outdated
for (auto itmap : m_AclTables) | ||
{ | ||
auto table = itmap.second; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove the extra lines. There are few other redundant lines as well in the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted.
orchagent/aclorch.cpp
Outdated
@@ -1898,18 +2024,17 @@ sai_status_t AclOrch::bindAclTable(sai_object_id_t table_oid, AclTable &aclTable | |||
sai_status_t status = SAI_STATUS_SUCCESS; | |||
|
|||
SWSS_LOG_INFO("%s table %s to ports", bind ? "Bind" : "Unbind", aclTable.id.c_str()); | |||
|
|||
if (aclTable.portSet.empty()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this condition will be ever true in this flow. Can you check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are correct, I removed this flow.
orchagent/aclorch.cpp
Outdated
{ | ||
return SAI_STATUS_SUCCESS; | ||
} | ||
SWSS_LOG_WARN("Binding port list is empty for %s table", aclTable.id.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be a normal flow to delete aclTable which is not yet bind to any port list. We wouldn't want to log warning if it is in the delete flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
condition check added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also get Prince's approval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes looks good. Could you please add/update VS test cases for ACL to verify this changes?
As for VS, it should be added but for now we have the testbed to cover this flow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need vs switch test.
3 new VS test case added
|
orchagent/aclorch.cpp
Outdated
{ | ||
SWSS_LOG_ENTER(); | ||
|
||
sai_object_id_t port_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sai_object_id_t port_id == SAI_NULL_OBJECT_ID; do it here. then you do not need to assign later in the function.
orchagent/aclorch.cpp
Outdated
if (port.m_lag_member_id != SAI_NULL_OBJECT_ID) | ||
{ | ||
SWSS_LOG_ERROR("Failed to process port. Bind table to LAG member %s is not allowed", alias.c_str()); | ||
port_id = SAI_NULL_OBJECT_ID; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
port_id = SAI_NULL_OBJECT_ID; [](start = 12, length = 29)
remove this.
orchagent/aclorch.cpp
Outdated
if (table.portSet.find(port_alias) != table.portSet.end()) | ||
{ | ||
table.pendingPortSet.emplace(port_alias); | ||
SWSS_LOG_WARN("Add deleted port: %s to the pending list of ACL table: %s", port_alias.c_str(), table.description.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this warning level?
orchagent/aclorch.cpp
Outdated
table.pendingPortSet.emplace(port_alias); | ||
SWSS_LOG_WARN("Add deleted port: %s to the pending list of ACL table: %s", port_alias.c_str(), table.description.c_str()); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the scenario here? when a port is removed, then to update the acl table? In this case, you need to unbind the port from acl table, and then put the port to pending list. You cannot simply put the port into the pending list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for this one maybe need to marked as TODO here. if one port or LAG deleted, do will still have the chance to unbind it? I mean maybe some DB entries already destroyed and for SDK level all the configuration should already be cleared?
orchagent/aclorch.cpp
Outdated
string port_alias = key.substr(0, found); | ||
string op = kfvOp(t); | ||
|
||
SWSS_LOG_INFO("doAclTablePortUpdateTask: OP: %s, port_alias: %s", op.c_str(), port_alias.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SWSS_LOG_INFO("doAclTablePortUpda [](start = 7, length = 34)
this is debug level message.
orchagent/aclorch.cpp
Outdated
SWSS_LOG_ENTER(); | ||
|
||
sai_object_id_t port_id; | ||
|
||
vector<string> strList; | ||
|
||
SWSS_LOG_INFO("Processing ACL table port list %s", portsList.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SWSS_LOG_INFO(" [](start = 4, length = 15)
this is debug level message.
orchagent/aclorch.cpp
Outdated
@@ -1754,30 +1856,50 @@ bool AclOrch::processPorts(string portsList, std::function<void (sai_object_id_t | |||
Port port; | |||
if (!gPortsOrch->getPort(alias, port)) | |||
{ | |||
SWSS_LOG_ERROR("Failed to process port. Port %s doesn't exist", alias.c_str()); | |||
return false; | |||
SWSS_LOG_WARN("Port %s not configured yet, add it to ACL table %s pending list", alias.c_str(), aclTable.description.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SWSS_LOG_WARN [](start = 12, length = 13)
why warning level? this should be info level since this is a common situation your code is handling, so this should not be warning level.
orchagent/aclorch.cpp
Outdated
{ | ||
SWSS_LOG_ENTER(); | ||
|
||
SWSS_LOG_INFO("Processing ACL table port %s", portAlias.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SWSS_LOG_INFO [](start = 4, length = 13)
debug level.
orchagent/aclorch.cpp
Outdated
Port port; | ||
if (!gPortsOrch->getPort(portAlias, port)) | ||
{ | ||
SWSS_LOG_WARN("Port %s not configured yet, add it to ACL table %s pending list", portAlias.c_str(), aclTable.description.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SWSS_LOG_WARN [](start = 8, length = 13)
info level.
tests/test_acl.py
Outdated
self.verify_acl_group_member(adb, 0, test_acl_table_id) | ||
|
||
# check port binding | ||
self.verify_acl_lag_binding(adb, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this after port channel member creation, before state db creation.
tests/test_acl.py
Outdated
keys = atbl.getKeys() | ||
assert len(keys) == 0 | ||
|
||
def verify_acl_group(self, adb, expt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expt [](start = 36, length = 4)
rename to "acl_group_num" for clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already changed?
tests/test_acl.py
Outdated
self.verify_acl_group(adb, 2) | ||
|
||
# check acl table group member | ||
self.verify_acl_group_member(adb, 2, test_acl_table_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verify_acl_group_member [](start = 13, length = 23)
this function signature should be
verify_acl_group_member(adb, [acl_group_ids], acl_table_id)
it should verify the acl_groups have corresponding acl_group_member which has the acl_table_id.
tests/test_acl.py
Outdated
assert set(port_groups) == set(acl_table_groups) | ||
|
||
|
||
def verify_acl_lag_binding(self, adb, expt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verify_acl_lag_binding(self, adb, expt): [](start = 8, length = 40)
like the verifying the port binding, we need to give the lag name as input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find out how a lag port name
was mapped to a SAI_OBJECT_TYPE_LAG_MEMBER in ASIC DB, unless you create one LAG and then know that the only SAI_OBJECT_TYPE_LAG_MEMBER record in ASIC DB is yours?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests/test_acl.py
Outdated
|
||
# check port binding | ||
|
||
def verify_acl_port_binding(self, dvs, adb, expt, bind_ports): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expt [](start = 48, length = 4)
you do not need give expt here, just use the len(bind_ports)
tests/test_acl.py
Outdated
assert set(port_groups) == set(acl_table_groups) | ||
|
||
|
||
def verify_acl_lag_binding(self, adb, expt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expt [](start = 42, length = 4)
change to bind_ports as well, do not need expt here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for lag case it different with port case, bind_ports len may not equal to the binding lag number if some LAG not configured yet, so I think we still need a expt?
tests/test_acl.py
Outdated
|
||
# check acl table group member | ||
|
||
def verify_acl_group_member(self, adb, expt, test_acl_table_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expt [](start = 43, length = 4)
change to acl_group_id list.
orchagent/aclorch.h
Outdated
bool validateAclTable(AclTable &aclTable); | ||
sai_object_id_t getValidPortId(string alias, Port port); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getValidPortId [](start = 20, length = 14)
to reflect the true meaning, it is better to rename it to getBindPortId. Besides, it makes more sense to move this into portsyncd since it is retrieving a port property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed and move it to portOrch.
orchagent/portsorch.cpp
Outdated
break; | ||
default: | ||
SWSS_LOG_ERROR("Failed to process port. Incorrect port %s type %d", alias.c_str(), port.m_type); | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align spaces
tests/test_acl.py
Outdated
self.verify_acl_group_member(adb, acl_group_ids, test_acl_table_id) | ||
|
||
# check port binding | ||
self.verify_acl_lag_binding(adb, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- [](start = 40, length = 3)
use lag id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…ed ACL table configuration (#1712) * Fix minigraph parser issue when handling LAG related ACL table configuration * rephrase the warning message. * pick up swss change in sonic-net/sonic-swss#494
…494) * enhance acl orch to handle the LAG port not configured case * rename variable, function; redunce redundant code; fix memory issue * revise according to comments * one more blank line * add new VS test cases for acl on LAG * update with PR comments fix * rename getValidPortId and move it to portOrch * fix indent * fix test case comments
…ed ACL table configuration (sonic-net#1712) * Fix minigraph parser issue when handling LAG related ACL table configuration * rephrase the warning message. * pick up swss change in sonic-net/sonic-swss#494
…ed ACL table configuration (#1712) * Fix minigraph parser issue when handling LAG related ACL table configuration * rephrase the warning message. * pick up swss change in sonic-net/sonic-swss#494
Signed-off-by: Wenda Ni <wenni@microsoft.com>
What I did
Why I did it
it's possible that LAG port configured later or not configured yet while ACL table created, in this case we can add this LAG to the pending list and wait for it created, instead of fail the whole ACL table creating.
How I verified it
manual test, run ACL and Everflow test case on T1 and T1-LAG topo.
Details if related