Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DPB:orchagent:portsorch] Rework port dependency (#23) #1219

Merged
merged 3 commits into from
Mar 20, 2020

Conversation

vasant17
Copy link
Contributor

@vasant17 vasant17 commented Mar 6, 2020

  • Use SAI REDIS return code to track dependency on port.

What I did
While deleting a port, if SAI call return OBJECT_IN_USE error code, we will continue to wait/loop in orchagent to delete the port. Once all the dependencies are removed, we will go-ahead remove the port. Note that dependency tracking itself is done in SAI REDIS layer. Also note that identifying dependencies and removing the dependencies is done in config mgmt layer.

Why I did it
As part Dynamic Port Breakout feature, we need to delete and re-create ports. During deletion of a port, we need to ensure that all dependencies are also removed. For example, port being part of ACL, VLAN, QoS, etc. Initial approach was to track these dependencies in orchagent itself. But after discussion with Microsoft team, we decided that we can use the dependency check that is already being done at the SAI REDIS layer. So, here we are going to check the return error code of SAI call to deduce if the port has any dependency.

How I verified it
Tested Dynamic Port Breakout code with port in VLAN and ACL tables. Verified from syslog that we SAI redis call to remove/delete the port succeeds when the dependencies are removed.

Details if related

@vasant17 vasant17 changed the title Rework port dependency (#23) [DPB:orchagent] Rework port dependency (#23) Mar 6, 2020
@vasant17 vasant17 changed the title [DPB:orchagent] Rework port dependency (#23) [DPB:orchagent:portsorch] Rework port dependency (#23) Mar 6, 2020
@vasant17 vasant17 force-pushed the DPB_DEPENDENCY_CHECK branch from 045b2f5 to 6eee225 Compare March 6, 2020 07:55
@lguohan lguohan requested a review from qiluo-msft March 6, 2020 21:40
@zhenggen-xu
Copy link
Collaborator

@vasant17 Please resolve the conflicts, and list this PR sonic-net/sonic-sairedis#565 as dependency.

@vasant17
Copy link
Contributor Author

retest this please

orchagent/portsorch.cpp Outdated Show resolved Hide resolved
// Port has one or more dependencies, cannot remove
SWSS_LOG_WARN("Cannot to remove port because of dependency");
// Bridge port OID is set on a port as long as
// port is part of at-least one VLAN.
Copy link
Contributor

@qiluo-msft qiluo-msft Mar 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VLAN [](start = 48, length = 4)

how about vxlan? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool PortsOrch::removePort(sai_object_id_t port_id)

The function behavior is weird:

  1. if removed, return true
  2. if in use, return false
  3. if other error, throw.

Suggest return an enum, or sai_status_t. No throw.

Refers to: orchagent/portsorch.cpp:1569 in ca4812c. [](commit_id = ca4812c, deletion_comment = False)

Half heartedly, I agree. I am NOT too happy because, we always handle the sai_status in the inner most function (where we call sai API's) and return only true of false to the calling function. And its pretty consistent throughout the code and I liked it. But here I am making an exception by returning sai_status to the caller. I did think about introducing our own enum, but that seemed like over kill. So, I will stick to this for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VLAN [](start = 48, length = 4)

how about vxlan?

For VxLAN I do NOT see any direct dependency on physical port. If your question is more about handling tunnel interface dependency on physical port, that is NOT in scope of this PR. Because, we expect every dependency to be tracked in SAI REDIS layer. Please let me know if I dint interpret your comment correctly


if (m_portList[alias].has_dependency())
Copy link
Contributor

@qiluo-msft qiluo-msft Mar 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (m_portList[alias].has_dependency()) [](start = 12, length = 39)

We can still keep this if-block. No harm to check dependency in orchagent first. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. we had a lengthy discussion on this with Guhan and Renuka. Will rely on SAI REDIS return code and do NOT maintain any code in orchagent to track the dependency. In-fact I am removing the dependency tracking code (which went in as part of ACL/DPB code) in this PR which

orchagent/portsorch.cpp Outdated Show resolved Hide resolved
@qiluo-msft
Copy link
Contributor

qiluo-msft commented Mar 12, 2020

        m_portList.erase(alias);

We should move this part into deInitPort(). Then no need to add m_init field. @zhenggen-xu, please check. #Closed


Refers to: orchagent/portsorch.cpp:2304 in dc8477f. [](commit_id = dc8477f, deletion_comment = False)

@qiluo-msft
Copy link
Contributor

qiluo-msft commented Mar 12, 2020

bool PortsOrch::removePort(sai_object_id_t port_id)

The function behavior is weird:

  1. if removed, return true
  2. if in use, return false
  3. if other error, throw.

Suggest return an enum, or sai_status_t. No throw. #Closed


Refers to: orchagent/portsorch.cpp:1569 in dc8477f. [](commit_id = dc8477f, deletion_comment = False)

if (status != SAI_STATUS_OBJECT_IN_USE)
{
SWSS_LOG_ERROR("Failed to remove port %" PRIx64 ", rv:%d", port_id, status);
throw runtime_error("Delete port failed");
Copy link
Contributor

@qiluo-msft qiluo-msft Mar 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw runtime_error("Delete port failed"); [](start = 12, length = 42)

Treat it as an error, but not to crash the orchagent. #Closed

Copy link
Contributor Author

@vasant17 vasant17 Mar 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things:

  1. I did NOT want to change the existing behavior.
  2. We need to think if this can ever happen, if our theory says that it can never happen, we should throw a runtime_error.

At this point in time, if a port does NOT have any dependency, I do NOT see any reason why a removal/deletion would fail (please let me know if you see one). If at all it happens, I would prefer to catch it right then and there instead of letting the system go in in-consistent state.

@qiluo-msft qiluo-msft requested a review from zhenggen-xu March 12, 2020 23:19
Copy link
Contributor

@qiluo-msft qiluo-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As comments

* Use SAI REDIS return code to track dependency on port.
@vasant17
Copy link
Contributor Author

        m_portList.erase(alias);

We should move this part into deInitPort(). Then no need to add m_init field. @zhenggen-xu, please check.

Refers to: orchagent/portsorch.cpp:2304 in ca4812c. [](commit_id = ca4812c, deletion_comment = False)

No. That wont work. Because, if a port is present in m_portList, it means an object for it exists in SAI REDIS. Suppose, if you remove it in deInit and subsequent call to port removal fails due to OBJECT_IN_USE, then while removing the dependency, other modules(VLAN, ACL, etc) think the port does NOT exist(because they call getPort()) and they will never remove the dependency.

@vasant17 vasant17 force-pushed the DPB_DEPENDENCY_CHECK branch from ca4812c to dc8477f Compare March 13, 2020 07:11
orchagent/portsorch.cpp Outdated Show resolved Hide resolved
@lguohan
Copy link
Contributor

lguohan commented Mar 16, 2020

retest this please

@sonic-net sonic-net deleted a comment from qiluo-msft Mar 16, 2020
@vasant17
Copy link
Contributor Author

retest this please

@lguohan lguohan merged commit 29dc62c into sonic-net:master Mar 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants