[Reclaim buffer][202012] Reclaim unused buffer for dynamic buffer model #1985
Merged
liat-grozovik
merged 4 commits into
sonic-net:202012
from
stephenxs:reclaim-buffer-202012
Dec 9, 2021
stephenxs
changed the title
[Reclaim buffer][202012] Reclaim unused buffer for both traditional and dynamic buffer model
[Reclaim buffer][202012] Reclaim unused buffer for dynamic buffer model
Dec 3, 2021
stephenxs
force-pushed
the
reclaim-buffer-202012
branch
3 times, most recently
from
December 3, 2021 13:52
fc501ae
to
1cfa3b2
Compare
This is to backport community PR 1910 to the 202012 branch.

**What I did**

Reclaim reserved buffer of unused ports for both dynamic and traditional models. This is done by

- Removing lossless priority groups on unused ports.
- Applying zero buffer profiles on the buffer objects of unused ports.
  - In the dynamic buffer model, the zero profiles are loaded from a JSON file and applied to `APPL_DB` if there are admin-down ports. The default buffer configuration is configured on all ports, and the buffer manager applies zero profiles on admin-down ports.
  - In the static buffer model, the zero profiles are loaded by the buffer template.

Signed-off-by: Stephen Sun <stephens@nvidia.com>

**Why I did it**

**How I verified it**

Regression test and vs test.

**Details if related**

***Static buffer model***

Remove the lossless buffer priority group if the port is admin-down and the buffer profile aligns with the speed and cable length of the port.

***Dynamic buffer model***

****Handle zero buffer pools and profiles****

1. buffermgrd: add a CLI option to load the JSON file for zero profiles.
2. Load them from the JSON file into the buffer manager's internal data structures.
3. Apply them to `APPL_DB` once there is at least one admin-down port.
   - Record the zero profiles' names in the pool objects they reference. By doing so, the zero profile lists can be constructed according to the normal profile list. There should be one profile for each pool on the ingress/egress side.
   - Then apply the zero profiles to the buffer objects of the port.
   - Unload them from `APPL_DB` once all ports are admin-up, since the zero pools and profiles are no longer referenced. Remove the buffer pool counter ID when the zero pool is removed.
4. Now that a pool can be removed from the system, the watermark counter of the pool is removed before the pool itself is removed.

****Handle port admin status change****

1. Currently, there is logic that removes the buffer priority groups of admin-down ports. This logic is reused and extended to all buffer objects, including `BUFFER_QUEUE`, `BUFFER_PORT_INGRESS_PROFILE_LIST`, and `BUFFER_PORT_EGRESS_PROFILE_LIST`.
   - When the port is admin down,
     - The normal profiles are removed from the buffer objects of the port.
     - The zero profiles, if provided, are applied to the port.
   - When the port is admin up,
     - The zero profiles, if applied, are removed from the port.
     - The normal profiles are applied to the port.
2. Ports orchagent exposes the number of queues and priority groups to `STATE_DB`. The buffer manager can use these values to apply zero profiles on all the priority groups and queues of the admin-down ports. If it is not necessary to apply zero profiles on all priority groups or queues on a certain platform, `ids_to_reclaim` can be customized in the JSON file.
3. Handle all buffer tables, including `BUFFER_PG`, `BUFFER_QUEUE`, `BUFFER_PORT_INGRESS_PROFILE_LIST`, and `BUFFER_PORT_EGRESS_PROFILE_LIST`.
   - Originally, only the `BUFFER_PG` table was cached in the dynamic buffer manager.
   - Now, all tables are cached in order to apply zero profiles when a port is admin down and normal profiles when it is admin up.
   - The index of such tables can include a single port or a list of ports, like `BUFFER_PG|Ethernet0|3-4` or `BUFFER_PG|Ethernet0,Ethernet4,Ethernet8|3-4`. Originally, there was logic to handle such indexes for the `BUFFER_PG` table only. Now it is reused and extended to handle all the tables.
4. [Mellanox] Plugin to calculate buffer pool size:
   - Originally, the buffers for queues, buffer profile lists, etc. were not reclaimed for admin-down ports, so they were reserved for all ports.
   - Now, they are reserved for admin-up ports only.

****Accelerate the progress of applying buffer tables to APPL_DB****

This is an optimization on top of reclaiming buffer.

1. Don't apply buffer profiles and buffer objects to `APPL_DB` before buffer pools are applied when the system is starting. This applies the items in order from referenced items to referencing items and tries to avoid buffer orchagent retrying because referenced table items are not yet present. However, it is still possible that referencing items are handled before referenced items. In that case, there should not be any error message.
2. [Mellanox] Plugin to calculate buffer pool size: return the buffer pool sizes currently in `APPL_DB` if the pool sizes cannot be calculated because some information is missing. This typically happens at system start. This is to accelerate the progress of pushing tables to `APPL_DB`.
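The multi-port index format described above (`BUFFER_PG|Ethernet0,Ethernet4,Ethernet8|3-4`) can be illustrated with a minimal Python sketch. This is not the buffer manager's actual code, which lives in buffermgrd; the helper names `expand_ids`, `split_buffer_key`, and `per_port_keys` are hypothetical and exist only for illustration. The sketch assumes `|` as the key separator, as in the examples above, and treats keys without an ID field (as for the port profile list tables) as port-only keys.

```python
from typing import List, Optional, Tuple


def expand_ids(id_range: str) -> List[int]:
    """Expand an ID range such as '3-4' into [3, 4]; '6' yields [6]."""
    first, _, last = id_range.partition("-")
    return list(range(int(first), int(last or first) + 1))


def split_buffer_key(key: str) -> Tuple[str, List[str], Optional[str]]:
    """Split 'TABLE|port[,port...][|ids]' into (table, ports, ids)."""
    parts = key.split("|")
    table, ports = parts[0], parts[1].split(",")
    ids = parts[2] if len(parts) > 2 else None  # profile-list keys carry no IDs
    return table, ports, ids


def per_port_keys(key: str) -> List[str]:
    """Rewrite a key that lists several ports as one key per port."""
    table, ports, ids = split_buffer_key(key)
    suffix = f"|{ids}" if ids is not None else ""
    return [f"{table}|{port}{suffix}" for port in ports]


if __name__ == "__main__":
    for k in per_port_keys("BUFFER_PG|Ethernet0,Ethernet4,Ethernet8|3-4"):
        print(k)
```

Splitting a multi-port key into per-port keys is one way to let the same admin-status handling be applied to each port independently, whichever table the key belongs to.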
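The admin-status handling described above can be sketched as a small state tracker. This is a simplified model written for illustration, not the buffermgrd implementation: the class name `ZeroProfileTracker` and the action strings are hypothetical, and real database writes are replaced by entries appended to a list. It models three points from the description: zero pools and profiles are applied to `APPL_DB` when the first port goes admin down, they are unloaded once all ports are admin up, and the pool watermark counter is removed before the pool itself.

```python
class ZeroProfileTracker:
    """Tracks admin-down ports and records the resulting buffer actions in order."""

    def __init__(self):
        self.admin_down = set()  # ports currently admin down
        self.actions = []        # stand-in for APPL_DB operations

    def set_admin_status(self, port, status):
        if status == "down" and port not in self.admin_down:
            if not self.admin_down:
                # First admin-down port: zero pools/profiles are now needed.
                self.actions.append("load zero pools and profiles into APPL_DB")
            self.admin_down.add(port)
            self.actions.append(f"{port}: remove normal profiles")
            self.actions.append(f"{port}: apply zero profiles")
        elif status == "up" and port in self.admin_down:
            self.actions.append(f"{port}: remove zero profiles")
            self.actions.append(f"{port}: apply normal profiles")
            self.admin_down.remove(port)
            if not self.admin_down:
                # No port references the zero pools any more; the counter is
                # removed ahead of the pool itself.
                self.actions.append("remove zero pool watermark counters")
                self.actions.append("unload zero pools and profiles from APPL_DB")


if __name__ == "__main__":
    tracker = ZeroProfileTracker()
    tracker.set_admin_status("Ethernet0", "down")
    tracker.set_admin_status("Ethernet0", "up")
    print("\n".join(tracker.actions))
```

Repeated events for a port already in the requested state are ignored in this sketch, so the zero profiles are loaded and unloaded at most once per transition between "no admin-down ports" and "at least one admin-down port".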
stephenxs
force-pushed
the
reclaim-buffer-202012
branch
from
December 3, 2021 13:56
1cfa3b2
to
b1f03a9
Compare
neethajohn
approved these changes
Dec 3, 2021
/azp run Azure.sonic-swss
Azure Pipelines successfully started running 1 pipeline(s).
We need to merge origin/202012 back into this branch to make all the CIs pass.
Merged upstream/202012 back to get the checkers to pass.
- Improve admin down test cases
- Restore cable length to 0m after test in order to prevent traditional buffer manager from creating lossless profiles

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Conflicts: tests/test_buffer_dynamic.py
Signed-off-by: stephens <stephens@contoso.com>
neethajohn
approved these changes
Dec 8, 2021
qiluo-msft
pushed a commit
to sonic-net/sonic-buildimage
that referenced
this pull request
Dec 21, 2021
#### Why I did it

Update sonic-swss-common

- 54879741 [202012][schema] Add vnet route tunnel and advertise network tables for state_db (sonic-net/sonic-swss-common#563)
- a5394f9d Update for BFD, default route table (sonic-net/sonic-swss-common#550)

Update sonic-swss

- fbbe5bcc [202012][pfc_detect] fix RedisReply errors (sonic-net/sonic-swss#2078)
- 5762b0c2 [Reclaim buffer][202012] Reclaim unused buffer for dynamic buffer model (sonic-net/sonic-swss#1985)
- 33e9bd19 [Document][202012] Supply the missing ingress/egress port profile list in document (sonic-net/sonic-swss#2066)
- 1b6ffba1 [Reclaiming buffer][202012] Support reclaiming buffer in traditional buffer model (sonic-net/sonic-swss#2063)
- afb33f16 [202012] Update default route status to state DB (sonic-net/sonic-swss#2009) (sonic-net/sonic-swss#2067)
- b9c44f75 Common code update for reclaiming buffer (backport community PR sonic-net/sonic-swss#1996 to 202106/202012) (sonic-net/sonic-swss#2061)
- cf5182d8 [request parser] Allow request parser to parse multiple values
EdenGri
pushed a commit
to EdenGri/sonic-swss
that referenced
this pull request
Feb 28, 2022
…net#1985) Added the Event Driven Tech Support related information to the Command Reference Guide. HLD
This is to backport community PR #1910 to 202012 branch.
Depends on #2061 which backports #1996 to 202012 branch.