swss docker warm restart support #1

Closed
wants to merge 112 commits into from

@jipanyang (Owner) commented Jun 6, 2018

What I did
- swss state restore
- port state sync-up
- FDB sync-up

Why I did it

How I verified it
Added VS test cases for swss restore, port state sync-up, FDB sync-up, and CRM check.

jipan@sonic-build:~/igbpatch/vs/sonic-buildimage/src/sonic-swss/tests$ sudo pytest -v --dvsname=vs --notempview 
[sudo] password for jipan: 
====================================================================== test session starts =======================================================================
platform linux2 -- Python 2.7.12, pytest-3.3.0, py-1.5.2, pluggy-0.6.0 -- /usr/bin/python
cachedir: .cache
rootdir: /home/jipan/igbpatch/vs/sonic-buildimage/src/sonic-swss/tests, inifile:
collected 45 items                                                                                                                                               

test_acl.py::TestAcl::test_AclTableCreation PASSED                                                                                                         [  2%]
test_acl.py::TestAcl::test_AclRuleL4SrcPort PASSED                                                                                                         [  4%]
test_acl.py::TestAcl::test_AclTableDeletion PASSED                                                                                                         [  6%]
test_acl.py::TestAcl::test_V6AclTableCreation PASSED                                                                                                       [  8%]
test_acl.py::TestAcl::test_V6AclRuleIPv6Any PASSED                                                                                                         [ 11%]
test_acl.py::TestAcl::test_V6AclRuleIPv6AnyDrop PASSED                                                                                                     [ 13%]
test_acl.py::TestAcl::test_V6AclRuleIpProtocol PASSED                                                                                                      [ 15%]
test_acl.py::TestAcl::test_V6AclRuleSrcIPv6 PASSED                                                                                                         [ 17%]
test_acl.py::TestAcl::test_V6AclRuleDstIPv6 PASSED                                                                                                         [ 20%]
test_acl.py::TestAcl::test_V6AclRuleL4SrcPort PASSED                                                                                                       [ 22%]
test_acl.py::TestAcl::test_V6AclRuleL4DstPort PASSED                                                                                                       [ 24%]
test_acl.py::TestAcl::test_V6AclRuleTCPFlags PASSED                                                                                                        [ 26%]
test_acl.py::TestAcl::test_V6AclRuleL4SrcPortRange PASSED                                                                                                  [ 28%]
test_acl.py::TestAcl::test_V6AclRuleL4DstPortRange PASSED                                                                                                  [ 31%]
test_acl.py::TestAcl::test_V6AclTableDeletion PASSED                                                                                                       [ 33%]
test_acl.py::TestAcl::test_InsertAclRuleBetweenPriorities PASSED                                                                                           [ 35%]
test_acl.py::TestAcl::test_AclTableCreationOnLAGMember PASSED                                                                                              [ 37%]
test_acl.py::TestAcl::test_AclTableCreationOnLAG PASSED                                                                                                    [ 40%]
test_acl.py::TestAcl::test_AclTableCreationBeforeLAG PASSED                                                                                                [ 42%]
test_crm.py::test_CrmFdbEntry PASSED                                                                                                                       [ 44%]
test_crm.py::test_CrmIpv4Route PASSED                                                                                                                      [ 46%]
test_crm.py::test_CrmIpv6Route PASSED                                                                                                                      [ 48%]
test_crm.py::test_CrmIpv4Nexthop PASSED                                                                                                                    [ 51%]
test_crm.py::test_CrmIpv6Nexthop PASSED                                                                                                                    [ 53%]
test_crm.py::test_CrmIpv4Neighbor PASSED                                                                                                                   [ 55%]
test_crm.py::test_CrmIpv6Neighbor PASSED                                                                                                                   [ 57%]
test_crm.py::test_CrmNexthopGroup PASSED                                                                                                                   [ 60%]
test_crm.py::test_CrmNexthopGroupMember PASSED                                                                                                             [ 62%]
test_crm.py::test_CrmAcl PASSED                                                                                                                            [ 64%]
test_dirbcast.py::test_DirectedBroadcast PASSED                                                                                                            [ 66%]
test_fdb.py::test_FDBAddedAfterMemberCreated PASSED                                                                                                        [ 68%]
test_interface.py::test_InterfaceIpChange PASSED                                                                                                           [ 71%]
test_nhg.py::test_route_nhg PASSED                                                                                                                         [ 73%]
test_port.py::test_PortNotification PASSED                                                                                                                 [ 75%]
test_portchannel.py::test_PortChannel PASSED                                                                                                               [ 77%]
test_route.py::test_RouteAdd PASSED                                                                                                                        [ 80%]
test_setro.py::test_SetReadOnlyAttribute PASSED                                                                                                            [ 82%]
test_speed.py::TestSpeedSet::test_SpeedAndBufferSet PASSED                                                                                                 [ 84%]
test_vlan.py::test_VlanMemberCreation PASSED                                                                                                               [ 86%]
test_vrf.py::test_VRFOrch_Comprehensive PASSED                                                                                                             [ 88%]
test_vrf.py::test_VRFOrch PASSED                                                                                                                           [ 91%]
test_vrf.py::test_VRFOrch_Update PASSED                                                                                                                    [ 93%]
test_warm_reboot.py::test_swss_warm_restore PASSED                                                                                                         [ 95%]
test_warm_reboot.py::test_swss_port_state_syncup PASSED                                                                                                    [ 97%]
test_warm_reboot.py::test_swss_fdb_syncup_and_crm PASSED                                                                                                   [100%]

================================================================== 45 passed in 630.11 seconds ===================================================================
jipan@sonic-build:~/igbpatch/vs/sonic-buildimage/src/sonic-swss/tests$ 

Details if related

simone-dell and others added 2 commits June 8, 2018 15:09
* Add files via upload

* Test case: ACL rule with diff subnet masks

Added a test case with ACL rules with different mask /8 /19 ,etc.  Verified the proper subnet masks were reflected in the ASICDB.

* changing addition of acl rules to a loop to avoid redundancy

* added a helper function to improve code efficiency

* Update test_acl.py
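
The commit above checks that rules with different prefix lengths show up in the ASIC DB with the right subnet masks. A minimal sketch of how the expected mask for each prefix could be derived is shown below. The helper name `expected_ip_field` and the `address&mask:netmask` string layout are assumptions made for illustration and are not taken from test_acl.py; the sketch also uses the Python 3 `ipaddress` module, whereas the log above was produced with Python 2.7.

```python
import ipaddress


def expected_ip_field(prefix):
    """Build an 'address&mask:netmask' string for a CIDR prefix.

    Hypothetical helper: the string layout is an assumed ASIC DB
    representation, used here only to illustrate the mask arithmetic.
    """
    # strict=False tolerates host bits being set in the prefix.
    net = ipaddress.ip_network(prefix, strict=False)
    return "{}&mask:{}".format(net.network_address, net.netmask)


# Looping over the prefixes avoids one hand-written rule per mask length.
for prefix in ["10.0.0.0/8", "172.16.32.0/19", "192.168.1.0/24"]:
    print(prefix, "->", expected_ip_field(prefix))
```

A test could compare the string built this way against whatever the ASIC DB entry actually holds for the rule.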
…uration was applied (sonic-net#515)

* Don't up ports, until buffer configuration is applied

* Set MTU first, then set port state to UP

* Introduce the test

* Use logical operator && for boolean values
@jipanyang (Owner, Author) commented:

First swss warm restore test case:
sudo pytest -v --dvsname=vs --notempview test_warm_reboot.py
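
One way to think about what a warm-restore check needs to establish is that the state seen after the swss restart matches the state captured before it. The sketch below illustrates that comparison on plain dictionaries. The function `diff_snapshots` and the example keys are hypothetical; the actual assertions in test_warm_reboot.py are not shown in this conversation and may differ.

```python
def diff_snapshots(before, after):
    """Return (added, removed, changed) keys between two table snapshots.

    Each snapshot is a dict mapping a table key to a dict of field values.
    Hypothetical helper for illustration only.
    """
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(
        key for key in set(before) & set(after) if before[key] != after[key]
    )
    return added, removed, changed


# Illustrative snapshots; the key and field names are assumptions.
before = {
    "PORT:Ethernet0": {"admin_status": "up", "mtu": "9100"},
    "PORT:Ethernet4": {"admin_status": "up", "mtu": "9100"},
}
after = {
    "PORT:Ethernet0": {"admin_status": "up", "mtu": "9100"},
    "PORT:Ethernet4": {"admin_status": "down", "mtu": "9100"},
}

added, removed, changed = diff_snapshots(before, after)
print("added:", added, "removed:", removed, "changed:", changed)
```

A clean warm restore would leave all three lists empty; any entry in them points at state that was lost, leaked, or rewritten across the restart.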

jipanyang and others added 12 commits June 13, 2018 02:05
…t0 (sonic-net#522)

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
* Allocate buffer for 256 entries for ACL resource query

* Changed to MACRO
Signed-off-by: Sihui Han <sihan@microsoft.com>
…tialization (sonic-net#528)

* Postpone QueueMap initialization until activation of counters

* Initialize CounterCheckOrch after we initialized the QueueMaps
* Generate queue maps only for front panel ports
…sonic-net#530)

* Postpone QueueMap initialization until activation of counters

* Check that port ready list has the information before referencing the information
* Postpone QueueMap initialization until activation of counters

* Generate queue maps only for front panel ports

* Create empty buffer lists by default
@jipanyang jipanyang force-pushed the idempotent branch 2 times, most recently from 163bc1a to 0a96942 Compare July 11, 2018 03:28
keboliu and others added 9 commits July 11, 2018 11:43
*  [aclorch] fix acl bind point type issue

* add support for vlan bind point type support

* Revert "add support for vlan bind point type support"

This reverts commit e026cc1.
Add the note when encountering the client/server version mismatch issue.
…net#534)

* Add support for AN and FEC to be specified in port_config.ini

* do not set autoneg if it is already set

* set port adv speed for AN and add test

Signed-off-by: Guohan Lu <gulv@microsoft.com>
…onic-net#537)

This fixes the error:
Failed due to exception: basic_string::_S_construct null not valid

when the environment variable 'platform' is not set.

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
…#536)

Add the unit test test_mirror.py which currently covers:
- ACL mirror table creation
- mirror session activation
- ACL rule DSCP with/without mask
- ACL mirror table removal

Signed-off-by: Shu0T1an ChenG <shuche@microsoft.com>
* initial barefoot support october 2017

* import changes from telemetry branch

* Fix merge issues w.r.t rel_6_0 branch

* add validate and get port speed APIs - cleanup deletes from earlier checkin

* missed changes from earlier

* missed integration delete

* trying to get close to master version - and not break other vendor

* set the port config in the ASIC_DB

* update hostif oper status besides the DB update

* Fix compilation issues

* merge closer to master - including spaces and comments

* fix typos
enable mirror

* cosmetic fix

* Fix errors due to saiacl.h changes

* Fix more errors due to sai header file changes

* fix the order of nexthop/neighbor delete

* Fix compilation issues and few issues seen in testing

* Changes needed to add new Dtel api support to sairedis

* Report session DTel table related fixes

* Add more error handling code

* Fix logic to read table name and keys from m_toSync map

* Update value for dtel actions

* Handle boolean attributes to accept true/false as well as presence/absence

* Fix merge errors

* Incorporate configdb related changes for dtel

* fix code merge issue

* Incorporate configdb related changes for dtel

* add pfc_detect lua script

* Changes to incorporate new DTel SAI changes

* SONiC changes due to DTel experimental SAI changes

* cleanup configure.ac to allow barefoot platform includes

* fix typo

* revert commented tunneldecap

* test selective tunneldecap

* revert too ambitious an attempt and just push_back into vector!

* change fec mode to string (from integer)

* cleanup based on review

* closer to 201712 - cosmetic cleanup

* address review comments

* fix format

* Support for platforms based on Barefoot Networks' device (sonic-net#452)

* Fix issue with "config save" followed by reboot

* Temp remove crmorch

Currently orchagent crashes if crm is running. Needs to be fixed

* Initial code changes to address community design review comments

* Temp remove crmorch

Currently orchagent crashes if crm is running. Needs to be fixed
P.S. This is the actual commit that removes crm. The previous commit corrected a typo and has the wrong commit message.

* More changes to address review comments

* Fix build errors

* Fix for orchagent crash

* Fix queue report deletion error

* Bug fixes

* Re-enable crm orchagent (sonic-net#3)

* Add support for new watchlist attribute to enable/disable tail drop reporting

* Add new VS test for Dataplane Telemetry feature

* Add support for AN and FEC to be specified in port_config.ini

* do not set autoneg if it is already set

* Address review comments

* Fix compilation errors

* set port adv speed for AN and add test

Signed-off-by: Guohan Lu <gulv@microsoft.com>

* Convert all the tabs to spaces

* Bring in changes for port an/fec from azure master

* support autoneg and fec in port config ini file
* if autoneg is specified along with speed, speed is set separately as a port attribute

* Address review comments

* Remove trailing whitespaces

* Fix vs test indentation

* Address review comments

* Fix for VS test

* Fix VS test failure

* Fix for VS test

* More VS test fixes

* Fix for test_mirror VS test that was failing due to new DTel ACL tables
…utes.

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
lguohan and others added 6 commits August 3, 2018 16:06
add acl table created by dtel to acl default tables
* [aclorch]: only bind to port for ACL_TABLE_PFCWD type

Signed-off-by: Sihui Han <sihan@microsoft.com>

* make the code easier to read

Signed-off-by: Sihui Han <sihan@microsoft.com>
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
…ent operations

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
jipanyang and others added 9 commits August 6, 2018 15:52
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
* Simplify PortsOrch::doPortTask
* Revert "Simplify PortsOrch::doPortTask", but keep unit test
* Fix handling m_autoneg, speed, mtu, admin_status, fec_mode before PortConfigDone
* Refine unit test with swsscommon.ProducerStateTable
…sonic-net#553)

* Add basic schema for warm start schema in configDB and application DB.

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Move warm start table for process stats to state DB

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Add reconciliation timer entry in configDB warm restart table.

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Update warm restart timer schema

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* There might be more than one timer at system level or individual docker

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
* Fix addExistingData consumer conversion
* Add more addExistingData()
* Warm reboot for PortsOrch
* Remove calling doPortConfigDoneTask in ctor
* Remove unused function signature
…ation

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
…rocesses (sonic-net#547)

* Add common warm start functions to be used by all SWSS processes

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Use updated state schema

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Adapt to the new warm reboot schema

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Use the new Table::getEntry() and Table::setEntry to replace redisClient operations

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* use the new Table:hget() and Table:hset() APIs

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* Add illustration about warm start knob usage

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
jipanyang and others added 12 commits August 9, 2018 16:08
…mpToSyncTasks()

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
Signed-off-by: stepanb <stepanb@mellanox.com>
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
…or now.

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>
* Warm reboot for BufferOrch
* Add log for warm reboot
* Add bake() interface
* OrchDaemon supports warm start
* swss: flush g_asicState after each event is done
* add flush() after event is handled in case some entries are still in buffer, don't wait
* with the changes in sairedis and swss-common, route performance improved by 200~300 routes/sec
* swss-common: remove unnecessary flush() in timeout case and update comment
* remove unnecessary flush() in timeout case and update comment
@jipanyang jipanyang closed this Aug 11, 2018