[201803] Orchagent is crashed upon remove IP address from Port channel and shutting it in t0 topology #2301

Closed
chitra-raghavan opened this issue Nov 26, 2018 · 2 comments

Comments

@chitra-raghavan
Contributor

Description

Removing the IP address from a port channel interface and then shutting the interface down in the t0 topology crashes orchagent.
Script: https://github.com/Azure/sonic-mgmt/blob/master/ansible/roles/test/tasks/vlan_configure.yml
Topology: T0

6400 BGP routes are learned via the Po interfaces:

root@sonic-z9100-02:~# show ip bgp sum
BGP router identifier 10.1.0.32, local AS number 65100
RIB entries 12806, using 1401 KiB of memory
Peers 8, using 36 KiB of memory
Peer groups 2, using 112 bytes of memory

Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.57       4 64600    3213    3215        0    0    0 00:09:37     6400
10.0.0.59       4 64600    3212    3216        0    0    0 00:09:37     6400
10.0.0.61       4 64600    3212    3217        0    0    0 00:09:37     6400
10.0.0.63       4 64600    3213      15        0    0    0 00:09:38     6400

Total number of neighbors 4

Steps to reproduce the issue:

  1. Flush the IP addresses on the port channels
  2. Shut down the port channel interfaces
	root@sonic-z9100-02:/var/core# ip addr flush PortChannel0001
	root@sonic-z9100-02:/var/core# ip addr flush PortChannel0002
	root@sonic-z9100-02:/var/core# ip addr flush PortChannel0003
	root@sonic-z9100-02:/var/core# ip addr flush PortChannel0004

	root@sonic-z9100-02:/var/core# ifconfig PortChannel0001 down
	root@sonic-z9100-02:/var/core# ifconfig PortChannel0002 down
	root@sonic-z9100-02:/var/core# ifconfig PortChannel0003 down
	root@sonic-z9100-02:/var/core# ifconfig PortChannel0004 down
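
For repeated runs, the steps above could be scripted. This is a minimal sketch, assuming the four port channel names from this report; `build_repro_commands` is a hypothetical helper, not part of SONiC. By default it only prints the commands. Pass `--execute` (as root on the device) to actually run them.

```python
import subprocess
import sys

# Port channel names taken from the reproduction steps in this report.
PORTCHANNELS = [
    "PortChannel0001",
    "PortChannel0002",
    "PortChannel0003",
    "PortChannel0004",
]


def build_repro_commands(portchannels):
    """Return all flush commands followed by all shutdown commands, in order."""
    flush = [["ip", "addr", "flush", pc] for pc in portchannels]
    down = [["ifconfig", pc, "down"] for pc in portchannels]
    return flush + down


if __name__ == "__main__":
    # Dry run unless --execute is given, so the script is safe to try anywhere.
    execute = "--execute" in sys.argv
    for cmd in build_repro_commands(PORTCHANNELS):
        print(" ".join(cmd))
        if execute:
            subprocess.run(cmd, check=True)
```

Keeping the flush commands ahead of the shutdown commands mirrors the order used in the report.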

Describe the results you received:

After the IP address is removed from a Po interface, the notice "removeRouterIntfs: Router interface is still referenced" is logged continuously.
After flushing the address with "ip addr flush PortChannel0001" and then shutting down the Po interface, orchagent crashed:

root@sonic-z9100-02:~# pgrep orch -a
root@sonic-z9100-02:~# 

syslog:

Nov 26 10:41:54.452369 sonic-z9100-02 ERR dhcrelay[86]: receive_packet failed on PortChannel0001: Network is down
Nov 26 10:41:54.453624 sonic-z9100-02 NOTICE orchagent: :- removeNeighbor: Removed next hop fc00::72 on PortChannel0001
Nov 26 10:41:54.454143 sonic-z9100-02 NOTICE orchagent: :- removeNeighbor: Removed neighbor 52:54:00:ed:3d:1d on PortChannel0001
Nov 26 10:41:54.455235 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Remove router interface for port PortChannel0001
Nov 26 10:41:54.455235 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:41:54.455235 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:41:54.455235 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:41:54.456492 sonic-z9100-02 NOTICE orchagent: :- removeLagMember: Remove member Ethernet64 from LAG PortChannel0001 lid:2000000000a94 lmid:1b000000000a9c
Nov 26 10:41:55.050881 sonic-z9100-02 ERR dhcrelay[86]: receive_packet failed on PortChannel0002: Network is down
Nov 26 10:41:55.051189 sonic-z9100-02 NOTICE orchagent: :- removeNeighbor: Removed next hop fc00::76 on PortChannel0002
Nov 26 10:41:55.051692 sonic-z9100-02 NOTICE orchagent: :- removeNeighbor: Removed neighbor 52:54:00:2d:06:d4 on PortChannel0002
Nov 26 10:41:55.052394 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Remove router interface for port PortChannel0002
Nov 26 10:41:55.052710 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:41:55.054065 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:41:55.054821 sonic-z9100-02 NOTICE orchagent: :- removeLagMember: Remove member Ethernet66 from LAG PortChannel0002 lid:2000000000a96 lmid:1b000000000a9d
Nov 26 10:41:55.055162 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:41:55.055428 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:41:55.862538 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:42:00.862592 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:42:00.945223 sonic-z9100-02 NOTICE orchagent: :- setPortPvid: Set pvid 1 to port: Ethernet66
Nov 26 10:42:00.946347 sonic-z9100-02 NOTICE orchagent: :- addLagMember: Add member Ethernet66 to LAG PortChannel0002 lid:2000000000a96 pid:100000000002f
Nov 26 10:42:00.946515 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:42:00.946643 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:42:00.946643 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced
Nov 26 10:42:00.950466 sonic-z9100-02 ERR syncd: _brcm_sai_lag_member_vlan_add:929 vlan port get failed with error Entry not found (0xfffffff9).
Nov 26 10:42:00.950466 sonic-z9100-02 ERR syncd: brcm_sai_create_lag_member:509 lag member vlan add failed with error -7.
Nov 26 10:42:00.950466 sonic-z9100-02 ERR syncd: :- processEvent: attr: SAI_LAG_MEMBER_ATTR_LAG_ID: oid:0x2000000000a96
Nov 26 10:42:00.950466 sonic-z9100-02 ERR syncd: :- processEvent: attr: SAI_LAG_MEMBER_ATTR_PORT_ID: oid:0x100000000002f
Nov 26 10:42:00.950821 sonic-z9100-02 ERR syncd: :- processEvent: failed to execute api: create, key: SAI_OBJECT_TYPE_LAG_MEMBER:oid:0x1b000000000ac4, status: SAI_STATUS_ITEM_NOT_FOUND
Nov 26 10:42:00.950821 sonic-z9100-02 ERR syncd: :- syncd_main: Runtime error: :- processEvent: failed to execute api: create, key: SAI_OBJECT_TYPE_LAG_MEMBER:oid:0x1b000000000ac4, status: SAI_STATUS_ITEM_NOT_FOUND
Nov 26 10:42:00.950984 sonic-z9100-02 NOTICE syncd: :- exit_and_notify: sending switch_shutdown_request notification to OA
Nov 26 10:42:00.950984 sonic-z9100-02 NOTICE syncd: :- exit_and_notify: notification send successfull
Nov 26 10:42:00.950984 sonic-z9100-02 WARNING syncd: :- exit_and_notify: sleep forever to keep data plane active
Nov 26 10:42:00.951368 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced

Nov 26 10:42:00.951368 sonic-z9100-02 NOTICE orchagent: :- handle_switch_shutdown_request: switch shutdown request
Nov 26 10:42:00.952232 sonic-z9100-02 INFO supervisord: orchagent terminate called after throwing an instance of 'std::invalid_argument'
Nov 26 10:42:00.952432 sonic-z9100-02 INFO supervisord: orchagent   what():  parse error - unexpected end of input
Nov 26 10:42:01.527071 sonic-z9100-02 INFO swss.sh[10995]: 2018-11-26 10:42:01,526 INFO exited: orchagent (terminated by SIGABRT (core dumped); not expected)
Nov 26 10:42:08.493549 sonic-z9100-02 INFO supervisord 2018-11-26 10:42:01,526 INFO exited: orchagent (terminated by SIGABRT (core dumped); not expected)
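
To make the sequence in the syslog excerpt above easier to follow, a small script can count the repeated notice and find the first syncd error. This is a sketch, assuming the line layout seen in the excerpt (timestamp, host, severity, process, message); `LINE_RE` and `summarize` are illustrative names, not part of SONiC.

```python
import re

# Assumed layout: "Nov 26 10:42:00.950466 host SEVERITY process[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:.]+)\s+(?P<host>\S+)\s+(?P<sev>[A-Z]+)\s+"
    r"(?P<proc>[^:\s\[]+)(?:\[\d+\])?:?\s+(?P<msg>.*)$"
)

# Three lines copied from the syslog excerpt in this report.
SAMPLE = [
    "Nov 26 10:42:00.946643 sonic-z9100-02 NOTICE orchagent: :- removeRouterIntfs: Router interface is still referenced",
    "Nov 26 10:42:00.950466 sonic-z9100-02 ERR syncd: _brcm_sai_lag_member_vlan_add:929 vlan port get failed with error Entry not found (0xfffffff9).",
    "Nov 26 10:42:00.950466 sonic-z9100-02 ERR syncd: brcm_sai_create_lag_member:509 lag member vlan add failed with error -7.",
]


def summarize(lines):
    """Count 'still referenced' notices and capture the first syncd ERR line."""
    still_referenced = 0
    first_syncd_error = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank or unparseable lines
        if m["proc"] == "orchagent" and "Router interface is still referenced" in m["msg"]:
            still_referenced += 1
        if first_syncd_error is None and m["proc"] == "syncd" and m["sev"] == "ERR":
            first_syncd_error = (m["ts"], m["msg"])
    return {"still_referenced": still_referenced, "first_syncd_error": first_syncd_error}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Run against the full log, this would show how many times the notice repeated before syncd reported the failed LAG member create.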

Additional information you deem important (e.g. issue happens only occasionally):

**Output of `show version`:**

root@sonic-z9100-02:~# show ver
SONiC Software Version: SONiC.HEAD.94-6a24eb4
Distribution: Debian 8.11
Kernel: 3.16.0-6-amd64
Build commit: 6a24eb4
Build date: Tue Nov 20 04:37:06 UTC 2018
Built by: johnar@jenkins-worker-4

Docker images:
REPOSITORY                 TAG                 IMAGE ID            SIZE
docker-syncd-brcm          HEAD.94-6a24eb4     de1a6e23fa55        373 MB
docker-syncd-brcm          latest              de1a6e23fa55        373 MB
docker-orchagent-brcm      HEAD.94-6a24eb4     050b96591213        294.7 MB
docker-orchagent-brcm      latest              050b96591213        294.7 MB
docker-lldp-sv2            HEAD.94-6a24eb4     a14a1bc5b8cf        307.1 MB
docker-lldp-sv2            latest              a14a1bc5b8cf        307.1 MB
docker-dhcp-relay          HEAD.94-6a24eb4     07d780569041        289.1 MB
docker-dhcp-relay          latest              07d780569041        289.1 MB
docker-database            HEAD.94-6a24eb4     d82c6a484b64        289.1 MB
docker-database            latest              d82c6a484b64        289.1 MB
docker-teamd               HEAD.94-6a24eb4     6ce0e30eea98        294.5 MB
docker-teamd               latest              6ce0e30eea98        294.5 MB
docker-snmp-sv2            HEAD.94-6a24eb4     f7c92a6598de        326.8 MB
docker-snmp-sv2            latest              f7c92a6598de        326.8 MB
docker-router-advertiser   HEAD.94-6a24eb4     9052ac210b08        286.6 MB
docker-router-advertiser   latest              9052ac210b08        286.6 MB
docker-platform-monitor    HEAD.94-6a24eb4     54570d4f5d4c        308.5 MB
docker-platform-monitor    latest              54570d4f5d4c        308.5 MB
docker-fpm-quagga          HEAD.94-6a24eb4     3f9c498702a0        300.9 MB
docker-fpm-quagga          latest              3f9c498702a0        300.9 MB

root@sonic-z9100-02:~#

Attach debug file sudo generate_dump:

orchagent.1543228920.47.core.gz

@stcheng
Contributor

stcheng commented Nov 26, 2018

In the 201803 branch, this operation is not fully supported.

@chitra-raghavan
Contributor Author

Thanks

theasianpianist added a commit to theasianpianist/sonic-buildimage that referenced this issue Sep 2, 2022
Include following commits:

414e239 update unit tests for swap allocator
a91a492 consider swap checking memory in installer
f0ce586 [route_check]: Ignore standalone tunnel routes (sonic-net#2325)
3af8ba4 Replace cmp in acl_loader with operator.eq (sonic-net#2328)
899ba12 Subinterface vrf bind issue fix (sonic-net#2211)
e45b47a [VRF]Adding CLI checks to ensure Vrf is valid in interface bind and static route commands (sonic-net#2333)
f82835e [doc]: Add MACsec CLI doc (sonic-net#2334)
666bdc0 [sonic-package-manager] Drop 'expires_in' (sonic-net#2002)
52ac8ac Handle non-front-panel ports in is_rj45_port (sonic-net#2327)
42ed6d5 [service_mgmt]: Fix fetch MULTI_INST_DEPENDENT bug in service_mgmt.sh.j2 (sonic-net#2319)
d1a2d72 correct an error by changing "show bgp summary" to "show bfd summary" (sonic-net#2324)
7d409a0 Update VRF unbind command (sonic-net#2331)
e14f679 Fix issue: port_type is referenced before initialized (sonic-net#2323)
7704f63 Fix issue: exception in is_rj45_port in multi ASIC env (sonic-net#2313)
6fc4f15 Delete .DS_Store (sonic-net#2244)
ece4049 Fix bug with checking VRF's routes in route_check.py  (sonic-net#2301)
20c6d18 [decode-syseeprom] Fix setting use_db based on support_eeprom_db (sonic-net#2270)
9282e6c Fix vrf UT failed issue (sonic-net#2309)
37eb2b3 add lacp_rate to portchannel (sonic-net#2036)

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
liat-grozovik pushed a commit that referenced this issue Sep 4, 2022
Update sonic-utilities submodule pointer to include the following:

Replace cmp in acl_loader with operator.eq (#2328)
Subinterface vrf bind issue fix (#2211)
[VRF]Adding CLI checks to ensure Vrf is valid in interface bind and static route commands (#2333)
[doc]: Add MACsec CLI doc (#2334)
[sonic-package-manager] Drop 'expires_in' (#2002)
Handle non-front-panel ports in is_rj45_port (#2327)
[service_mgmt]: Fix fetch MULTI_INST_DEPENDENT bug in service_mgmt.sh.j2 (#2319)
correct an error by changing "show bgp summary" to "show bfd summary" (#2324)
Update VRF unbind command (#2331)
Fix issue: port_type is referenced before initialized (#2323)
Fix issue: exception in is_rj45_port in multi ASIC env (#2313)
Delete .DS_Store (#2244)
Fix bug with checking VRF's routes in route_check.py (#2301)
[decode-syseeprom] Fix setting use_db based on support_eeprom_db (#2270)
Fix vrf UT failed issue (#2309)
add lacp_rate to portchannel (#2036)