Skip to content

Commit

Permalink
[docker-fpm-frr]: Start bgpd after zebra was started (#5038)
Browse files Browse the repository at this point in the history
fixes #5026
Explanation:
In the log from the issue I found:
```
I see following in the log
Jul 22 21:13:06.574831 vlab-01 WARNING bgp#bgpd[49]: [EC 33554499] sendmsg_nexthop: zclient_send_message() failed
```
Analyzing source code I found that the error message could be issues only when `zclient_send_rnh()` return less than 0.
```
	ret = zclient_send_rnh(zclient, command, p, exact_match,
			       bnc->bgp->vrf_id);
	/* TBD: handle the failure */
	if (ret < 0)
		flog_warn(EC_BGP_ZEBRA_SEND,
			  "sendmsg_nexthop: zclient_send_message() failed");
```
I checked [zclient_send_rnh()](https://github.com/Azure/sonic-frr/blob/88351c8f6df5450e9098f773813738f62abb2f5e/lib/zclient.c#L654) and found that this function will return the exit code which the function gets from [zclient_send_message()](https://github.com/Azure/sonic-frr/blob/88351c8f6df5450e9098f773813738f62abb2f5e/lib/zclient.c#L266) But the latter function could return not 0 in two cases:
1.	bgpd didn’t connect to the zclient socket yet [code](https://github.com/Azure/sonic-frr/blob/88351c8f6df5450e9098f773813738f62abb2f5e/lib/zclient.c#L269)
2.	The socket was closed. But in this case we would receive the error message in the log. (And I can find the message in the log when we reboot sonic) [code](https://github.com/Azure/sonic-frr/blob/88351c8f6df5450e9098f773813738f62abb2f5e/lib/zclient.c#L277)

Also I see from the logs that client connection was set later we had the issue in bgpd.

Bgpd.log
```
Jul 22 21:13:06.574831 vlab-01 WARNING bgp#bgpd[49]: [EC 33554499] sendmsg_nexthop: zclient_send_message() failed
```
Vs
Zebra.log
```
Jul 22 21:13:12.713249 vlab-01 NOTICE bgp#zebra[48]: client 25 says hello and bids fair to announce only static routes vrf=0
Jul 22 21:13:12.820352 vlab-01 NOTICE bgp#zebra[48]: client 30 says hello and bids fair to announce only bgp routes vrf=0
Jul 22 21:13:12.820352 vlab-01 NOTICE bgp#zebra[48]: client 33 says hello and bids fair to announce only vnc routes vrf=0
```
So in our case we should start zebra first. Wait until it is started and then start bgpd and other daemons.

**- How I did it**

I changed a graph to start daemons in the following order:
1. First start zebra
2. Then starts staticd and bgpd
3. Then starts vtysh -b and bgpeoi after bgpd is started.
  • Loading branch information
pavel-shirshov authored Jul 25, 2020
1 parent 5c67a3c commit 8918403
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions dockers/docker-fpm-frr/frr/supervisord/supervisord.conf.j2
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ startsecs=0
stdout_logfile=syslog
stderr_logfile=syslog
dependent_startup=true
dependent_startup_wait_for=rsyslogd:running
dependent_startup_wait_for=zebra:running

[program:bgpd]
command=/usr/lib/frr/bgpd -A 127.0.0.1 -M snmp
Expand All @@ -59,7 +59,7 @@ startsecs=0
stdout_logfile=syslog
stderr_logfile=syslog
dependent_startup=true
dependent_startup_wait_for=rsyslogd:running
dependent_startup_wait_for=zebra:running

[program:fpmsyncd]
command=fpmsyncd
Expand Down Expand Up @@ -93,7 +93,7 @@ startsecs=0
stdout_logfile=syslog
stderr_logfile=syslog
dependent_startup=true
dependent_startup_wait_for=rsyslogd:running
dependent_startup_wait_for=bgpd:running
{% endif %}

{% if WARM_RESTART is defined and WARM_RESTART.bgp is defined and WARM_RESTART.bgp.bgp_eoiu is defined and WARM_RESTART.bgp.bgp_eoiu == "true" %}
Expand All @@ -107,5 +107,5 @@ startretries=0
stdout_logfile=syslog
stderr_logfile=syslog
dependent_startup=true
dependent_startup_wait_for=rsyslogd:running
dependent_startup_wait_for=bgpd:running
{% endif %}

0 comments on commit 8918403

Please sign in to comment.