-
Notifications
You must be signed in to change notification settings - Fork 522
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new BGP Monitor Daemon to pull BGP peer state and store in State DB for MIB consumption #1429
Conversation
…ker to assist storing some BGP DB info into State DB for others to consume
…ker to assist storing some BGP DB info into State DB for others to consume
This pull request introduces 5 alerts when merging f805107 into 65f63c1 - view on LGTM.com new alerts:
|
…ker to assist storing some BGP DB info into State DB for others to consume
retest lgtm please |
fpmsyncd/bgpmon.py
Outdated
if self.ipv4_n_state[neighb] != self.new_ipv4_n_state[neighb]: | ||
# state changed. Update state DB for this entry | ||
state = self.new_ipv4_n_state[neighb] | ||
self.db.set(self.db.STATE_DB, key, 'state', state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set [](start = 28, length = 3)
Suggest to use https://redis.io/commands/expire in case this process crashes or always fails.
If this is the right direction, "only update the entry if sate changed" may be not that important since we use pipeline and there is only one transaction. #WontFix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qiluo-msft Since this StateDB is required by SNMP proxy agent and potential future consumers, if this process always fails, it will not matter what the BGP StateDB entries gets properly deleted or not... those agents will not operate correctly until bgpmon daemon is healthy again. If my understanding is correct after reading through the "expire" documentation it will require bgpmon to periodically update the expire to ensure those BGP key states does not get expired/deleted when there is no changes needed. Since we want to favor the design for the steady state condition (no more peer state changes in steady state), we should not use "expire" or else bgpmon in steady state will end up having to periodically update the state DB even there are no state changes. In this desing, bgpmon when restart will force the state DB to clean up and provide the most updated state from the new snapshot it gathered from BGP/Zebra. So StateDB clean up is taken cared of when necessary to prevent stale entries from staying in the state DB. Also, using "expire" means someone (Redis) will need to keep track of the time for each key. This operation is not free no matter how efficient Redis implements it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's assume there is a bug inside this daemon and it is stuck somewhere. It is better to expire and propagate this error to monitors. In that situation, it's dangerous to keep as false stable.
In reply to: 484678284 [](ancestors = 484678284)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qiluo-msft Per our phone discussion I think we agreed that this is ok for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As comments
…host ip when connect stateDB, moved cleaning snapshot DS only when no exception, removed unused log and debug counter
…de. Specified python2
return True | ||
|
||
# Get a new snapshot of BGP neighbors and store them in the "new" location | ||
def get_all_neigh_states(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_all_neigh_states [](start = 8, length = 20)
Could you add some unit test or vs test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qiluo-msft Will add unit test in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer adding in the this PR if you can. It even benefit yourself in iterations.
… error code, use Redis pipeline for stateDB operations
This pull request introduces 1 alert when merging 85dce8e into f041c84 - view on LGTM.com new alerts:
|
fpmsyncd/bgpmon.py
Outdated
self.peer_state[peer] = state | ||
if pipe_count > PIPE_BATCH_MAX_COUNT: | ||
self.flush_pipe(data) | ||
data = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move data = {}
into flush_pipe()
function body #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qiluo-msft Agreed. unlike "c", python pass the object instead of copy the value so it make sense to clear the data{} inside this flush_pipe() method.
fpmsyncd/bgpmon.py
Outdated
data = {} | ||
pipe_count = 0 | ||
# If anything in the pipeline not yet flushed, flush them now | ||
if pipe_count > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if pipe_count > 0: | |
if len(data) > 0: | |
``` #Closed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qiluo-msft Agreed. There is no need to introduce the extra counter "pipe_count" since python dict implementation maintains an item count that can be accessed with O(1) time so it make sense to use the len(data) instead.
return True | ||
|
||
# Get a new snapshot of BGP neighbors and store them in the "new" location | ||
def get_all_neigh_states(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer adding in the this PR if you can. It even benefit yourself in iterations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except missing test
Based on offline comment from Guohan, the bgpmon code should not reside in this repo. Based on its functionality it should be moved to under sonic-buildimage/src/sonic-bgpcfgd . |
Warn user while deleting VLAN if it has IP addresses. Signed-off-by: d-dashkov <Dmytro_Dashkov@Jabil.com>
Added bgpmon (BGP Monitor Daemon) support to assist gathering BGP related information periodically based on BGP activity detection to store those information into State DB. For components such as SNMP agent that uses this new BGP related DB in state DB to support the MIB operation.
This PR has the implementation of the bgpmon processing code. There is another PR (sonic-net/sonic-buildimage#5329) needed in the dockers/docker-fpm-frr to start up this bgpmon inside of the BGP docker.
Both of these PRs are needed to achieve the full functionality of bgpmon running inside of BGP docker.
The HLD doc PR can be found here:
The following PR (SNMP Agent Changes to use State DB for BGP MIB )has a dependency on this PR:
Signed-off-by: Gen-Hwa Chiang gechiang@microsoft.com
- What I did
Added a new daemon to be run under BGP docker periodically to gather the BGP peer state information and update them in
the state DB. In order not to continuously running even when there are no BGP peer activities, this new daemon will use the BGP frr.log file timestamp as an indicator whether there are any activities in BGP that warrants it to inspect the peer state info for possible updates.
- How I did it
- How to verify it
For IPV4 You can perform "show ip bgp summary" and use that output to check against the state DB for each peer.
The state DB key would be something like "NEIGH_STATE_TABLE|10.0.0.33" where 10.0.0.33 is the peer ip address.
you can then read out the state information by issuing the redis cmd hgetall "NEIGH_STATE_TABLE|10.0.0.33". Compare that with the corresponding output of "show ip bgp summary".
Here is a sample output from the State DB:
admin@str-s6000-acs-8:~$ redis-cli -n 6
127.0.0.1:6379[6]> keys "NEIGH*"
127.0.0.1:6379[6]> hgetall "NEIGH_STATE_TABLE|10.0.0.1"
127.0.0.1:6379[6]>
you can do the same for ipv6 by performing "show ipv6 bgp summary" and follow the same steps above to validate with the corresponding peer states stored in state DB.