-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fpm problems #17962
base: master
Are you sure you want to change the base?
Fpm problems #17962
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I think this is ok. I made changes as part of the pthread-race fix PR that introduced this: I effectively made sonic's message semantics the default inside FRR, and that wasn't correct to do.
90478c8
to
945aea6
Compare
) | ||
router.load_config( | ||
TopoRouter.RD_SHARP, os.path.join(CWD, "{}/sharpd.conf".format(rname)) | ||
) | ||
router.load_config( | ||
TopoRouter.RD_FPM_LISTENER, | ||
os.path.join(CWD, "{}/fpm_stub.conf".format(rname)), | ||
"-r" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like there's a style warning about this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah let me fix
c5ee28a
to
1464774
Compare
the last commit was a serious enough of a problem that I wanted that as a bug fix in previous releases so it's in #17969 |
A recent code change 29122bc changed the passing of data up the fpm from passing the tableid and vrf to the sonic expected tableid contains the vrfid. This violates the assumptions in the code that the netlink message passes up the tableid as the tableid. Additionally this code change did not modify the rib_find_rn_from_ctx to actually properly decode what could be passed up. Let's just fix this and let Sonic carry the patch as appropriate for themselves since they are not the only users of dplane_fpm_nl.c Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In fpm_listener, when a error is detected it would stop listening and not recover. Modify the code to close the socket and allow the connection to recover. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The fpm_testing_topo1 didn't turn on the fpm_listener sending the routes back to zebra to set the asic offload. Modify the test to tell the fpm_listener to set the offloaded flag and reflect the route back to the dplane_fpm_nl.c code. Also modify zebra to expect a response to the underlying fpm listener. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The mutex that wraps access to the output buffer is being held for the entire time the data is being generated to send down the pipe. Since the generation has absolutely nothing to do with the obuf, let's limit the mutex holding some. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When looping through the dplane providers, the worklist was being populated with items from the last provider and then the event system was checked to see if we should stop processing. If the event system says `yes` then the dplane code would stop and send the worklist to the master zebra pthread for collection. This obviously skipped the next dplane provider on the list which is double plus not good. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
1464774
to
64709ec
Compare
Found a couple issues when attempting to recreate a different problem.