-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[config_validation]: fixing crash in grpc client during shutdown #18532
Conversation
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for addressing the missing pieces!
Part of #15072 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution. Is it possible to add a test to prove the regression?
Also, can we add a release note? in |
it feels like a race candition and only in specific use case (hence noone saw that before). something like "config should contain xDS control plane and message to it should be sent while validation server is about to shutdown (so we have it in flight during shutdown phase)" so i'm not really sure that it is possible to write a test (because even right now it is not failing all the time) |
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am ok with the change as it is. But I would defer to @ggreenway for final approval and merge
I don't think the validation mode should be making any network connections at all. It seems like a bug to be using a GrpcMux. cc @htuch |
Yes, agree, this seems broken. @adisuissa could you take a look at this when you get a chance? I think this affects some of our intended use of config validation mode as well potentially. CC @stevenzzzz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Looking into the mux creation.
@tehnerd thanks for reporting this. |
exact config is binary protobuff. so kinda hard to share it as is. this is part which are referencing grpc related features:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under investigation; not mergable yet
I've tested the following config:
Added a breakpoint for |
Not really. We do not have in our infra the way to use yaml configs. However I can modify binary any way you need to to get stack traces or w/e else is needed with our configs. |
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Signed-off-by: Nikita V. Shirokov tehnerd@tehnerd.com
Commit Message:
this is basically copy paste of logic from main server.cc from #17403. we found that server_validation could as well crash during shutdown when it tries to drain GrpcMux queues. this diff prevents it to do so during shutdown phase
example of such crash:
Additional Description:
Risk Level: LOW
Testing: existing UTs
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]