-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix connection reaper deadlock. #150
Conversation
Added explicitly running test to detect deadlock. Use ```go test -tags=TestConnReaperDeadlockLoop -run ^TestConnReaperDeadlockLoop$```. Be aware the test generate a lot of sockets in TIME_WAIT. Without patch to socket.go it breaks pretty quick on fast and slow machines. With the patch it runs way longer but gets slower when there are plenty of sockets in TIME_WAIT.
Dont get whats wrong with TestXPubSub - I ran it only in loop for almost 24h - no single failure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apologies for the belated answer (coming back from holidays)
just a couple of nitpicks.
Co-authored-by: Sebastien Binet <binet@cern.ch>
I will try to find way to make test with single pass to hit the deadlock. Please give me couple of days |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #150 +/- ##
==========================================
+ Coverage 68.19% 68.22% +0.03%
==========================================
Files 30 30
Lines 2600 2603 +3
==========================================
+ Hits 1773 1776 +3
Misses 724 724
Partials 103 103 ☔ View full report in Codecov by Sentry. |
The test TestConnReaperDeadlock2 designed after TestConnReaperDeadlock but uses internals to ensure problematic sequnence, which in case of TestConnReaperDeadlock requires number of runs.
I think the test is now ready but it might looks more ugly even can repro problem each run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for sticking with us.
I have a couple of comments (mostly cosmetic, though)
Additional test simplification. Now it needs only two connections, where 1st fails asap and 2nd fails with delay - giving connection reaper time to meet deadlock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
thanks for sticking through the iterative refinement process :)
Added explicitly running test to detect deadlock.
Use
go test -tags=TestConnReaperDeadlockLoop -run ^TestConnReaperDeadlockLoop$
. Be aware the test generate a lot of sockets in TIME_WAIT. Without patch to socket.go it breaks pretty quick on fast and slow machines. With the patch it runs up to default test timeout (10m) but gets slower when there are plenty of sockets in TIME_WAIT.Fixes #149