-
Notifications
You must be signed in to change notification settings - Fork 843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
websocket-client socket mode client doesn't handle reconnects properly #1379
Comments
Hi @leifwalsh, thank you very much for taking the time to report the issue! Your suggestion looks great to me 👍 May I ask you to send a pull request to merge the changes in your subclass? Otherwise, I can make a PR with the change when I have time for it. |
Thanks! If you could make the change (or figure out a better way to detect a broken connection), that would be great. Otherwise, I can send a PR but I have to go through an internal review process at my company to get permission to send you one, which could take a few days/weeks. |
@leifwalsh Thanks for your quick reply! if that's the case, I am happy to make the change for you! (also, I will explore alternatives when working on it) |
…reconnects properly
@leifwalsh Here is my PR to fix this issue: #1380 If you have any sugggestions, please feel free to leave comments on the PR! |
Looks great, thank you! |
(Filling out the following details about bugs will help us solve your issue sooner.)
Reproducible in:
slack-sdk==3.21.3
Python 3.11.3
Debian 10
The Slack SDK version
slack-sdk==3.21.3
Python runtime version
Python 3.11.3
OS info
Debian 10
Steps to reproduce:
(Share the commands to run, source code, and project settings (e.g., setup.py))
slack_sdk.socket_mode.websocket_client.SocketModeClient
Expected result:
The client should properly reconnect
Actual result:
The client repeatedly tries to reconnect without re-issuing a wss url, causing all connections to fail in a loop forever
The logs are roughly the same as described in slackapi/bolt-python#470. It repeatedly gets
WebSocketConnectionClosedException
. Specifically,websocket
logsConnection to remote host was lost. - goodbye
andslack_sdk.socket_mode.websocket_client
logson_error invoked (error: WebSocketConnectionClosedException, message: Connection to remote host was lost.)
I believe the problem is that
SocketModeClient.is_connected()
just checks whetherself.current_session is not None
(https://github.com/slackapi/python-slack-sdk/blob/v3.21.3/slack_sdk/socket_mode/websocket_client/__init__.py#L139). If you compare that with the builtin client, it also checks whetherconnection.is_active()
(https://github.com/slackapi/python-slack-sdk/blob/v3.21.3/slack_sdk/socket_mode/builtin/client.py#L152) which checks ifself.sock is not None
(https://github.com/slackapi/python-slack-sdk/blob/v3.21.3/slack_sdk/socket_mode/builtin/connection.py#L198). I think what happens is that in a certain type of disconnect scenario,BaseSocketModeClient.connect_to_new_endpoint()
doesn't callissue_new_wss_url()
when it should, because it thinks we're still connected (https://github.com/slackapi/python-slack-sdk/blob/v3.21.3/slack_sdk/socket_mode/client.py#L75).I'm not sure how reliable that is in theory, there might be race conditions still. However, empirically, it seems that
websocket-client
's connection class does a reasonable job of setting its own.sock
attribute toNone
when handling disconnects. I subclassedslack_sdk.socket_mode.websocket_client.SocketModeClient
and overrodeis_connected()
with a version that also checks ifself.current_session.sock
is falsey, and ran that in prod for a couple weeks, and it seems like when this happens it does correctly reconnect. Running the app in a HA configuration (at least two instances running) means that, as long as they both don't disconnect simultaneously, I think we retain at least one active connection consistently and don't drop messages.I think the fix is probably simple: just check that relying on checking
websocket._app.WebSocketApp.sock
is something you're comfortable doing, and then add that condition toslack_sdk.socket_mode.websocket_client.SocketModeClient.is_connected()
.The text was updated successfully, but these errors were encountered: