Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump slackclient to 1.2.0 with support to reconnect #1184

Closed
wants to merge 2 commits into from

Conversation

deferato
Copy link
Contributor

@deferato deferato commented Apr 9, 2018

When using slackclient an error is happening that the client is reconnecting after a while.
SlackClient issue
On the slack client page they recommend a change to allow an automatic reconnect.

@gbin

@ghost ghost added the in progress label Apr 9, 2018
@deferato
Copy link
Contributor Author

deferato commented Apr 9, 2018

Also, the log is not showing the last version
DEBUG errbot.plugins.VersionChecker Tested current Errbot version and it is 5.1.2

CHANGES.rst Outdated
- Message: Allow partial message size
- Slack: include the option to set Proxy when connecting using SlackClient
- Slack: fix member joined channel event handler

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this was meant to be committed. If you could rebase this out so only the changes to slack.py and setup.py remain I'll be happy to merge. 😄

@zoni
Copy link
Member

zoni commented Apr 9, 2018

Although, thinking about this a little more... Is letting slackclient handle the reconnection really better than having control over it ourselves? I'm not so sure. In fact, I'd argue errbot handling it itself is better as it notifies plugins that the network went down. This is arguably better than it being handled transparently because otherwise it gives you no way of knowing messages might have been missed/gotten lost.

@deferato
Copy link
Contributor Author

deferato commented Apr 9, 2018

@zoni this changes were intentional actually, this file was not updated on the 5.2.0, so I thought that this was something I should have done. :), removed now.

@deferato
Copy link
Contributor Author

deferato commented Apr 9, 2018

@zoni about your comment related to the change, at the moment, the connection with Slack just fails, for no reason. so, Errbot will notice after a while that the connection fails and then start a new one.
With this option to auto_reconnect on SlackClient, this will be internal of the client, which will be faster, BUT if a message is sent to Slack, will fail anyway.
I am looking the Slackclient code to see how this works. SlackCLient Code

Other option is to set this as an option on the josn BOT_IDENTITY

@deferato
Copy link
Contributor Author

@zoni any update on this case? Thanks

@GenPage
Copy link
Contributor

GenPage commented Apr 20, 2018

I think the compromise here is exposing the setting in the BOT_IDENTITY to allow the user control ovver the functionality. The actual re-connection can be left to the slackclient lib

@deferato
Copy link
Contributor Author

@GenPage I agree, the problem is that at the moment, you cannot change any settings on the SlackClient lib, because the connection is started from the Errbot code.
That's why I am suggestion the use of the BOT_IDENTITY, to set this option.

@GenPage
Copy link
Contributor

GenPage commented Apr 23, 2018

@zonn @deferato I think I see the concern where zonn is coming from.

To help shed some light, the slackclient lib by default re-attempts 5 times with an exponential back off before bombing out as ErrBot would normally expect.

https://github.com/slackapi/python-slackclient/blob/8a6e36e55e56a3e41154b824a883fd16a272dc8f/slackclient/server.py#L107-L129

The question becomes, how much of an impact is there on the bot and plugins while we are waiting for the slackclient itself to recover first before raising and allowing the bot to deactivate all the plugins through the disconnect_callback?

My thoughts here would be that I woul personally like to have full control over the number of retries and the exponential back off amount before subjecting users to this, or we could just expose it and have it off by default to start.

@deferato
Copy link
Contributor Author

@GenPage I agree that is something a bit complex, and may have other impacts on the Bot.
But I also think that creating the options (on the BOT_IDENTITY token) to do this, but for sure keeping the defaults as it's today is possible and the best.

@zoni
Copy link
Member

zoni commented Apr 28, 2018

So I looked at this some more, and based on my current understanding of the issue I remain of the opinion that this is not something we should be doing/exposing.

Please correct me if any of the below is incorrect:

  1. Sometimes the websocket connection to Slack's Real-Time Messages (RTM) API gets disconnected. Most likely this would be due to networking issues between your bot and the Slack service, but it could be for any reason.
  2. Reads from the websocket are handled by websocket_safe_read(). If a read fails, WebSocketConnectionClosedException is raised. This exception is caught by errbot and triggers errbot's disconnect/reconnect logic which (a) notifies plugins of the fact that there was a disconnection and (b) reconnects using exponential backoff.
  3. The autoreconnect parameter added in slackclient 1.2.0 merely inserts its own reconnect logic which is opaque to errbot. Is simply reconnects the socket using much the same type of logic errbot is doing at precisely the same moment errbot would start to reconnect, namely when WebSocketConnectionClosedException is caught. I don't see any code that would make it reconnect faster or more reliably than what errbot is already doing.
  4. Because of this, effectively nothing changes about the situation except for the fact that plugins do not get notified of the disconnection. This would run counter to how most other backends operate, which deactivate and re-activate plugins on a disconnection. I see this (reactivation) as a desirable trait, as otherwise there is no way to know some messages may have been lost because the bot was disconnected for some time.

I worry adding some kind of "let slackclient itself reconnect" flag just causes confusion to users, especially given the fact errbot already does all this and doesn't need the underlying client to do it for us. Maintaining that control keeps the slack backend more in line with most other backends.

@deferato perhaps you can explain how/where errbot's current behavior doesn't meet your expectations/needs? In an earlier comment you said "With this option to auto_reconnect on SlackClient, this will be internal of the client, which will be faster" but I don't see how that could be the case (see point 3 above).

@GenPage
Copy link
Contributor

GenPage commented Apr 29, 2018

Hey @zoni, Thanks for taking the time to take another look and provide a detailed explanation. I think point 3 and 4 you made finally made it click in my head.

I've reviewed the codebase in more detail and can see that in context of the Slack backend, serve_once is reattempted with an exponential backoff already based on the serve_forever code [1] in the Base backend.

@deferato I think zoni does have a point, I've compared both ErrBot reconnection in serve_forever [1] and in the SlackLib and having both reconnection logics active would only further delay the recovery the user experiences, depending on the situation that caused the connection disruption.

I think its best to keep the reconnection logic in the Base backend class since we have the luxury of informing the plugins of the disruption, no matter which backend you use.

[1] https://github.com/errbotio/errbot/blob/master/errbot/backends/base.py#L687

@deferato
Copy link
Contributor Author

deferato commented May 1, 2018

@zoni and @GenPage thanks a lot for the comments.
All the comments make sense.
I was having a weird situation where Errbot was closing connection with Slack continuously, but is connected via a proxy.
However, this was happening a lot when the log level was set to DEBUG.
When I've changed the log level to INFO and ERROR this is not happening any more.

Looking deeper the SlackClient code I've noticed that the code is basically a retry connection up to 5 times.
Initially I thought that was some specific option to Slack itself to better handle connections.

Anyway, I believe that we can close this PR without merge it.

@zoni
Copy link
Member

zoni commented May 7, 2018

Cool, in that case I will close the issue. Please go ahead and open a separate issue if there are clear issues when connecting via a proxy that we should look at.

@zoni zoni closed this May 7, 2018
@ghost ghost removed the in progress label May 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants