Skip to content

Commit

Permalink
Improve handling of connection issues (related to #47)
Browse files Browse the repository at this point in the history
There was a problem in the automatic reconnect loop: If the connection to the IRC server is dropped it will not be reestablished because the call to `client.Server()` blocks forever. This is fixed by not calling it at this point. But there are other issues we have to tackle.

The first one is our locking. We still have a big issue here if the following happens:

- if the connection goes down, `client.Connect()` returns an error
- we acquire the lock: `clientLock.Lock()`
- we sleep some time
- we call `client.Connect()` again and it fails again because our Internet is still broken
- As our handler (CONNECTED) was not called, the lock is still in use
- we hang here forever

If we want to rewrite the lock logic, we have to think about two scenarios:

1) How do we handle the time between irc-connection-is-down and the-library-knows-that-the-connection-is-down?
The library sends continuously a PING (every 20 seconds is the smallest time interval we can use). If now() - time_last(PONG) is > 60 seconds, the connection goes in state disconnected. Then `Connect()` returns and we reconnect.

2) What happens if the connection to the IRC server is down, but our http server works and is used? We can

- send a 500 because the IRC connection is down (if we know it; good for alertmanager, probably bad for other things that don't automatically resend)
- handle the received data in a go routine and always return a 200 (never blocks, but we have to buffer the messages until IRC is up again)
- we keep the logic as it is and increase the channel size. Right now, http connections are held open if we cannot write into the channel (because the IRC sender does not work and stops reading from the channel)
  • Loading branch information
kmille committed Mar 11, 2021
1 parent 5b2c4a2 commit f0faa6e
Showing 1 changed file with 10 additions and 6 deletions.
16 changes: 10 additions & 6 deletions irc.go
Original file line number Diff line number Diff line change
Expand Up @@ -120,12 +120,16 @@ func ircConnection(config *viper.Viper, channelList []string) {

log.Info("Connecting to IRC server")
for {
if err := client.Connect(); err != nil {
clientLock.Lock()
log.Warnf("Connection to %s terminated: %s", client.Server(), err)
log.Warn("Reconnecting in 30 seconds...")
time.Sleep(30 * time.Second)
}
// client.Connect() blocks. It returns nil if we call a client.Close() which we never do.
// If the the connection is dropped/broken (recognized if we don't get a PONG 60 seconds
// after we sent a PING) an error is returned.
err := client.Connect()
// FIXME: if client.Connect() fails twice without going in state connected our handler is
// not called. This means that the lock is still acquired and we will hang here forever.
clientLock.Lock()
log.Warnf("Connection terminated: %s", err)
log.Warn("Reconnecting in 30 seconds...")
time.Sleep(30 * time.Second)
}

}
Expand Down

0 comments on commit f0faa6e

Please sign in to comment.