-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: remove breaks in relay loop that could cause permanent disconnection from relay #869
Conversation
@@ -123,7 +120,6 @@ func (svc *service) startNostr(ctx context.Context) error { | |||
logger.Logger.WithError(err).Error("Got an error from the relay while listening to subscription.") | |||
continue | |||
} | |||
break | |||
} | |||
select { | |||
case <-ctx.Done(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can add a break
here (after the log) instead of making it loop through once again and then break at L62
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to keep it easy to understand by only having one exit point. The more places we can have breaks the more chance one of them doesn't work and incorrectly exits the loop.
@@ -123,7 +120,6 @@ func (svc *service) startNostr(ctx context.Context) error { | |||
logger.Logger.WithError(err).Error("Got an error from the relay while listening to subscription.") | |||
continue | |||
} | |||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this break is here because it will be only hit when err == nil
(which we assumed that it happens when ctx is done). And for that to happen, StartSubscription
should return nil err, which can only happen when subscription context ends and
- NOT because of relay connection error
- NOT because of main context being cancelled (because otherwise breaking should be valid as the app is anyways terminated)
Is it possible? How to reproduce this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it happens when sub.Relay.Write
returns an error (here) and we also have this:
Line 136 in a8101aa
logger.Logger.Error("Relay context cancelled, but no connection error...trying to reconnect") |
So it should be possible. But in any case the fix makes sense as it is always good to check instead of breaking out by assuming things 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I think it would be good to log and see it happens. If we can find a way to reproduce it, then we can submit an issue or fix in go-nostr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to reproduce this, and besides invalid filters that the relay rejects and closes the subscription but not the socket (which should not happen when using our properly configured relay) the only possible thing that might be happening is some sort of network socket read/write timeouts which causes the subscription to be closed but no connection error on the relay level. I did not reproduce such situation, so this is not 100% confirmed.
This would previously be treated as a program exit and so the break would cancel the loop.
+1 for removing them and making the reconnect loop more reliable.
@frnandu thanks. It definitely happens related to networking issue, more likely in self-hosted hubs. |
Fixes #888