
jsonrpc: reconnect automatically #1146

Closed
wants to merge 0 commits into from

Conversation

@jsign (Contributor) commented Jan 23, 2020

Closes #127

I started doing this mostly from the client's perspective, but then realized that wsConn is also used on the server side, so this feels a bit hacky. It unfolded this way since I'm only using it for the client.

Configuration for this is opt-in and may serve as a home for future config options too.
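
For context, a minimal sketch of what such an opt-in config might look like (field names here are illustrative, not necessarily the PR's):

```go
// ReconnectConfig controls automatic reconnection of the client's websocket
// connection. Field names are illustrative, not the PR's actual ones.
type ReconnectConfig struct {
	Enabled bool          // opt in; the zero value keeps the old behavior
	Backoff time.Duration // wait between reconnection attempts
	// Room for future knobs, e.g. a maximum attempt count.
}
```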

```diff
@@ -53,7 +53,7 @@ func (s *RPCServer) handleWS(ctx context.Context, w http.ResponseWriter, r *http
 		conn:    c,
 		handler: s.methods,
 		exiting: make(chan struct{}),
-	}).handleWsConn(ctx)
+	}).handleWsConn(ctx, ReconnectConfig{})
```
@jsign (Contributor, Author) commented on the diff:

Since I focused on client use, by the time I noticed this the whole change was already done.
Not sure I quite like it, tbh.

@jsign marked this pull request as ready for review January 23, 2020 20:33
@whyrusleeping (Member) commented:

What would happen if I send a request and the stream dies before I get my response?

@whyrusleeping (Member) commented:

Thinking through DX here, it seems like I want to be able to say "make this API call, and if the connection is interrupted, make it again until I get a response".

We don't want to automatically resend all requests, really only for idempotent (or mostly idempotent) calls.

@jsign (Contributor, Author) commented Jan 23, 2020

> What would happen if I send a request and the stream dies before I get my response?

In this PR, that means the NextReader() call on the ws connection will error, which:

  • Makes the calling function fail with an error
  • Makes any other API call that hasn't received its response yet fail as well
  • Triggers the auto-reconnect procedure
  • Blocks any new call started while all this happens until the new connection is alive, since requests and responses are serialized in a single goroutine (see the sketch below)
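
To illustrate the flow, a hypothetical sketch of the read/reconnect loop; names and structure are a stand-in, not the actual PR code:

```go
package jsonrpcsketch

import (
	"context"
	"time"

	"github.com/gorilla/websocket"
)

// wsConn is an illustrative stand-in for the client's connection wrapper.
type wsConn struct {
	conn *websocket.Conn
	url  string
}

// readLoop blocks on NextReader; when it errors (the connection died), it
// fails all in-flight requests and reconnects before resuming, which is why
// calls issued in the meantime block until the new connection is up.
func (c *wsConn) readLoop(ctx context.Context) {
	for {
		_, _, err := c.conn.NextReader()
		if err != nil {
			c.closeInFlight(err) // pending calls fail with this error
			if c.reconnect(ctx) != nil {
				return // context cancelled; give up
			}
			continue
		}
		// ... handle the frame: responses, channel messages, etc. ...
	}
}

func (c *wsConn) closeInFlight(err error) {
	// ... fail every request still waiting for a response ...
}

// reconnect redials with a fixed backoff until it succeeds or ctx is done.
func (c *wsConn) reconnect(ctx context.Context) error {
	for {
		conn, _, err := websocket.DefaultDialer.DialContext(ctx, c.url, nil)
		if err == nil {
			c.conn = conn
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(time.Second):
		}
	}
}
```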

@jsign (Contributor, Author) commented Jan 23, 2020

> Thinking through DX here, it seems like I want to be able to say "make this API call, and if the connection is interrupted, make it again until I get a response".
>
> We don't want to automatically resend all requests, really only for idempotent (or mostly idempotent) calls.

For a moment I thought about doing the retry at the API level, but felt that decision is better controlled from the caller's side.

Since you're thinking about some built-in way to do that: what are you imagining?

  • Opt 1: Configure the retry at the client (as is done now) and have all calls made through that instance insist until success; the caller should ensure idempotency (or accept whatever consequence). See the sketch after this list.
  • Opt 2: Make that config per call? (I don't see how to do that cleanly without some reflection magic, nor whether constraints like a maximum number of retries are needed.)
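
To make Opt 1 concrete, a rough fragment (NewClientWithReconnect and the config fields are invented for illustration, not an actual go-jsonrpc API):

```go
// Hypothetical Opt 1 wiring: the retry policy is set once on the client and
// applies to every call made through that instance.
cfg := ReconnectConfig{
	Enabled: true,        // reconnect automatically
	Backoff: time.Second, // wait between attempts
}
client := NewClientWithReconnect(ctx, "ws://127.0.0.1:1234/rpc/v0", cfg)

// Every call made through this instance now insists until it succeeds; the
// caller is responsible for only doing this with idempotent methods.
head, err := client.ChainHead(ctx)
```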

@jsign (Contributor, Author) commented Jan 23, 2020

Considering what @whyrusleeping mentioned about the case of existing channels: taking a closer look, that case is only distinguished after the incoming message is read (in .handleFrame()).

Before this PR, the closeInFlight logic was called when the connection died (that's the code I extracted into the new func). But nothing is done about existing open channels, so it seems they always blocked forever if the connection failed.

I did a quick test of this with ChainNotify: I let it read the current head, then killed the connection, and it blocks forever without the channel being closed.

It seems the missing piece is calling some closeOpenChans() after closeInFlight() to close all current channels; see the sketch below. That would change the semantics callers can currently rely on, since today channels only close on an intended close; callers have never had to work out who closed them.

Open to hearing whether all this makes sense to the experts.
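
A rough sketch of that missing piece, reusing the illustrative wsConn stand-in from the earlier sketch (openChans is an invented field for however the client tracks server-pushed channels):

```go
type wsConn struct {
	openChans map[uint64]chan []byte // invented: server-pushed channels by ID
}

// closeOpenChans closes every channel still open when the connection died,
// so consumers unblock instead of hanging forever. Note the semantic shift:
// a closed channel no longer implies a clean, intended close.
func (c *wsConn) closeOpenChans() {
	for id, ch := range c.openChans {
		close(ch) // a reader's `v, ok := <-ch` now returns ok == false
		delete(c.openChans, id)
	}
}
```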

@magik6k (Contributor) commented Jan 24, 2020

Call retrying - I really think this should be per method. We already use the proxy struct to carry permission info; we can just add another tag for this, so https://github.com/filecoin-project/lotus/blob/master/api/apistruct/struct.go#L61 would become

```go
SyncState func(context.Context) (*api.SyncState, error) `perm:"read" retry:"true"`
```
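
For what it's worth, such a tag is straightforward to read via reflection when building the proxy. A hypothetical, self-contained sketch (the return type is simplified to string; note that struct tag keys must be space-separated, not comma-separated, for reflect to parse both):

```go
package main

import (
	"context"
	"fmt"
	"reflect"
)

// Illustrative proxy struct following the pattern of api/apistruct/struct.go.
type FullNodeStruct struct {
	Internal struct {
		SyncState func(context.Context) (string, error) `perm:"read" retry:"true"`
	}
}

func main() {
	field, _ := reflect.TypeOf(FullNodeStruct{}.Internal).FieldByName("SyncState")
	fmt.Println(field.Tag.Get("retry")) // "true" -> wrap this call in retry logic
}
```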

Channels - those should get closed. We may be able to somehow rescue them in case the connection drops but the server is still running, but I feel like that creates more problems than it solves.

@jsign (Contributor, Author) commented Jan 24, 2020

Regarding the annotations: sounds good. I'll have to reconsider the original solution, since extending the current PR in this direction doesn't seem that good.

> Channels - those should get closed.

Right. But what do you think about what I said before? Callers would become responsible for distinguishing between a clean and a forced channel close. That isn't a problem now because channels are apparently never closed (even if the connection is lost), but it changes what a caller can infer from a closed channel in their own logic.

If a client receives a chan of bytes, I can imagine that being a problem. There isn't much this PR can do about that, since it's a client-side thing, but I'm making the point in case it brings to mind an existing case that should be considered somewhere.

@magik6k (Contributor) commented Jan 24, 2020

Right, I'm not sure what would provide the 'cleanest' DX. If we treat channels like streams (files, sockets, io.Reader), then closing them on a dropped connection seems not unreasonable.

This does put abnormal closure detection on the client, but I don't think that in most cases it will be very hard:

  • Streams that get cancelled by context - check if the context is still alive (if it is, then the connection dropped); see the sketch after this list
  • Byte-returning channels (ChainExport) - if the data type is self-terminating, like CBOR, possibly rely on that; if it isn't, add a length prefix or an explicit close message
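
The first bullet in Go terms, as an illustrative sketch: when the subscription channel closes, check whether the context is still alive to tell a clean shutdown from a dropped connection.

```go
package jsonrpcsketch

import (
	"context"
	"errors"
)

type update struct{} // placeholder for whatever the channel carries

// consume distinguishes a clean close from a dropped connection: if the
// channel closes while our context is still alive, the connection dropped;
// if the context was cancelled, we shut down cleanly ourselves.
func consume(ctx context.Context, ch <-chan update) error {
	for {
		select {
		case _, ok := <-ch:
			if !ok {
				if ctx.Err() != nil {
					return ctx.Err() // we cancelled: clean shutdown
				}
				return errors.New("connection dropped") // abnormal close
			}
			// ... process the update ...
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}
```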

@jsign (Contributor, Author) commented Jan 24, 2020

> • Streams that get cancelled by context - check if the context is still alive (if it is, then the connection dropped)
>
> • Byte-returning channels (ChainExport) - if the data type is self-terminating, like CBOR, possibly rely on that; if it isn't, add a length prefix or an explicit close message

SGTM.
Eventually, I'll get back here when I have some news regarding per-method retries.

@jsign closed this Jan 25, 2020
@jsign mentioned this pull request Jan 25, 2020