-
-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve #196 and #211 #214
Conversation
This is related to #196 and #211 but I guess that putting the numbers in the title doesn't do anything. @amtelekom, how badly did I misunderstand what you were looking for? |
First of all thanks for looking out for my needs :) Your branch works fine for our purposes - we just need to be able to detect if the network connectivity was lost or the Server exit without closing down the connection correctly on long-standing SSH connections (Netconf transport tunnels). Since our (OpenSSH-based) servers can send keepalives, just copying the While it works, it feels like the activity tracking seems like it goes farther than needed? It might still be the better way of tracking "inactivity", but then maybe we should also port this to the Server side for consistency, and expand the explanation in the doc comment in the |
Well, I would marvel at the coincidence that our respective employers have the same motivation for us to use russh, but I guess there aren't that many other client-side usecases of SSH that can't be done "simpler" by just shelling out to OpenSSH's client program with tokio::process. (Though I'd assume that that would be less scalable since certainly each russh Session has to be using less memory than each entire
Ah, so that feature's use case now makes a lot more sense. I guess that's what I might have ended up doing, if I had any influence over the SSH servers' configuration.
My best guess is that what you meant is: the activity tracking should only say "even though a packet was read, it didn't merit resetting the inactivity timer" if the packet in-question was clearly a keepalive request or reply. Is this correct? (By contrast, what I wrote will also avoid resetting the inactivity timer in case of XON/XOFF, unhandled global requests, unhandled channel requests and unhandled packets. Upon reflection, I see that's clearly a subjective judgment of mine.)
You're right, I have thusfar been paying zero attention to the Server side of russh. I'll take some time to see where this all fits into the Server. |
5ec4f7c
to
a05b7ee
Compare
I have concluded this process, and thus I believe my recent commit brings Client and Server into parity regarding these two features. Based on what I saw on the Server side - and contrary to the code I wrote in my previous PR - it seems to me that the purpose of the inactivity timer, relative to OpenSSH-style ServerAliveInterval or ClientAliveInterval, is that the inactivity timer on the one hand doesn't let the peer hold the connection open by responding to a ping-pong, but on the other hand does let the local application hold the connection open by sending data to the peer even without expecting any responses whatsoever to that data (not even window size adjustments or rekeyings or etc). So that is what I have assumed when writing this last commit. |
Hmm, running your updated commit with
against a default-configuration OpenSSH Server (i.e. it doesn't send keepalives on its own) keeps runing into timeouts. Activating the Debug outputs for russh, it seems like there is no wait time between the keepalive attempts being sent, and thereby also not giving the remote side more than a few ns to respond (the previous russh debugmsg timestamp before this is exactly a minute older):
I've pushed a fix to https://github.com/amtelekom/russh/tree/keepalive-improvement (this also fixes that an additional keepalive request is sent right before disconnecting [note the 4 seqns in the previous log], but that's more of a nicety than a functional improvement). Not sure how to best collaborate on PRs on Github :) |
f8d8567
to
9cd45b9
Compare
Aye, that was a nasty little oversight. Apologies. It should work correctly with that simple fix, and since your code was slightly cleaner than what I came up with prior to reading the entirety of your message, I cherry-picked your commit into here. |
9cd45b9
to
f5b4670
Compare
The semver checker failed when I rebased this onto the latest commit in |
fixed! |
* Add an analogue of OpenSSH's `ServerAliveCountMax`. * Use disjunctive futures for cleanly making these timers optional. * Use the `Session` to pass information back to the main bg loop from the plaintext packet reader, so that only nontrivial data transfer will reset the inactivity timer. (And so that `ServerAliveCountMax` will be judged correctly.)
Also bring client and server into parity regarding timers. Also, per OpenSSH documentation, only reset keepalive timer when receiving data, not when sending it. Also, always reset the inactivity timer unless the iteration was ended via sending a keepalive request.
1c0846d
to
7b640e0
Compare
Excellent! I took an educated guess at the version-bump, so CI seems to be in order now. |
Is there something that still needs to happen before this can be merged? In case it helps: I feel this MR now implements things as I would have expected them to work from a user's perspective, and we've been working based on this branch successfully the last weeks. Would be good to be able to go back to the upstream version though :) |
No, sorry, just lost the track of it |
This works because of https://docs.rs/futures/latest/futures/future/enum.Either.html#impl-Future-for-Either%3CA,+B%3E.
It theoretically works best because reusing the same futures means Tokio has to do less bookkeeping. Every non-diverging exit from the
select!
will reset the keepalive timer, and will also reset the inactivity timer if any nontrivial I/O happened.ServerAliveCountMax
.A counter is persisted on the
Session
; incremented when a keepalive is sent; and reset when a REQUEST_SUCCESS or REQUEST_FAILURE is received (presumably in reply to a keepalive). If it passes a configurable threshold, the connection is cut.Session
to pass information back to the main bg loop from the plaintext packet reader, so that only nontrivial data transfer will reset the inactivity timer. (And so thatServerAliveCountMax
will be judged correctly.)