connections hanging after a long transfer #427
Thank you for the report. Did you upgrade all your hosts to 6.4.7, or do you have a mixed environment comprising older kernels? Is the above the only splat you see in dmesg? Off the top of my head, I do not recall observing a similar problem. Could you please provide more details, e.g.:
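As a rough illustration of the kind of details usually helpful here (assuming an iproute2 build with MPTCP support; adjust to taste):
uname -r                     # exact kernel on every involved host
dmesg | tail -n 100          # any other splats besides the one reported
ip mptcp endpoint show       # path-manager endpoint configuration
ip mptcp limits show
ss -Mni                      # mptcp-level sockets: tokens, queues, subflows
ss -tni                      # tcp-level subflows
nstat -z | grep -i MPTcp     # MPTCP MIB counters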
|
The above means the running kernel has loaded unsigned module[s], possibly out-of-tree ones. Could you reproduce the issue without such taint? Just to exclude the possibility that the problem is in some unknown module/source. |
Hmm, it is true I have applied a patch to add bbr2 congestion control to the kernel (and am using it), but I was also using that on v6.2.8 without issue. I am definitely not ruling out that this is unrelated to mptcp and is something else... I will step through all the patches and try to replicate. All hosts doing the mptcp transfers have been updated to v6.4.7. I will include all the info you suggest the next time I get a hung example. Thanks! |
So I found some hung examples - it takes a while for them to occur, and then they start to build up until the system is mostly full of hung rsync commands. The workload is basically running rsync on a client and connecting to rsyncd on the remote server (like in #295). So something like this on the client: "mptcpize run rsync -av serverB::/files/ /tmp", where rsyncd is running on serverB with the mptcp LD_PRELOAD. I looked at the hung tasks, and an strace just says they are stuck in a select timeout - both the client rsync and the remote rsync spawned by rsyncd on the server. I actually have a 20 minute timeout on the rsync client/server that is supposed to kill the process if it makes no progress, but they are both in a permanent hung state long past that. So it thinks there is enough network communication going on not to kill them, but they definitely aren't sending IO. The client reports:
And the server reports:
This particular transfer was running fine for around 5 mins before it stalled/hung. If I kill and restart the transfer, it goes through fine on the second attempt. It's hard to debug these hung tasks as they are running on a production system and there are constant streams of new transfers starting and stopping - mostly successfully. I tried to tcpdump just this hung process by client port across all the mptcp interfaces, but I couldn't see any packets at all. I'm not sure if there is a better way to trace a single process, or whether mptcp has actually opened different ports for this connection? The topology is pretty simple (#285): I just have one interface set as a backup, which is what I use to do the initial connection, and then two signal endpoints at each end, traversing two different ISPs to reach each other, which do all the bulk transfers:
Where the interface pairs can only route to each other - ens192 <-> ens192, ens224 <-> ens224, ens256 <-> ens256. I think I'll switch to the v6.3.13 kernel just to add another data point and see how it compares to v6.2.8 and this v6.4.7. |
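For illustration, the described layout could be expressed per host roughly as follows (addresses, device names and limits here are placeholders, not the actual production values):
# one backup endpoint for the initial/control connection,
# two signal endpoints (one per ISP) for the bulk transfers
ip mptcp limits set subflow 2 add_addr_accepted 2
ip mptcp endpoint add 192.0.2.10 dev ens192 subflow backup
ip mptcp endpoint add 198.51.100.10 dev ens224 signal
ip mptcp endpoint add 203.0.113.10 dev ens256 signal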
Thank you very much for the added info.
I suspected the issue could be related to fallback sockets, as the only/biggest functional changes in the relevant kernel versions are commit b7535cf and commit 81c1d02, but the above info should exclude that, as the mptcp socket is not a fallen-back one. To have a better picture, it would be great to have: the full ss output for the mptcp socket and all its subflows (on both ends, if possible), and a snapshot of the nstat counters.
Could you please provide both of the above? Having the full tuple for all the involved subflows could possibly allow you to capture some relevant packets... In the provided dump we can see that the mptcp socket has a suspiciously large write queue (104877876 bytes), almost completely unacked (104749836). Unfortunately the subflow included in the dump is a backup one, so it can't really give a clue. BTW, my responses in the next 2 weeks will be delayed due to force majeure, but others may chime in. |
[...] This part of the configuration is possibly/likely wrong/unintended. You should use plain "subflow" endpoints on the client side instead. In your setup, reasonably, the server will end up creating a good deal of tcp connections towards the client which will be dangling in a half-open state, adding entropy to the system. You need just that one line of configuration on the client side. |
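Assuming the advice above refers to plain "subflow" endpoints, the client-side configuration would look roughly like this (addresses and devices are again placeholders):
# client side only: request extra subflows over the bulk interfaces
ip mptcp endpoint add 198.51.100.11 dev ens224 subflow
ip mptcp endpoint add 203.0.113.11 dev ens256 subflow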
Yeah, so even though I have written client/server here, in fact they are both, in the sense that we initiate transfers equally from each. So they can both start client rsyncs to each other at any time. This seemed to be the only configuration that worked (reliably uses both bulk endpoints) for that scenario? I'm going to try to set up some reproducible looping test to trigger the hangs reliably, rather than wait for production workloads to exhibit it. Hopefully that will help with gathering the extra debug info you suggest. The large read/write queues are likely just because we are using very large window sizes (128MB), as these servers are separated by a long WAN distance (150ms+). The TCP memory of the systems has been tweaked to run this setup. |
But still, I think it is needed to add the "subflow" endpoints on the client side. |
So it's not possible to have a "static" configuration between two symmetric hosts where either might want to initiate transfers at any time? We would have to keep switching the endpoints between signal and subflow depending on which host was going to be the server and which the client for that particular transfer? Even then, I expect there would be race conditions when multiple transfers happen simultaneously in either direction. I should say that we have been using this configuration (both hosts using "signal" endpoints) with success for a while now and we achieve our desired behaviour - transfers use both endpoints (backed by different ISPs) regardless of the direction and which host initiates the connection. And if one path (ISP) goes down, mptcp continues to use the remaining path as expected. We only use the "backup" interface to initiate the connections, as it is a stable route that never goes down (unlike the "bulk" signal/subflow endpoints, which can). Certainly, our endpoints are static in the sense that we always expect to use the same two, no more and no less. And they can only route to each other in pairs (so we can't, and don't want to, use "fullmesh"). In reality, it's even more complicated in that we have many "transfer" hosts in many different WAN locations that can all initiate transfers to each other. My attempts to reproduce the issues with v6.4.7 have so far failed in a test bench setup, so I fear the full production loads are required to trigger it - which makes everything harder to debug and isolate. |
Okay, I think I caught another hanging example on our production servers using the v6.3.13 kernel this time. I am going to continue to use "client" to denote the host where the connection was initiated from (rsync). This time I used the token identifier to list all the relevant entries in the output of ss:
The output of nstat is constantly changing depending on the other transfers happening, so these just represent a brief snapshot:
The rsync command never seems to time out, and I can see the long keepalive/persist reported on the server-side connections. With all the reported client ports on each interface/endpoint above, I tried to tcpdump and capture on both client and server, but saw nothing. I'm not sure if it's relevant, but I can see the temporary rsync file that was being copied at the time of the hang on disk - obviously it is also "stuck". Again, if I kill the stuck rsync and re-run it, it goes through fine without hanging. It could very well be that our transfers have hung like this before, but they would at least always time out (rsync --timeout=60) and be automatically retried. But now (v6.3+ ?), it seems the connections are in some state such that rsync never times out or closes the connection. |
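For reference, a per-connection snapshot of this kind can be collected roughly as follows (the token, port and interface values are illustrative, not the ones from this report):
# mptcp-level socket matching a given token (shown in ss -Mni output)
ss -Mni | grep -B1 -A2 'token:76efe643'
# tcp-level subflows filtered by the client port
ss -tni 'sport = :51432 or dport = :51432'
# counters, plus a capture attempt on one of the bulk interfaces
nstat -z | grep -i MPTcp
tcpdump -ni ens224 'port 51432'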
Okay, this example of a hanging transfer is somewhat different to my previous example and may have more in common with #295. I also can't say with complete confidence that this wasn't happening in v6.2.8 but I would have thought I'd have noticed it before - the thing is, I am only now looking very closely at every hung rsync process. So for this one the remote (server) rsyncd process dies for some reason (likely a momentary storage issue):
But the corresponding client rsync does not seem to get the hint and hangs around indefinitely:
So it looks like the two expected TCP connections have gone but the mptcp connection is still open? And this in turn is holding open the rsync process such that it never quits or times out. Like I said, it's different to the hanging of open connections in my last example, but this one is no less disruptive. |
Thank you for the new tests! Just to avoid any misunderstanding: this bug looks important and a fix is needed. Any new details are important; it's just a shame there is no simple and short way to reproduce it (e.g. using our test tools). |
Not a problem. I appreciate the work that goes into mptcp and I'm happy to help find and squash bugs - helping in the only way I know how! :) I have gone back to using the v6.2.8 kernel for now, just to double-check there really were no "hanging" rsync transfers using that version. And in case it wasn't clear, the previous two cases I reported were with v6.3, so the title of this bug (v6.4) is probably misleading now. |
Okay, so the "second" issue/example I gave where a client rsync never dies despite the server hanging up, and the mptcp connection shows in "ss" despite there being no corresponding active tcp subflows - is also present in v6.2.8. So it looks like that issue has always been there I just never noticed it before (it's low frequency) until looking more closely. I have yet to see anything like the first example (keepalive/persist reported on tcp connections) with v6.2.8 so maybe that's the main "regression" or change in behaviour here. I'll see if I can can start bisecting the kernel to narrow it down, but it's likely going to be slow going. |
For the long-standing issue (also in v6.2.8) - where a client rsync will not quit, and the mptcp connection still shows in ss despite the TCP connections being gone - I have another example that seems to trigger it quite frequently. If the rsync scans a few files to build the file list but then doesn't actually transfer anything, making the connection very short, we tend to see some hanging rsync client commands. Maybe it is related to opening/closing lots of short connections in quick succession? For example, the server log might show:
But on the client the corresponding rsync processes might get stuck on connect:
I am still trying to work through kernel bisects and reproduce the other (bigger) issue. It's proving hard to reproduce. |
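A crude looping reproducer for the short-connection case could look something like this (the module path and interval are just examples):
# repeatedly open very short rsync sessions that build a file list
# but transfer nothing, the pattern that seems to trigger the hang
while true; do
    mptcpize run rsync -av --timeout=60 serverB::/files/already-synced/ /tmp/repro/
    sleep 1
done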
Regarding the 'first' issue:
here the rtx queue is not empty (bytes acked < bytes_sent) -> no mptcp-level rtx. The TCP-level code at https://elixir.bootlin.com/linux/latest/source/net/ipv4/tcp_output.c#L3176 could kick in (a zero window probe), but the window is not zero ;) Why? An unusual mptcp-level mib is non-zero:
that means: https://elixir.bootlin.com/linux/latest/source/net/mptcp/options.c#L1257 The latter code looks buggy: it jumps to the 'raise_win' label, which will use the rcv_wnd value observed before the update attempt instead of the one written by the "winning" subflow. TL;DR: @daire-byrne: could you please try the following patch? It should at least address the issue described above.
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index c254accb14de..295ed37a489c 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -1269,12 +1269,12 @@ static void mptcp_set_rwin(struct tcp_sock *tp, struct tcphdr *th)
 			if (rcv_wnd == rcv_wnd_old)
 				break;
-			if (before64(rcv_wnd_new, rcv_wnd)) {
+			rcv_wnd_old = rcv_wnd;
+			if (before64(rcv_wnd_new, rcv_wnd_old)) {
 				MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_RCVWNDCONFLICTUPDATE);
 				goto raise_win;
 			}
 			MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_RCVWNDCONFLICT);
-			rcv_wnd_old = rcv_wnd;
 		}
 		return;
 	}
Side Notes:
|
I think this could be caused by: https://elixir.bootlin.com/linux/latest/source/net/ipv4/tcp_input.c#L4345 On an incoming TCP reset (possibly carrying an MPTCP fast-close), the TCP stack closes the TCP socket (via tcp_done()) before propagating the socket error (via sk_error_report()). In turn, the MPTCP code could remove the tcp subflow from the subflow list (in response to tcp_done()) before trying to propagate the socket error in response to sk_error_report(). As the mptcp-level socket is closed as an effect of the error propagation, and the latter requires the errored subflow to still be present in the subflow list at __mptcp_error_report() time, all of the above could lead to the reported issue. @daire-byrne: I just shared a couple of patches on the ML trying to address the above (that is, only the "2nd" issue tracked here): https://patchwork.kernel.org/project/mptcp/list/?series=778115 Could you please give them a spin in your testbed? (They are on top of the current export branch, but hopefully should apply cleanly to the current Linus tree.) Side note: it would probably be better to track the 2 issues described here separately. |
Thank you for these patches! I have applied both cleanly to v6.4.11, and all our transfer hosts globally (9 machines) are now running with them. Because these hosts run thousands of rsync transfers per day but only a handful get stuck, it may take me a few days to sift through any remaining hangs. I have been waiting for them to build up naturally (easier to spot) and then debugging each one every day. Apologies that this ticket ended up mixing issues, but it was only when there was a noticeable increase in hanging rsync commands with v6.4 that I looked more closely and realised that there was more than one issue at play, and that one had always been there (but at very low frequency). On the subject of why the rcv_wnd bug might have become more noticeable on v6.3/v6.4 - I'm not entirely sure. I did however notice that the ratio of softirq CPU to packet count changed (for the better - less cpu) with v6.4 with the vmxnet3 driver we are using, which might suggest some big change in some related code? I have also just double-checked the patches (other than bbr v2/3) that I was carrying, and it looks like I forgot to remove this one in my v6.4 testing (slated for inclusion in v6.5): I did not have that in my v6.3 kernel testing though, and I am unsure if it could have interacted with the rcv_wnd code you patched? Apologies, I should have realised I had applied that patch for testing when opening this ticket. I am not running with it anymore (but will test adding it again at a later point). Anyway, I will report back in a few days - thanks again! |
Thank you for all the tests.
Just to be sure, do you mean you ran the tests with the two series, or only the two patches of the series Paolo sent on the ML (for the 2nd issue)? If I'm not mistaken, the first patch is the one modifying the receive-window update for the 1st issue. (Do not hesitate to create a dedicated issue to avoid confusion ;-) ) |
git log suggests this change:
which entered 6.3 and could have significantly changed the timing of ingress TCP packets - before that change, GRO was basically disabled in some (most???) scenarios. |
Yes, sorry, I meant I have applied the patches for both "issue 1" (rcv_wnd) and "issue 2" (close transition) at the same time. I *think* I am already seeing a positive impact on the long-standing close transition one - but I'll know more in a day or two. The rcv_wnd/mptcp_set_rwin one seemed harder to come across, but once one happened on a server it felt like more then happened quickly after that. But again, in the early days of debugging this, I was not collecting as much useful debug info as I am now. |
Should some rsync connections still get stuck even with the patched kernel, please try to collect the same info already mentioned in the past (ss output, nstat counters, etc.). |
I have not seen a connection hang due to this bug since applying the patch. However, this one took a while to trigger, and then seemingly once it had happened on a host it was more likely to happen again. I will report back in a week or so. |
In case multiple subflows race to update the mptcp-level receive window, the subflow losing the race should use the window value provided by the "winning" subflow to update its own tcp-level rcv_wnd. To that end, the current code bogusly uses the mptcp-level rcv_wnd value as observed before the update attempt. On unlucky circumstances that may lead to TCP-level window shrinkage, and stall the other end. Address the issue by feeding the correct value to the rcv_wnd update.
Fixes: f3589be ("mptcp: never shrink offered window")
Cc: stable@vger.kernel.org
Closes: multipath-tcp/mptcp_net-next#427
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sorry, this might be a bit lacking in detail until I can dig into it a bit more, but I thought I should open an issue now in case there are any known problems or anyone can help guide my investigation.
We recently upgraded our kernel from v6.2.8 to v6.4.7 and we are running lots of mptcp rsync/rsyncd transfers between hosts. I actually had an earlier issue #295 which was addressed in v6.1 and this workload has been stable since then (using mptcpize/LD_PRELOAD).
But since updating to v6.4.7, we are seeing some unexplained "hangs" in the running transfers such that they never complete. We can start new mptcp connections (either more rsync or iperf) and those seem to work fine, but the running ones either stall or become really slow to progress.
If I kill the rsync transfers and restart them, they perform well again until the transfer hosts start having problems and the transfer tasks start to slow again (a few days later).
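For context, the transfers are launched roughly like this (the daemon config path and timeout are examples; the module name matches the command quoted elsewhere in this thread):
# server side: rsyncd forced onto MPTCP via the mptcpize LD_PRELOAD wrapper
mptcpize run rsync --daemon --no-detach --config=/etc/rsyncd.conf
# client side: pull from the daemon module, with a no-progress timeout
mptcpize run rsync -av --timeout=1200 serverB::/files/ /tmp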
I have also seen this dumped from time to time, but its appearance does not seem to line up with the transfer hangs or slowness:
I do not recall ever seeing these on v6.2.8. But they are sporadic and only occur once every couple of days so may not be all that important.
All of this could be some other regression in v6.3 or v6.4 unrelated to mptcp, but such network oddities are usually noticed much quicker and addressed. I just suspect that it has more to do with our niche mptcp+rsync setup (having already had a previous issue).
I will try a v6.3 kernel too and see if I can replicate with that. Sorry for the lack of detail at this stage...