in-kernel PM: listen socket: support "behind a NAT" use case #337

Open
shockx2 opened this issue Jan 17, 2023 · 11 comments
Labels
enhancement pm path-manager

Comments

@shockx2

shockx2 commented Jan 17, 2023

Currently, if the server is behind a NAT/LB/..., adding an endpoint with a "public" (exposed) IP and a custom port would fail because the in-kernel PM will try to create a listening socket with an IP not allocated on the system: -99: Cannot assign requested address.

It is possible to work around that by adding the exposed IP on the loopback interface but the listen socket created by the in-kernel PM will be useless as it will bind on the "public" (exposed) IP, not the "private" (internal) one.
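The "Cannot assign requested address" failure can be reproduced from user space. The following is only a minimal sketch with a plain TCP socket (not the in-kernel PM code path): `try_bind` is a hypothetical helper, and 203.0.113.1 is a documentation address assumed not to be configured on the host.

```python
import errno
import socket


def try_bind(address, port=0):
    """Try to bind a TCP socket to (address, port).

    Returns None on success, or the errno reported by the kernel.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((address, port))
        except OSError as exc:
            return exc.errno
    return None


if __name__ == "__main__":
    # Assumption: 203.0.113.1 (TEST-NET-3) is not assigned to any local interface.
    err = try_bind("203.0.113.1")
    print(err, errno.errorcode.get(err))
```

On Linux, `errno.EADDRNOTAVAIL` is 99, which matches the `-99` reported when adding the endpoint.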

A solution could be to extend the current API to allow something like:

ip mptcp endpoint add <exposed address> dev <NIC> listen <internal address or 0.0.0.0> port <port> signal

listen will need port (and signal).

(A no-listen flag could also be used to avoid creating a listening socket automatically when a port is given.)


Original bug report:

Hello,
I'm very interested in everything about MPTCP. Thank you, MPTCP team.

I am experimenting with MPTCP for video streaming.
The simple command below is a proof of concept of using MPTCP for streaming, but the MPTCP connection falls back to a single TCP flow.
Other applications, e.g. iperf3, work very well.

Environments

  • Host A: MPTCP client (linux v6.2.0-rc4)
    10.0.0.2
    10.0.1.2
    ip mptcp endpoint
    10.0.0.2 id 1 subflow dev h1-eth0
    10.0.1.2 id 2 subflow dev h1-eth1

  • Host B: MPTCP server (linux v6.2.0-rc4)
    10.0.2.2

Command:
Host A(Client): mptcpize run -d gst-launch-1.0 videotestsrc ! tcpclientsink host=10.0.2.2 port=40000
Host B(Server): mptcpize run -d gst-launch-1.0 tcpserversrc host=0.0.0.0 port=40000 ! filesink location=./output
Run Host B first, then Host A. The server then saves the received data to the output file.
But MPTCP is not working: the second subflow is reset.

Result: wireshark

1	0.000000000	10.0.0.2	10.0.2.2	MPTCP	80	50036 → 40000 [SYN] Seq=0 Win=42340 Len=0 MSS=1460 SACK_PERM=1 TSval=630985339 TSecr=0 WS=512
2	0.000046971	10.0.2.2	10.0.0.2	MPTCP	88	40000 → 50036 [SYN, ACK] Seq=0 Ack=1 Win=43440 Len=0 MSS=1460 SACK_PERM=1 TSval=4012900360 TSecr=630985339 WS=512
3	0.000068303	10.0.0.2	10.0.2.2	MPTCP	88	50036 → 40000 [ACK] Seq=1 Ack=1 Win=42496 Len=0 TSval=630985339 TSecr=4012900360
4	0.001566001	10.0.0.2	10.0.2.2	MPTCP	7212	50036 → 40000 [PSH, ACK] Seq=1 Ack=1 Win=42496 Len=7120 TSval=630985341 TSecr=4012900360 [TCP segment of a reassembled PDU]
5	0.001598651	10.0.2.2	10.0.0.2	MPTCP	80	40000 → 50036 [ACK] Seq=1 Ack=7121 Win=39936 Len=0 TSval=4012900362 TSecr=630985341
6	0.001723756	10.0.1.2	10.0.2.2	MPTCP	88	52019 → 40000 [SYN] Seq=0 Win=42496 Len=0 MSS=1460 SACK_PERM=1 TSval=1966134748 TSecr=0 WS=512 ###
7	0.002074009	10.0.2.2	10.0.1.2	TCP	56	40000 → 52019 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0 ###
8	0.006433137	10.0.0.2	10.0.2.2	TCP	7212	50036 → 40000 [PSH, ACK] Seq=7121 Ack=1 Win=42496 Len=7120 TSval=630985341 TSecr=4012900360 [TCP segment of a reassembled PDU]
9	0.006466182	10.0.2.2	10.0.0.2	MPTCP	80	40000 → 50036 [ACK] Seq=1 Ack=14241 Win=39936 Len=0 TSval=4012900367 TSecr=630985341
10	0.012425790	10.0.0.2	10.0.2.2	TCP	7212	50036 → 40000 [PSH, ACK] Seq=14241 Ack=1 Win=42496 Len=7120 TSval=630985341 TSecr=4012900362 [TCP segment of a reassembled PDU]
11	0.012471922	10.0.2.2	10.0.0.2	MPTCP	80	40000 → 50036 [ACK] Seq=1 Ack=21361 Win=39936 Len=0 TSval=4012900373 TSecr=630985341

The iperf3 result is below.
Command:
Host B: mptcpize run -d iperf3 -s
Host A: mptcpize run -d iperf3 -c 10.0.2.2

Result: It works well.

1	0.000000000	10.0.0.2	10.0.2.2	MPTCP	80	53982 → 5201 [SYN] Seq=0 Win=42340 Len=0 MSS=1460 SACK_PERM=1 TSval=631071184 TSecr=0 WS=512
2	0.000092389	10.0.2.2	10.0.0.2	MPTCP	88	5201 → 53982 [SYN, ACK] Seq=0 Ack=1 Win=43440 Len=0 MSS=1460 SACK_PERM=1 TSval=4012986205 TSecr=631071184 WS=512
3	0.000148964	10.0.0.2	10.0.2.2	MPTCP	88	53982 → 5201 [ACK] Seq=1 Ack=1 Win=42496 Len=0 TSval=631071184 TSecr=4012986205
4	0.000277911	10.0.0.2	10.0.2.2	MPTCP	129	53982 → 5201 [PSH, ACK] Seq=1 Ack=1 Win=42496 Len=37 TSval=631071184 TSecr=4012986205
5	0.000357607	10.0.2.2	10.0.0.2	MPTCP	80	5201 → 53982 [ACK] Seq=1 Ack=38 Win=43520 Len=0 TSval=4012986205 TSecr=631071184
6	0.000419136	10.0.2.2	10.0.0.2	MPTCP	97	5201 → 53982 [PSH, ACK] Seq=1 Ack=38 Win=43520 Len=1 TSval=4012986205 TSecr=631071184
7	0.000471001	10.0.0.2	10.0.2.2	MPTCP	80	53982 → 5201 [ACK] Seq=38 Ack=2 Win=42496 Len=0 TSval=631071184 TSecr=4012986205
8	0.000591673	10.0.0.2	10.0.2.2	MPTCP	100	53982 → 5201 [PSH, ACK] Seq=38 Ack=2 Win=42496 Len=4 TSval=631071184 TSecr=4012986205
9	0.000794230	10.0.1.2	10.0.2.2	MPTCP	88	51285 → 5201 [SYN] Seq=0 Win=42496 Len=0 MSS=1460 SACK_PERM=1 TSval=1966220591 TSecr=0 WS=512
10	0.000855521	10.0.2.2	10.0.1.2	MPTCP	92	5201 → 51285 [SYN, ACK] Seq=0 Ack=1 Win=43440 Len=0 MSS=1460 SACK_PERM=1 TSval=408156345 TSecr=1966220591 WS=512
11	0.000908853	10.0.1.2	10.0.2.2	MPTCP	92	51285 → 5201 [ACK] Seq=1 Ack=1 Win=42496 Len=0 TSval=1966220592 TSecr=408156345
12	0.000993630	10.0.2.2	10.0.1.2	MPTCP	80	[TCP Window Update] 5201 → 51285 [ACK] Seq=1 Ack=1 Win=43520 Len=0 TSval=408156345 TSecr=1966220592
13	0.042926949	10.0.2.2	10.0.0.2	MPTCP	80	5201 → 53982 [ACK] Seq=2 Ack=42 Win=43520 Len=0 TSval=4012986248 TSecr=631071184

Thank you again!!

@matttbe
Member

matttbe commented Jan 17, 2023

Hi @shockx2

Do you know if your GST server closes the listening socket after having accepted the first connection? (e.g. do you have to relaunch the server after the client gets disconnected?)
strace should be able to explain what's going on.

For MPTCP, we need to have a socket listening to accept more subflows.

If it is the case and if you cannot modify your app, a way to work around this is to have another app doing a listen() on the same port and specific to the second interface. mptcpd should be able to do that: multipath-tcp/mptcpd#223
Or try with Python for example to create an MPTCP socket and bind on the specific IP/Port of the second interface, e.g.

import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_MPTCP)
s.bind(("10.0.2.2", 40000))
s.listen(1)

(Mmh, but that's the same IP as the initial subflow, the server only has one IP :-/)

@pabeni

pabeni commented Jan 17, 2023

strace shows that gst-launch-1.0 closes the listener socket just after accepting the first subflow. As guessed by Mat, that is the root cause of the failure.
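The effect of closing the listener right after the first accept() can be illustrated with plain TCP on loopback. This is only an analogy for the MP_JOIN case, not an MPTCP test; `second_connect_after_close` is a hypothetical helper name.

```python
import socket


def second_connect_after_close():
    """Report what happens to a second connection once the listener is closed."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the kernel pick a free port
    listener.listen(1)
    addr = listener.getsockname()

    first = socket.create_connection(addr)
    conn, _ = listener.accept()
    listener.close()  # same pattern as the one seen in the strace output

    try:
        second = socket.create_connection(addr, timeout=2)
    except ConnectionRefusedError:
        # No listener on the port any more: the kernel answers the SYN with a RST.
        result = "refused"
    else:
        second.close()
        result = "accepted"
    finally:
        first.close()
        conn.close()
    return result


if __name__ == "__main__":
    print(second_connect_after_close())
```

The first connection stays usable, but any later SYN to the same port is reset, which is consistent with the RST seen on the second subflow in the capture.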

An alternative workaround, beyond the ones already proposed, would be to add a port-based endpoint on the server side:

ip mptcp endpoint add 10.0.2.2 port 12345 signal

then set the 'fullmesh' flag on the client endpoints and increase the max subflow limit:

ip mptcp limit set subflows 4
ip mptcp endpoint add 10.0.0.2 dev h1-eth0 subflow fullmesh
ip mptcp endpoint add 10.0.1.2 dev h1-eth1 subflow fullmesh

Side note: could you please share your routing configuration, too? e.g. the output of ip addr; ip route on both the client and the server.

@shockx2
Author

shockx2 commented Jan 18, 2023

Thank you very much, @matttbe @pabeni.
It's almost resolved.
I tried the workaround, but it didn't work. So I made some modifications, and now it works!

  • Host B (Server):
    ip mptcp limits set add_addr_accepted 4 subflows 4
    ip mptcp endpoint add 10.0.2.2 port 12345 signal
  • Host A (Client):
    ip mptcp limits set add_addr_accepted 4 subflows 4
    ip mptcp endpoint add 10.0.0.2 dev h1-eth0 fullmesh
    ip mptcp endpoint add 10.0.1.2 dev h1-eth1 fullmesh

I tried the solution with an AWS EC2 instance (server) and my local PC (client).
The server instance has a public IP and a private IP.
ip mptcp endpoint add <public IP> port 12345 signal fails with: Cannot assign requested address
ip mptcp endpoint add <private IP> port 12345 signal succeeds.

But the advertisement of the 2nd endpoint is not working.
I think it is caused by the NAT.

@pabeni
my routing configuration is below (this is a Mininet environment):

  • Server
$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: h2-eth0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default qlen 1000
    link/ether 0a:d7:8d:d0:ab:f7 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.2.2/24 brd 10.0.2.255 scope global h2-eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::8d7:8dff:fed0:abf7/64 scope link 
       valid_lft forever preferred_lft forever
$ ip route
default via 10.0.2.1 dev h2-eth0 
10.0.2.0/24 dev h2-eth0 proto kernel scope link src 10.0.2.2 
  • Client
$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: h1-eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default qlen 1000
    link/ether ba:b7:51:fe:91:21 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.0.2/24 brd 10.0.0.255 scope global h1-eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::b8b7:51ff:fefe:9121/64 scope link 
       valid_lft forever preferred_lft forever
3: h1-eth1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default qlen 1000
    link/ether 4e:9f:d0:4b:29:3f brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.1.2/24 brd 10.0.1.255 scope global h1-eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::4c9f:d0ff:fe4b:293f/64 scope link 
       valid_lft forever preferred_lft forever
$ ip route
default via 10.0.0.1 dev h1-eth0 
10.0.0.0/24 dev h1-eth0 proto kernel scope link src 10.0.0.2 
10.0.1.0/24 dev h1-eth1 proto kernel scope link src 10.0.1.2 

Thank you.

@matttbe
Member

matttbe commented Jan 18, 2023

Server instance has public IP and private IP. ip mptcp endpoint add <public IP> port 12345 signal result is Cannot assign requested address ip mptcp endpoint add <private IP> port 12345 signal is good.

But, 2nd endpoint advertising is not working. I think, it is cause by NAT.

@shockx2 you then need to announce the public IP so that the server is reachable from the client side.

Mmh yes, the in-kernel PM doesn't create a new socket with "freebind". This could be changed I suppose.

While trying workarounds, can you first try to add the public IP to the loopback interface?

ip addr add <public IP>/32 dev lo

Then add the new endpoint using the public IP?

You will need to add an IP rule and route to use the right interface when the source IP is the one linked to the "second" interface

https://multipath-tcp.org/pmwiki.php/Users/ConfigureRouting

Something like this I suppose:

ip rule add from <public IP> table 42
ip route add default via 10.0.1.2 dev h1-eth1 table 42

But maybe you will need to have a SNAT rule to change the public IP to the source IP... :-)

@shockx2
Author

shockx2 commented Jan 20, 2023

Thank you, very much @matttbe

My understanding is that "freebind" would let the PM announce the public IP to the client, right?

I think another workaround would be to modify the announcement packet with eBPF so that it carries the public IP.
I will try that. What do you think?

Thank you.

@matttbe
Member

matttbe commented Jan 20, 2023

I am understanding that the "freebind" means that it makes PM can announces public IP to client. right?

It would allow the kernel to create a listening socket on an IP it doesn't own. It could be needed in some cases, but in yours it will only partly help:

  • the command will succeed
  • an ADD_ADDR with the right IP will be sent
  • the kernel will listen for packets arriving with the public IP (and the specified port) as destination, but it will not see such packets if there is a NAT in front of it.
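For reference, this is roughly what "freebind" looks like from user space. It is a sketch only: it uses a plain TCP socket rather than the in-kernel PM, it is Linux-specific, `bind_with_freebind` is a hypothetical helper, and 203.0.113.1 is a documentation address assumed not to be configured locally.

```python
import socket

# Linux value of IP_FREEBIND from <linux/in.h>. Some Python versions expose it
# as socket.IP_FREEBIND; fall back to the raw number when they do not.
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def bind_with_freebind(address, port=0):
    """Bind a TCP socket to an address the host may not own (Linux only)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
    s.bind((address, port))
    return s


if __name__ == "__main__":
    s = bind_with_freebind("203.0.113.1")
    print(s.getsockname())
    s.close()
```

The bind succeeds, but such a socket only matches packets whose destination is still the public IP, so it does not help once a NAT has rewritten the destination to the private address.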

I see two solutions (that could be combined) for your case (being behind a NAT and having an app that closes the listening socket after the accept()):

  • the in-kernel PM should not fail if it is not able to create a listening socket
  • a flag can be added not to create a listening socket

(@pabeni: what do you think?)

In both cases, it means you will have to create the listening socket, e.g. with Python:

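import socket

IPPROTO_MPTCP = 262          # same value as socket.IPPROTO_MPTCP (Python >= 3.10)
PRIVATE_IP = "<private IP>"  # replace with the server's private address
PORT = 40000                 # example value: the app port used earlier in this thread

# Keep an MPTCP listener open so that additional subflows can be accepted.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_MPTCP)
s.bind((PRIVATE_IP, PORT))
s.listen(1024)

while True:
    conn, addr = s.accept()
    print(conn, addr)
    conn.close()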

I think another workaround that the announcing packet modification by eBPF to send public IP.

It is not easy because there is an HMAC to avoid having this address being modified by someone else (or "injected").
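To make the HMAC point concrete, here is a sketch of how the truncated HMAC of an ADD_ADDR option is computed, as I read RFC 8684: HMAC-SHA256 keyed with the sender's key followed by the receiver's key, over the address ID, IP address and port, keeping the rightmost 64 bits. The helper name and the key values are made up for illustration.

```python
import hashlib
import hmac
import ipaddress
import struct


def add_addr_hmac(sender_key, receiver_key, addr_id, address, port=0):
    """Truncated HMAC for an ADD_ADDR option (sketch based on RFC 8684)."""
    key = sender_key + receiver_key
    # The message covers the address ID, the IP address and the port
    # (two zero bytes when no port is announced).
    msg = (
        struct.pack("!B", addr_id)
        + ipaddress.ip_address(address).packed
        + struct.pack("!H", port)
    )
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return digest[-8:]  # rightmost 64 bits


if __name__ == "__main__":
    # Made-up 64-bit keys; real keys are exchanged during the MP_CAPABLE handshake.
    key_a = bytes.fromhex("0102030405060708")
    key_b = bytes.fromhex("1112131415161718")
    private = add_addr_hmac(key_a, key_b, 1, "10.0.2.2", 12345)
    public = add_addr_hmac(key_a, key_b, 1, "203.0.113.10", 12345)
    print(private.hex(), public.hex())
```

Rewriting only the address in flight leaves an HMAC that no longer matches, and recomputing it requires both connection keys, which a middlebox or eBPF program on the path would not normally have.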

With the current situation, I think the best is either to:

  • announce the Public IP with the same port:
    • add the endpoint (signal) with the public IP
    • create a listening socket on the private IP + the same port as the app
  • announce the Public IP with a different port:
    • add the public IP on the loopback interface (this step would not be needed if the kernel is modified with at least one of the two solutions I proposed here above)
    • add the endpoint (signal) with the public IP + a new port
    • create a listening socket on the private IP + the same port
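The "different port" option above can be summarised as a list of commands. The sketch below only builds the command strings already used in this thread (it does not run them); `nat_workaround_commands` is a hypothetical helper and the address is a documentation address standing in for the real public IP.

```python
import ipaddress


def nat_workaround_commands(public_ip, signal_port):
    """Build the server-side commands for announcing a public IP on a new port."""
    ipaddress.ip_address(public_ip)  # raises ValueError on a malformed address
    return [
        # Make the public IP local so that the in-kernel PM can bind to it.
        f"ip addr add {public_ip}/32 dev lo",
        # Announce the public IP and the extra port to the peer (ADD_ADDR).
        f"ip mptcp endpoint add {public_ip} port {signal_port} signal",
    ]


if __name__ == "__main__":
    for cmd in nat_workaround_commands("203.0.113.10", 12345):
        print(cmd)
```

The listening socket on the private IP still has to be created separately, for example with the Python listener shown earlier in this comment.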

Does it work for you?

@pabeni

pabeni commented Jan 20, 2023

I see two solutions (that could be combined) for your case (being a NAT and an app closing the listening socket after the accept()):

* the in-kernel PM should not fail if it is not able to create a listening socket

* a flag can be added not to create a listening socket

(@pabeni: what do you think?)

Both options will fail ?!? If the listener socket is not created, and the server closes the port after connect, I don't see how the subflow created by the client could succeed. Since this thing looks very specific to NAT, and the admin needs to know the NAT details (exposed address, local address), I think we could change the in-kernel PM to listen on an address other than the signaled one, something like:

ip mptcp endpoint add <exposed address> dev <NIC> listen <internal address or 0.0.0.0> signal

@pabeni

pabeni commented Jan 20, 2023

side note: the client routing configuration looks broken. Specifically I don't see how the client could connect from h1-eth1/10.0.1.2 towards the server, since it lacks a suitable route. e.g.

ping -I h1-eth1 <server public IP>

from the client should fail - while it should be successful from h1-eth0
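A quick way to see the routing gap is to check which devices have a route covering the server address. This is a simplified model (main table only, no policy rules); the routes are copied from the client's `ip route` output above, and `devices_with_route` is a hypothetical helper.

```python
import ipaddress

# Client routes copied from the `ip route` output posted in this thread.
CLIENT_ROUTES = [
    ("0.0.0.0/0", "h1-eth0"),    # default via 10.0.0.1
    ("10.0.0.0/24", "h1-eth0"),
    ("10.0.1.0/24", "h1-eth1"),
]


def devices_with_route(dst, routes=CLIENT_ROUTES):
    """Return the devices that have at least one route covering dst."""
    ip = ipaddress.ip_address(dst)
    return {dev for prefix, dev in routes if ip in ipaddress.ip_network(prefix)}


if __name__ == "__main__":
    print(devices_with_route("10.0.2.2"))
```

In this model only h1-eth0 can reach 10.0.2.2, so traffic sourced from 10.0.1.2 on h1-eth1 would need an extra rule and route, along the lines of the per-source routing table suggested earlier.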

@matttbe
Member

matttbe commented Jan 20, 2023

@pabeni thank you for your reply!

both option will fail ?!? if the listener socket is not created, and the server closes the port after connect, I don't see how the subflow created by the client could fail.

Yes indeed. But it is possible to work around this issue by creating this listening socket as suggested with the Python code, no?

Since this thing looks very specific to NAT, and the admin need to know the NAT details (exposed address, local address) I think we could change the in-kernel PM to listen on a different address other then the signaled one, something alike:

ip mptcp endpoint add <exposed address> dev <NIC> listen <internal address or 0.0.0.0> signal

Indeed, it would be cleaner and clearer.

I can update the ticket to switch to "feature request"

@matttbe matttbe changed the title from "tcp fallback with gstreamer tcp elements" to "in-kernel PM: listen socket: support "behind a NAT" use case" on Jan 20, 2023
@shockx2
Author

shockx2 commented Jan 27, 2023

I am understanding that the "freebind" means that it makes PM can announces public IP to client. right?

It would allow the kernel to create a listening socket on an IP it doesn't own. It could be needed in some cases but in yours, that will partly help you:

  • the command will succeed
  • an ADD_ADDR with the right IP will be sent
  • the kernel will listen on packets arriving with the public IP (and specified port): the kernel will not see such packets if there is a NAT before.

I see two solutions (that could be combined) for your case (being a NAT and an app closing the listening socket after the accept()):

  • the in-kernel PM should not fail if it is not able to create a listening socket
  • a flag can be added not to create a listening socket

(@pabeni: what do you think?)

In both cases, it means you will have to create the listening socket, e.g. with Python:

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 262)
s.bind(("<private IP>", <port>))
s.listen(1024)

while True:
  ss = s.accept()
  print(ss)
  ss[0].close()

I think another workaround that the announcing packet modification by eBPF to send public IP.

It is not easy because there is an HMAC to avoid having this address being modified by someone else (or "injected").

With the current situation, I think the best is either to:

  • announce the Public IP with the same port:

    • add the endpoint (signal) with the public IP
    • create a listening socket on the private IP + the same port as the app
  • announce the Public IP with a different port:

    • add the public IP on the loopback interface (this step would not be needed if the kernel is modified with at least one of the two solutions I proposed here above)
    • add the endpoint (signal) with the public IP + a new port
    • create a listening socket on the private IP + the same port

Does it work for you?

Thank you @matttbe
I tried the workaround, but it doesn't work.

  1. I tested with iperf3, which does not close its listening socket, so that only the NAT issue remains.
  2. On the server, I changed the loopback address from 127.0.0.1 to the public IP.
  3. On the server, adding the endpoint with the public IP and port (ip mptcp endpoint add 5.38.112.212 port 12345 signal) works!
  4. On the client, I added an endpoint (192.168.0.52 id 1 fullmesh dev wlp0s20f3).
    But the server sends a RST (twice).
    I tested the client with a single interface (a single fullmesh endpoint), because I want to understand the fullmesh mechanism.
    I expected the client's fullmesh endpoint to create 2 subflows: one to the server application's listening socket and one to the signal endpoint (port 12345).
    But it does not work.
    (Server: Kernel v6.0.0, Client: Kernel v6.2.0-rc4)

Client's wireshark

3535	66.936254272	5.38.112.212	192.168.0.52	MPTCP	86	5201 → 41764 [SYN, ACK] Seq=0 Ack=1 Win=62643 Len=0 MSS=1460 SACK_PERM=1 TSval=1270292065 TSecr=2434907681 WS=128
3536	66.936316755	192.168.0.52	5.38.112.212	MPTCP	86	41764 → 5201 [ACK] Seq=1 Ack=1 Win=42496 Len=0 TSval=2434907687 TSecr=1270292065
3537	66.936405351	192.168.0.52	5.38.112.212	MPTCP	127	41764 → 5201 [PSH, ACK] Seq=1 Ack=1 Win=42496 Len=37 TSval=2434907687 TSecr=1270292065
3541	66.943542084	5.38.112.212	192.168.0.52	MPTCP	78	5201 → 41764 [ACK] Seq=1 Ack=38 Win=62720 Len=0 TSval=1270292073 TSecr=2434907687
3542	66.943542355	5.38.112.212	192.168.0.52	MPTCP	95	5201 → 41764 [PSH, ACK] Seq=1 Ack=38 Win=62720 Len=1 TSval=1270292073 TSecr=2434907687
3543	66.943542406	5.38.112.212	192.168.0.52	MPTCP	86	[TCP Dup ACK 3541#1] 5201 → 41764 [ACK] Seq=2 Ack=38 Win=62720 Len=0 TSval=1270292073 TSecr=2434907687
3544	66.943603566	192.168.0.52	5.38.112.212	MPTCP	78	41764 → 5201 [ACK] Seq=38 Ack=2 Win=42496 Len=0 TSval=2434907694 TSecr=1270292073
3545	66.943648239	192.168.0.52	5.38.112.212	MPTCP	78	[TCP Dup ACK 3544#1] 41764 → 5201 [ACK] Seq=38 Ack=2 Win=42496 Len=0 TSval=2434907694 TSecr=1270292073
3546	66.943701675	192.168.0.52	5.38.112.212	MPTCP	86	38873 → 12345 [SYN] Seq=0 Win=42496 Len=0 MSS=1460 SACK_PERM=1 TSval=2434907695 TSecr=0 WS=512
3547	66.943815783	192.168.0.52	5.38.112.212	MPTCP	98	41764 → 5201 [PSH, ACK] Seq=38 Ack=2 Win=42496 Len=4 TSval=2434907695 TSecr=1270292073
**3548	66.951025150	5.38.112.212	192.168.0.52	TCP	54	12345 → 38873 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0**
3549	66.951115913	192.168.0.52	5.38.112.212	MPTCP	194	41764 → 5201 [PSH, ACK] Seq=42 Ack=2 Win=42496 Len=100 TSval=2434907702 TSecr=1270292073
3550	66.957025451	5.38.112.212	192.168.0.52	MPTCP	78	5201 → 41764 [ACK] Seq=2 Ack=142 Win=62720 Len=0 TSval=1270292086 TSecr=2434907695
3551	66.957067340	192.168.0.52	5.38.112.212	MPTCP	198	41764 → 5201 [PSH, ACK] Seq=142 Ack=2 Win=42496 Len=104 TSval=2434907708 TSecr=1270292086
3552	66.957103906	5.38.112.212	192.168.0.52	MPTCP	95	5201 → 41764 [PSH, ACK] Seq=2 Ack=142 Win=62720 Len=1 TSval=1270292086 TSecr=2434907695
3553	66.957338796	192.168.0.52	5.38.112.212	MPTCP	78	41772 → 5201 [SYN] Seq=0 Win=42340 Len=0 MSS=1460 SACK_PERM=1 TSval=2434907708 TSecr=0 WS=512
3554	66.963632852	5.38.112.212	192.168.0.52	MPTCP	86	5201 → 41772 [SYN, ACK] Seq=0 Ack=1 Win=62643 Len=0 MSS=1460 SACK_PERM=1 TSval=1270292093 TSecr=2434907708 WS=128
3555	66.963696106	192.168.0.52	5.38.112.212	MPTCP	86	41772 → 5201 [ACK] Seq=1 Ack=1 Win=42496 Len=0 TSval=2434907715 TSecr=1270292093
3556	66.963806293	192.168.0.52	5.38.112.212	MPTCP	127	41772 → 5201 [PSH, ACK] Seq=1 Ack=1 Win=42496 Len=37 TSval=2434907715 TSecr=1270292093
3557	66.969415121	5.38.112.212	192.168.0.52	MPTCP	86	[TCP Window Update] 5201 → 41772 [ACK] Seq=1 Ack=1 Win=62720 Len=0 TSval=1270292099 TSecr=2434907715
3558	66.969431544	192.168.0.52	5.38.112.212	MPTCP	78	41772 → 5201 [ACK] Seq=38 Ack=1 Win=42496 Len=0 TSval=2434907720 TSecr=1270292099
3559	66.969440281	5.38.112.212	192.168.0.52	MPTCP	78	5201 → 41772 [ACK] Seq=1 Ack=38 Win=62720 Len=0 TSval=1270292099 TSecr=2434907715
3560	66.969464162	192.168.0.52	5.38.112.212	MPTCP	86	49303 → 12345 [SYN] Seq=0 Win=42496 Len=0 MSS=1460 SACK_PERM=1 TSval=2434907720 TSecr=0 WS=512
**3561	66.976147257	5.38.112.212	192.168.0.52	TCP	54	12345 → 49303 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0**
3562	66.998854885	192.168.0.52	5.38.112.212	MPTCP	78	41764 → 5201 [ACK] Seq=246 Ack=3 Win=42496 Len=0 TSval=2434907750 TSecr=1270292086
3563	67.006231584	5.38.112.212	192.168.0.52	MPTCP	96	5201 → 41764 [PSH, ACK] Seq=3 Ack=246 Win=62720 Len=2 TSval=1270292134 TSecr=2434907708
3564	67.006246149	192.168.0.52	5.38.112.212	MPTCP	78	41764 → 5201 [ACK] Seq=246 Ack=5 Win=42496 Len=0 TSval=2434907757 TSecr=1270292134
3565	67.006273710	192.168.0.52	5.38.112.212	MPTCP	7210	41772 → 5201 [PSH, ACK] Seq=38 Ack=1 Win=42496 Len=7120 TSval=2434907757 TSecr=1270292099

@matttbe
Member

matttbe commented Feb 7, 2023

Hello,

Thank you @matttbe I tried the workaround. But, it doesn't work.

1. I tested by iperf3 that does not close listening socket. It can be clear to resolve NAT issue.

If your app doesn't close the listening socket, you don't need the workaround that forces the creation of a listening socket. All you need is to add the signal endpoint on the server using the public IP: it will send an ADD_ADDR with the public IP and the server should accept the MP_JOIN from the client, no?

2. In server, changed the loopback address from 127.0.0.1 to public IP

You should keep 127.0.0.1 but add a new one.

3. In server, add endpoint with public ip and port (ip mptcp endpoint add 5.38.112.212 port 12345 signal) is good!!

Yes, but the listening socket will only listen on the public IP, whereas the server will receive the packets as modified by the NAT, i.e. with the private IP as destination.
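The mismatch can be illustrated on loopback, with 127.0.0.2 playing the role of the public IP the socket is bound to and 127.0.0.1 the role of the private IP the NAT delivers to. This is only an analogy with plain TCP on Linux, and `can_connect` is a hypothetical helper.

```python
import socket


def can_connect(listen_ip, connect_ip):
    """Listen on listen_ip only, then try to connect to connect_ip on the same port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((listen_ip, 0))  # bound to one specific address, not 0.0.0.0
    listener.listen(1)
    port = listener.getsockname()[1]
    try:
        with socket.create_connection((connect_ip, port), timeout=2):
            return True
    except ConnectionRefusedError:
        # No socket listens on (connect_ip, port): the SYN is answered with a RST.
        return False
    finally:
        listener.close()


if __name__ == "__main__":
    print(can_connect("127.0.0.2", "127.0.0.2"))
    print(can_connect("127.0.0.2", "127.0.0.1"))
```

A listener bound to one address does not accept connections addressed to another one, which is why the proposal is to let the PM listen on the internal address (or 0.0.0.0) while announcing the exposed one.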

4. In client, add endpoint (192.168.0.52 id 1 fullmesh dev wlp0s20f3)
   But, server sends RST (twice)
   I tested client with single interface(single fullmesh endpoint), because I want to understand about fullmesh mechanism.
   I expected that the client fullmesh endpoint makes 2 subflow with server's application listening socket and signal endpoint(port 12345)
   But, it does no work.
   (Server: Kernel v6.0.0, Client: Kernel v6.2.0-rc4)

Client's wireshark

Do you have the packet trace instead? (Export the trace and zip it to attach it here.)
We don't have all the details.

@matttbe matttbe moved this to Needs triage in MPTCP Upstream: Future Feb 22, 2023
@geliangtang geliangtang added the pm path-manager label Aug 4, 2023
jenkins-tessares pushed a commit that referenced this issue Sep 18, 2023
Puranjay Mohan says:

====================
arm32, bpf: add support for cpuv4 insns

Changes in V2 -> V3
- Added comments at places where there could be confusion.
- In the patch for DIV64, fix the if-else case that would never run.
- In the same patch use a single instruction to POP caller saved regs.
- Add a patch to change maintainership of ARM32 BPF JIT.

Changes in V1 -> V2:
- Fix coding style issues.
- Don't use tmp variable for src in emit_ldsx_r() as it is redundant.
- Optimize emit_ldsx_r() when offset can fit in immediate.

Add the support for cpuv4 instructions for ARM32 BPF JIT. 64-bit division
was not supported earlier so this series adds 64-bit DIV, SDIV, MOD, SMOD
instructions as well.

This series needs any one of the patches from [1] to disable zero-extension
for BPF_MEMSX to support ldsx.

The relevant selftests have passed except ldsx_insn which needs fentry:

Tested on BeagleBone Black (ARMv7-A):

[root@alarm del]# echo 1 > /proc/sys/net/core/bpf_jit_enable
[root@alarm del]# ./test_progs -a verifier_sdiv,verifier_movsx,verifier_ldsx,verifier_gotol,verifier_bswap
#337/1   verifier_bswap/BSWAP, 16:OK
#337/2   verifier_bswap/BSWAP, 16 @unpriv:OK
#337/3   verifier_bswap/BSWAP, 32:OK
#337/4   verifier_bswap/BSWAP, 32 @unpriv:OK
#337/5   verifier_bswap/BSWAP, 64:OK
#337/6   verifier_bswap/BSWAP, 64 @unpriv:OK
#337     verifier_bswap:OK
#351/1   verifier_gotol/gotol, small_imm:OK
#351/2   verifier_gotol/gotol, small_imm @unpriv:OK
#351     verifier_gotol:OK
#359/1   verifier_ldsx/LDSX, S8:OK
#359/2   verifier_ldsx/LDSX, S8 @unpriv:OK
#359/3   verifier_ldsx/LDSX, S16:OK
#359/4   verifier_ldsx/LDSX, S16 @unpriv:OK
#359/5   verifier_ldsx/LDSX, S32:OK
#359/6   verifier_ldsx/LDSX, S32 @unpriv:OK
#359/7   verifier_ldsx/LDSX, S8 range checking, privileged:OK
#359/8   verifier_ldsx/LDSX, S16 range checking:OK
#359/9   verifier_ldsx/LDSX, S16 range checking @unpriv:OK
#359/10  verifier_ldsx/LDSX, S32 range checking:OK
#359/11  verifier_ldsx/LDSX, S32 range checking @unpriv:OK
#359     verifier_ldsx:OK
#370/1   verifier_movsx/MOV32SX, S8:OK
#370/2   verifier_movsx/MOV32SX, S8 @unpriv:OK
#370/3   verifier_movsx/MOV32SX, S16:OK
#370/4   verifier_movsx/MOV32SX, S16 @unpriv:OK
#370/5   verifier_movsx/MOV64SX, S8:OK
#370/6   verifier_movsx/MOV64SX, S8 @unpriv:OK
#370/7   verifier_movsx/MOV64SX, S16:OK
#370/8   verifier_movsx/MOV64SX, S16 @unpriv:OK
#370/9   verifier_movsx/MOV64SX, S32:OK
#370/10  verifier_movsx/MOV64SX, S32 @unpriv:OK
#370/11  verifier_movsx/MOV32SX, S8, range_check:OK
#370/12  verifier_movsx/MOV32SX, S8, range_check @unpriv:OK
#370/13  verifier_movsx/MOV32SX, S16, range_check:OK
#370/14  verifier_movsx/MOV32SX, S16, range_check @unpriv:OK
#370/15  verifier_movsx/MOV32SX, S16, range_check 2:OK
#370/16  verifier_movsx/MOV32SX, S16, range_check 2 @unpriv:OK
#370/17  verifier_movsx/MOV64SX, S8, range_check:OK
#370/18  verifier_movsx/MOV64SX, S8, range_check @unpriv:OK
#370/19  verifier_movsx/MOV64SX, S16, range_check:OK
#370/20  verifier_movsx/MOV64SX, S16, range_check @unpriv:OK
#370/21  verifier_movsx/MOV64SX, S32, range_check:OK
#370/22  verifier_movsx/MOV64SX, S32, range_check @unpriv:OK
#370/23  verifier_movsx/MOV64SX, S16, R10 Sign Extension:OK
#370/24  verifier_movsx/MOV64SX, S16, R10 Sign Extension @unpriv:OK
#370     verifier_movsx:OK
#382/1   verifier_sdiv/SDIV32, non-zero imm divisor, check 1:OK
#382/2   verifier_sdiv/SDIV32, non-zero imm divisor, check 1 @unpriv:OK
#382/3   verifier_sdiv/SDIV32, non-zero imm divisor, check 2:OK
#382/4   verifier_sdiv/SDIV32, non-zero imm divisor, check 2 @unpriv:OK
#382/5   verifier_sdiv/SDIV32, non-zero imm divisor, check 3:OK
#382/6   verifier_sdiv/SDIV32, non-zero imm divisor, check 3 @unpriv:OK
#382/7   verifier_sdiv/SDIV32, non-zero imm divisor, check 4:OK
#382/8   verifier_sdiv/SDIV32, non-zero imm divisor, check 4 @unpriv:OK
#382/9   verifier_sdiv/SDIV32, non-zero imm divisor, check 5:OK
#382/10  verifier_sdiv/SDIV32, non-zero imm divisor, check 5 @unpriv:OK
#382/11  verifier_sdiv/SDIV32, non-zero imm divisor, check 6:OK
#382/12  verifier_sdiv/SDIV32, non-zero imm divisor, check 6 @unpriv:OK
#382/13  verifier_sdiv/SDIV32, non-zero imm divisor, check 7:OK
#382/14  verifier_sdiv/SDIV32, non-zero imm divisor, check 7 @unpriv:OK
#382/15  verifier_sdiv/SDIV32, non-zero imm divisor, check 8:OK
#382/16  verifier_sdiv/SDIV32, non-zero imm divisor, check 8 @unpriv:OK
#382/17  verifier_sdiv/SDIV32, non-zero reg divisor, check 1:OK
#382/18  verifier_sdiv/SDIV32, non-zero reg divisor, check 1 @unpriv:OK
#382/19  verifier_sdiv/SDIV32, non-zero reg divisor, check 2:OK
#382/20  verifier_sdiv/SDIV32, non-zero reg divisor, check 2 @unpriv:OK
#382/21  verifier_sdiv/SDIV32, non-zero reg divisor, check 3:OK
#382/22  verifier_sdiv/SDIV32, non-zero reg divisor, check 3 @unpriv:OK
#382/23  verifier_sdiv/SDIV32, non-zero reg divisor, check 4:OK
#382/24  verifier_sdiv/SDIV32, non-zero reg divisor, check 4 @unpriv:OK
#382/25  verifier_sdiv/SDIV32, non-zero reg divisor, check 5:OK
#382/26  verifier_sdiv/SDIV32, non-zero reg divisor, check 5 @unpriv:OK
#382/27  verifier_sdiv/SDIV32, non-zero reg divisor, check 6:OK
#382/28  verifier_sdiv/SDIV32, non-zero reg divisor, check 6 @unpriv:OK
#382/29  verifier_sdiv/SDIV32, non-zero reg divisor, check 7:OK
#382/30  verifier_sdiv/SDIV32, non-zero reg divisor, check 7 @unpriv:OK
#382/31  verifier_sdiv/SDIV32, non-zero reg divisor, check 8:OK
#382/32  verifier_sdiv/SDIV32, non-zero reg divisor, check 8 @unpriv:OK
#382/33  verifier_sdiv/SDIV64, non-zero imm divisor, check 1:OK
#382/34  verifier_sdiv/SDIV64, non-zero imm divisor, check 1 @unpriv:OK
#382/35  verifier_sdiv/SDIV64, non-zero imm divisor, check 2:OK
#382/36  verifier_sdiv/SDIV64, non-zero imm divisor, check 2 @unpriv:OK
#382/37  verifier_sdiv/SDIV64, non-zero imm divisor, check 3:OK
#382/38  verifier_sdiv/SDIV64, non-zero imm divisor, check 3 @unpriv:OK
#382/39  verifier_sdiv/SDIV64, non-zero imm divisor, check 4:OK
#382/40  verifier_sdiv/SDIV64, non-zero imm divisor, check 4 @unpriv:OK
#382/41  verifier_sdiv/SDIV64, non-zero imm divisor, check 5:OK
#382/42  verifier_sdiv/SDIV64, non-zero imm divisor, check 5 @unpriv:OK
#382/43  verifier_sdiv/SDIV64, non-zero imm divisor, check 6:OK
#382/44  verifier_sdiv/SDIV64, non-zero imm divisor, check 6 @unpriv:OK
#382/45  verifier_sdiv/SDIV64, non-zero reg divisor, check 1:OK
#382/46  verifier_sdiv/SDIV64, non-zero reg divisor, check 1 @unpriv:OK
#382/47  verifier_sdiv/SDIV64, non-zero reg divisor, check 2:OK
#382/48  verifier_sdiv/SDIV64, non-zero reg divisor, check 2 @unpriv:OK
#382/49  verifier_sdiv/SDIV64, non-zero reg divisor, check 3:OK
#382/50  verifier_sdiv/SDIV64, non-zero reg divisor, check 3 @unpriv:OK
#382/51  verifier_sdiv/SDIV64, non-zero reg divisor, check 4:OK
#382/52  verifier_sdiv/SDIV64, non-zero reg divisor, check 4 @unpriv:OK
#382/53  verifier_sdiv/SDIV64, non-zero reg divisor, check 5:OK
#382/54  verifier_sdiv/SDIV64, non-zero reg divisor, check 5 @unpriv:OK
#382/55  verifier_sdiv/SDIV64, non-zero reg divisor, check 6:OK
#382/56  verifier_sdiv/SDIV64, non-zero reg divisor, check 6 @unpriv:OK
#382/57  verifier_sdiv/SMOD32, non-zero imm divisor, check 1:OK
#382/58  verifier_sdiv/SMOD32, non-zero imm divisor, check 1 @unpriv:OK
#382/59  verifier_sdiv/SMOD32, non-zero imm divisor, check 2:OK
#382/60  verifier_sdiv/SMOD32, non-zero imm divisor, check 2 @unpriv:OK
#382/61  verifier_sdiv/SMOD32, non-zero imm divisor, check 3:OK
#382/62  verifier_sdiv/SMOD32, non-zero imm divisor, check 3 @unpriv:OK
#382/63  verifier_sdiv/SMOD32, non-zero imm divisor, check 4:OK
#382/64  verifier_sdiv/SMOD32, non-zero imm divisor, check 4 @unpriv:OK
#382/65  verifier_sdiv/SMOD32, non-zero imm divisor, check 5:OK
#382/66  verifier_sdiv/SMOD32, non-zero imm divisor, check 5 @unpriv:OK
#382/67  verifier_sdiv/SMOD32, non-zero imm divisor, check 6:OK
#382/68  verifier_sdiv/SMOD32, non-zero imm divisor, check 6 @unpriv:OK
#382/69  verifier_sdiv/SMOD32, non-zero reg divisor, check 1:OK
#382/70  verifier_sdiv/SMOD32, non-zero reg divisor, check 1 @unpriv:OK
#382/71  verifier_sdiv/SMOD32, non-zero reg divisor, check 2:OK
#382/72  verifier_sdiv/SMOD32, non-zero reg divisor, check 2 @unpriv:OK
#382/73  verifier_sdiv/SMOD32, non-zero reg divisor, check 3:OK
#382/74  verifier_sdiv/SMOD32, non-zero reg divisor, check 3 @unpriv:OK
#382/75  verifier_sdiv/SMOD32, non-zero reg divisor, check 4:OK
#382/76  verifier_sdiv/SMOD32, non-zero reg divisor, check 4 @unpriv:OK
#382/77  verifier_sdiv/SMOD32, non-zero reg divisor, check 5:OK
#382/78  verifier_sdiv/SMOD32, non-zero reg divisor, check 5 @unpriv:OK
#382/79  verifier_sdiv/SMOD32, non-zero reg divisor, check 6:OK
#382/80  verifier_sdiv/SMOD32, non-zero reg divisor, check 6 @unpriv:OK
#382/81  verifier_sdiv/SMOD64, non-zero imm divisor, check 1:OK
#382/82  verifier_sdiv/SMOD64, non-zero imm divisor, check 1 @unpriv:OK
#382/83  verifier_sdiv/SMOD64, non-zero imm divisor, check 2:OK
#382/84  verifier_sdiv/SMOD64, non-zero imm divisor, check 2 @unpriv:OK
#382/85  verifier_sdiv/SMOD64, non-zero imm divisor, check 3:OK
#382/86  verifier_sdiv/SMOD64, non-zero imm divisor, check 3 @unpriv:OK
#382/87  verifier_sdiv/SMOD64, non-zero imm divisor, check 4:OK
#382/88  verifier_sdiv/SMOD64, non-zero imm divisor, check 4 @unpriv:OK
#382/89  verifier_sdiv/SMOD64, non-zero imm divisor, check 5:OK
#382/90  verifier_sdiv/SMOD64, non-zero imm divisor, check 5 @unpriv:OK
#382/91  verifier_sdiv/SMOD64, non-zero imm divisor, check 6:OK
#382/92  verifier_sdiv/SMOD64, non-zero imm divisor, check 6 @unpriv:OK
#382/93  verifier_sdiv/SMOD64, non-zero imm divisor, check 7:OK
#382/94  verifier_sdiv/SMOD64, non-zero imm divisor, check 7 @unpriv:OK
#382/95  verifier_sdiv/SMOD64, non-zero imm divisor, check 8:OK
#382/96  verifier_sdiv/SMOD64, non-zero imm divisor, check 8 @unpriv:OK
#382/97  verifier_sdiv/SMOD64, non-zero reg divisor, check 1:OK
#382/98  verifier_sdiv/SMOD64, non-zero reg divisor, check 1 @unpriv:OK
#382/99  verifier_sdiv/SMOD64, non-zero reg divisor, check 2:OK
#382/100 verifier_sdiv/SMOD64, non-zero reg divisor, check 2 @unpriv:OK
#382/101 verifier_sdiv/SMOD64, non-zero reg divisor, check 3:OK
#382/102 verifier_sdiv/SMOD64, non-zero reg divisor, check 3 @unpriv:OK
#382/103 verifier_sdiv/SMOD64, non-zero reg divisor, check 4:OK
#382/104 verifier_sdiv/SMOD64, non-zero reg divisor, check 4 @unpriv:OK
#382/105 verifier_sdiv/SMOD64, non-zero reg divisor, check 5:OK
#382/106 verifier_sdiv/SMOD64, non-zero reg divisor, check 5 @unpriv:OK
#382/107 verifier_sdiv/SMOD64, non-zero reg divisor, check 6:OK
#382/108 verifier_sdiv/SMOD64, non-zero reg divisor, check 6 @unpriv:OK
#382/109 verifier_sdiv/SMOD64, non-zero reg divisor, check 7:OK
#382/110 verifier_sdiv/SMOD64, non-zero reg divisor, check 7 @unpriv:OK
#382/111 verifier_sdiv/SMOD64, non-zero reg divisor, check 8:OK
#382/112 verifier_sdiv/SMOD64, non-zero reg divisor, check 8 @unpriv:OK
#382/113 verifier_sdiv/SDIV32, zero divisor:OK
#382/114 verifier_sdiv/SDIV32, zero divisor @unpriv:OK
#382/115 verifier_sdiv/SDIV64, zero divisor:OK
#382/116 verifier_sdiv/SDIV64, zero divisor @unpriv:OK
#382/117 verifier_sdiv/SMOD32, zero divisor:OK
#382/118 verifier_sdiv/SMOD32, zero divisor @unpriv:OK
#382/119 verifier_sdiv/SMOD64, zero divisor:OK
#382/120 verifier_sdiv/SMOD64, zero divisor @unpriv:OK
#382     verifier_sdiv:OK
Summary: 5/163 PASSED, 0 SKIPPED, 0 FAILED

Because the selftests do not compile for 32-bit architectures without
modifications (long is 32 bits wide there), I have added new tests for
the cpuv4 instructions to lib/test_bpf.c; all of them pass:

test_bpf: Summary: 1052 PASSED, 0 FAILED, [891/1040 JIT'ed]
test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed]
test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED

[1] https://lore.kernel.org/all/mb61p5y4u3ptd.fsf@amazon.com/
====================

Link: https://lore.kernel.org/r/20230907230550.1417590-1-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
matttbe pushed a commit that referenced this issue Jul 10, 2024
Add a test case which replaces an active ingress qdisc while keeping the
miniq intact during the transition period to the new clsact qdisc.

  # ./vmtest.sh -- ./test_progs -t tc_link
  [...]
  ./test_progs -t tc_link
  [    3.412871] bpf_testmod: loading out-of-tree module taints kernel.
  [    3.413343] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #332     tc_links_after:OK
  #333     tc_links_append:OK
  #334     tc_links_basic:OK
  #335     tc_links_before:OK
  #336     tc_links_chain_classic:OK
  #337     tc_links_chain_mixed:OK
  #338     tc_links_dev_chain0:OK
  #339     tc_links_dev_cleanup:OK
  #340     tc_links_dev_mixed:OK
  #341     tc_links_ingress:OK
  #342     tc_links_invalid:OK
  #343     tc_links_prepend:OK
  #344     tc_links_replace:OK
  #345     tc_links_revision:OK
  Summary: 14/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240708133130.11609-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Projects
Status: Needs triage
Development

No branches or pull requests

4 participants