Issue Report
Bug
When spawning Docker containers at a high rate in bridge network mode, some containers end up with no network connectivity. Coming from moby/moby#27808, where I was asked to report the bug against CoreOS.
Environment
VM running on VMware
Steps to reproduce the issue:
1. Create a network with a /16 address space, in bridge mode (see the example command after these steps)
2. Spawn containers at a high rate and try to reach the network (ping)
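For step 1, the network can be created along these lines (a sketch: the name "taskers" comes from the script below, and the 173.77.0.0/16 subnet is inferred from the container addresses in the outputs rather than stated in the report):
docker network create --driver bridge --subnet 173.77.0.0/16 taskers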
I can reproduce the issue with the following script; some container images fare better, but all fail at some point. Also, I cannot reproduce the issue in host network mode.
for num in {1..300}
do
  docker run --network taskers --rm ubuntu:14.04 sh -c "ping -c 1 173.36.21.105; arp -n; ifconfig" | tee output_$num.txt &
done
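To tally how many of the 300 runs failed, the output files can be checked afterwards for total packet loss (a minimal sketch, assuming the output_$num.txt files written by the loop above):
# List the output files whose single ping lost 100% of its packets, then count them.
grep -l "100% packet loss" output_*.txt | wc -l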
Describe the results you received:
About 1% of the containers fail, with no networking.
Here are some command outputs taken from the script above:
PING 173.36.21.105 (173.36.21.105) 56(84) bytes of data.
From 173.77.2.18 icmp_seq=1 Destination Host Unreachable
--- 173.36.21.105 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
Address HWtype HWaddress Flags Mask Iface
173.77.0.1 (incomplete) eth0
eth0 Link encap:Ethernet HWaddr 02:42:ad:4d:02:12
inet addr:173.77.2.18 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:adff:fe4d:212/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:5 errors:0 dropped:0 overruns:0 frame:0
TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:438 (438.0 B) TX bytes:1072 (1.0 KB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:1 errors:0 dropped:0 overruns:0 frame:0
TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:112 (112.0 B) TX bytes:112 (112.0 B)
Describe the results you expected:
No network failures: pings should go through and ARP should resolve.
Additional information you deem important (e.g. issue happens only occasionally):
The issue only occurs for about 1% of the pods when the system is under load, spawning and deleting lots of containers.
It doesn't occur in host network mode.
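For comparison, the host-mode check can be run with the same image and target (a sketch of that variant; the report only states the result):
docker run --network host --rm ubuntu:14.04 sh -c "ping -c 1 173.36.21.105"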
Output of docker version:
$ docker version
Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   7a86f89
 Built:
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   7a86f89
 Built:
 OS/Arch:      linux/amd64
I've run this a few times to ping the gateway with the default bridge network driver, and there have been no ping failures.
for ((i=0; i<200; i++)) ; do (docker run --rm busybox ping -c 1 172.17.0.1 &> fail.$i && rm -f fail.$i) & done
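Each failed run leaves its fail.$i file behind (successful runs delete theirs), so counting the leftovers after the loop finishes gives the failure count (a one-line sketch, not part of the original comment):
ls fail.* 2>/dev/null | wc -l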
How did you set up your custom network? Can you confirm whether the failed containers are attached to the bridge? (For example, ip link should show master docker0 on the host-side veth interface.)
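One way to check that from the host (a sketch; the throwaway sleep container, the taskers network name, and the variable names are illustrative, not from the report):
# Start a container and find the interface index of the host-side peer of its eth0.
CID=$(docker run -d --network taskers ubuntu:14.04 sleep 300)
PEER_INDEX=$(docker exec "$CID" cat /sys/class/net/eth0/iflink)
# The matching host interface should list "master br-<network id>" (or docker0) if it is attached to the bridge.
ip -o link | awk -F': ' -v idx="$PEER_INDEX" '$1 == idx'
docker rm -f "$CID"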
It happens that the node I used for the repro had just rebooted, and I couldn't reproduce the issue easily: I had to re-run the script several times (maybe 5,000 container runs) before hitting the first failures.
Nevertheless, here is the ip link output from a container that failed:
PING 173.36.21.105 (173.36.21.105) 56(84) bytes of data.
From 173.77.4.23 icmp_seq=1 Destination Host Unreachable
--- 173.36.21.105 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
8274: eth0@if8275: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:ad:4d:04:17 brd ff:ff:ff:ff:ff:ff
Here is one that worked:
PING 173.36.21.105 (173.36.21.105) 56(84) bytes of data.
64 bytes from 173.36.21.105: icmp_seq=1 ttl=61 time=3.71 ms
--- 173.36.21.105 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 3.716/3.716/3.716/0.000 ms
8192: eth0@if8193: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:ad:4d:03:ed brd ff:ff:ff:ff:ff:ff
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
Okay, I eventually managed to reproduce this with a fresh CoreOS system. However, when using an image built with a fix I've already proposed to systemd, I could not reproduce the issue (after nearly 20,000 containers spawned). We are waiting on upstream to decide on the configuration option to use in systemd/systemd#4228, but we can backport it to fix this issue when a decision is made.