Recent test failures in CI (apply caps: operation not permitted) #42892

Closed
thaJeztah opened this issue Sep 28, 2021 · 5 comments · Fixed by #42933

@thaJeztah
Member

thaJeztah commented Sep 28, 2021

Started to see these tests fail; not sure whether something changed in our code, or whether the Jenkins agents were updated, leading to this issue;

e.g. https://ci-next.docker.com/public/blue/organizations/jenkins/moby/detail/PR-42888/5/pipeline/

=== RUN   TestContainerVolumesMountedAsShared
    mounts_linux_test.go:313: assertion failed: error is not nil: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: apply caps: operation not permitted: unknown
--- FAIL: TestContainerVolumesMountedAsShared (0.53s)

=== RUN   TestCgroupNamespacesRunPrivileged
    run_cgroupns_linux_test.go:26: assertion failed: error is not nil: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: apply caps: operation not permitted: unknown
--- FAIL: TestCgroupNamespacesRunPrivileged (1.35s)

=== RUN   TestCgroupNamespacesRunPrivilegedAndPrivate
    run_cgroupns_linux_test.go:26: assertion failed: error is not nil: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: apply caps: operation not permitted: unknown
--- FAIL: TestCgroupNamespacesRunPrivilegedAndPrivate (1.36s)

And this one is failing on cgroups v2:

=== RUN   TestHealthKillContainer
    health_test.go:62: timeout hit after 10s: waiting for container to become healthy
--- FAIL: TestHealthKillContainer (12.47s)

Failing on (arm64, kernel 5.11):

 + docker version
 Client: Docker Engine - Community
  Version:           20.10.8
  API version:       1.41
  Go version:        go1.16.6
  Git commit:        3967b7d
  Built:             Fri Jul 30 19:55:05 2021
  OS/Arch:           linux/arm64
  Context:           default
  Experimental:      true

 Server: Docker Engine - Community
  Engine:
   Version:          20.10.8
   API version:      1.41 (minimum version 1.12)
   Go version:       go1.16.6
   Git commit:       75249d8
   Built:            Fri Jul 30 19:53:13 2021
   OS/Arch:          linux/arm64
   Experimental:     true
  containerd:
   Version:          1.4.9
   GitCommit:        e25210fe30a0a703442421b0f60afac609f950a3
  runc:
   Version:          1.0.1
   GitCommit:        v1.0.1-0-g4144b63
  docker-init:
   Version:          0.19.0
   GitCommit:        de40ad0
 + docker info
 Client:
  Context:    default
  Debug Mode: false
  Plugins:
   app: Docker App (Docker Inc., v0.9.1-beta3)
   buildx: Build with BuildKit (Docker Inc., v0.6.1-docker)

 Server:
  Containers: 0
   Running: 0
   Paused: 0
   Stopped: 0
  Images: 3
  Server Version: 20.10.8
  Storage Driver: overlay2
   Backing Filesystem: extfs
   Supports d_type: true
   Native Overlay Diff: true
   userxattr: false
  Logging Driver: json-file
  Cgroup Driver: cgroupfs
  Cgroup Version: 1
  Plugins:
   Volume: local
   Network: bridge host ipvlan macvlan null overlay
   Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
  Swarm: inactive
  Runtimes: runc io.containerd.runc.v2 io.containerd.runtime.v1.linux
  Default Runtime: runc
  Init Binary: docker-init
  containerd version: e25210fe30a0a703442421b0f60afac609f950a3
  runc version: v1.0.1-0-g4144b63
  init version: de40ad0
  Security Options:
   apparmor
   seccomp
    Profile: default
  Kernel Version: 5.11.0-1017-aws
  Operating System: Ubuntu 20.04.3 LTS
  OSType: linux
  Architecture: aarch64
  CPUs: 4
  Total Memory: 7.488GiB
  Name: ip-10-100-109-202
  ID: DIKC:UTS5:M75Q:2MM2:LS2J:VMJT:INXD:WG5D:FSNZ:SFK6:5RTZ:XAII
  Docker Root Dir: /var/lib/docker
  Debug Mode: false
  Registry: https://index.docker.io/v1/
  Labels:
  Experimental: true
  Insecure Registries:
   127.0.0.0/8
  Live Restore Enabled: true

 + echo check-config.sh version: 2b0755b936416834e14208c6c37b36977e67ea35
 check-config.sh version: 2b0755b936416834e14208c6c37b36977e67ea35
 + curl -fsSL -o /home/ubuntu/workspace/moby_PR-42890/check-config.sh https://raw.githubusercontent.com/moby/moby/2b0755b936416834e14208c6c37b36977e67ea35/contrib/check-config.sh
 + bash /home/ubuntu/workspace/moby_PR-42890/check-config.sh
 warning: /proc/config.gz does not exist, searching other paths for kernel config ...
 info: reading kernel config from /boot/config-5.11.0-1017-aws ...
 
 Generally Necessary:
 - cgroup hierarchy: properly mounted [/sys/fs/cgroup]
 - apparmor: enabled and tools installed
 - CONFIG_NAMESPACES: enabled
 - CONFIG_NET_NS: enabled
 - CONFIG_PID_NS: enabled
 - CONFIG_IPC_NS: enabled
 - CONFIG_UTS_NS: enabled
 - CONFIG_CGROUPS: enabled
 - CONFIG_CGROUP_CPUACCT: enabled
 - CONFIG_CGROUP_DEVICE: enabled
 - CONFIG_CGROUP_FREEZER: enabled
 - CONFIG_CGROUP_SCHED: enabled
 - CONFIG_CPUSETS: enabled
 - CONFIG_MEMCG: enabled
 - CONFIG_KEYS: enabled
 - CONFIG_VETH: enabled (as module)
 - CONFIG_BRIDGE: enabled (as module)
 - CONFIG_BRIDGE_NETFILTER: enabled (as module)
 - CONFIG_IP_NF_FILTER: enabled (as module)
 - CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
 - CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled (as module)
 - CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
 - CONFIG_NETFILTER_XT_MATCH_IPVS: enabled (as module)
 - CONFIG_NETFILTER_XT_MARK: enabled (as module)
 - CONFIG_IP_NF_NAT: enabled (as module)
 - CONFIG_NF_NAT: enabled (as module)
 - CONFIG_POSIX_MQUEUE: enabled
 
 Optional Features:
 - CONFIG_USER_NS: enabled
 - CONFIG_SECCOMP: enabled
 - CONFIG_SECCOMP_FILTER: enabled
 - CONFIG_CGROUP_PIDS: enabled
 - CONFIG_MEMCG_SWAP: enabled
     (cgroup swap accounting is currently enabled)
 - CONFIG_BLK_CGROUP: enabled
 - CONFIG_BLK_DEV_THROTTLING: enabled
 - CONFIG_CGROUP_PERF: enabled
 - CONFIG_CGROUP_HUGETLB: enabled
 - CONFIG_NET_CLS_CGROUP: enabled (as module)
 - CONFIG_CGROUP_NET_PRIO: enabled
 - CONFIG_CFS_BANDWIDTH: enabled
 - CONFIG_FAIR_GROUP_SCHED: enabled
 - CONFIG_RT_GROUP_SCHED: missing
 - CONFIG_IP_NF_TARGET_REDIRECT: enabled (as module)
 - CONFIG_IP_VS: enabled (as module)
 - CONFIG_IP_VS_NFCT: enabled
 - CONFIG_IP_VS_PROTO_TCP: enabled
 - CONFIG_IP_VS_PROTO_UDP: enabled
 - CONFIG_IP_VS_RR: enabled (as module)
 - CONFIG_SECURITY_SELINUX: enabled
 - CONFIG_SECURITY_APPARMOR: enabled
 - CONFIG_EXT4_FS: enabled
 - CONFIG_EXT4_FS_POSIX_ACL: enabled
 - CONFIG_EXT4_FS_SECURITY: enabled
 - Network Drivers:
   - "overlay":
     - CONFIG_VXLAN: enabled (as module)
     - CONFIG_BRIDGE_VLAN_FILTERING: enabled
       Optional (for encrypted networks):
       - CONFIG_CRYPTO: enabled
       - CONFIG_CRYPTO_AEAD: enabled
       - CONFIG_CRYPTO_GCM: enabled
       - CONFIG_CRYPTO_SEQIV: enabled
       - CONFIG_CRYPTO_GHASH: enabled
       - CONFIG_XFRM: enabled
       - CONFIG_XFRM_USER: enabled (as module)
       - CONFIG_XFRM_ALGO: enabled (as module)
       - CONFIG_INET_ESP: enabled (as module)
   - "ipvlan":
     - CONFIG_IPVLAN: enabled (as module)
   - "macvlan":
     - CONFIG_MACVLAN: enabled (as module)
     - CONFIG_DUMMY: enabled (as module)
   - "ftp,tftp client in container":
     - CONFIG_NF_NAT_FTP: enabled (as module)
     - CONFIG_NF_CONNTRACK_FTP: enabled (as module)
     - CONFIG_NF_NAT_TFTP: enabled (as module)
     - CONFIG_NF_CONNTRACK_TFTP: enabled (as module)
 - Storage Drivers:
   - "aufs":
     - CONFIG_AUFS_FS: missing
   - "btrfs":
     - CONFIG_BTRFS_FS: enabled (as module)
     - CONFIG_BTRFS_FS_POSIX_ACL: enabled
   - "devicemapper":
     - CONFIG_BLK_DEV_DM: enabled
     - CONFIG_DM_THIN_PROVISIONING: enabled (as module)
   - "overlay":
     - CONFIG_OVERLAY_FS: enabled (as module)
   - "zfs":
     - /dev/zfs: present
     - zfs command: missing
     - zpool command: missing
 
 Limits:
 - /proc/sys/kernel/keys/root_maxkeys: 1000000
 
 + true

Passing on (amd64, kernel 5.4):

 + docker version
 Client: Docker Engine - Community
  Version:           20.10.8
  API version:       1.41
  Go version:        go1.16.6
  Git commit:        3967b7d
  Built:             Fri Jul 30 19:54:08 2021
  OS/Arch:           linux/amd64
  Context:           default
  Experimental:      true

 Server: Docker Engine - Community
  Engine:
   Version:          20.10.8
   API version:      1.41 (minimum version 1.12)
   Go version:       go1.16.6
   Git commit:       75249d8
   Built:            Fri Jul 30 19:52:16 2021
   OS/Arch:          linux/amd64
   Experimental:     true
  containerd:
   Version:          1.4.9
   GitCommit:        e25210fe30a0a703442421b0f60afac609f950a3
  runc:
   Version:          1.0.1
   GitCommit:        v1.0.1-0-g4144b63
  docker-init:
   Version:          0.19.0
   GitCommit:        de40ad0
 + docker info
 Client:
  Context:    default
  Debug Mode: false
  Plugins:
   app: Docker App (Docker Inc., v0.9.1-beta3)
   buildx: Build with BuildKit (Docker Inc., v0.6.1-docker)
   scan: Docker Scan (Docker Inc., v0.8.0)

 Server:
  Containers: 0
   Running: 0
   Paused: 0
   Stopped: 0
  Images: 2
  Server Version: 20.10.8
  Storage Driver: overlay2
   Backing Filesystem: extfs
   Supports d_type: true
   Native Overlay Diff: true
   userxattr: false
  Logging Driver: json-file
  Cgroup Driver: cgroupfs
  Cgroup Version: 1
  Plugins:
   Volume: local
   Network: bridge host ipvlan macvlan null overlay
   Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
  Swarm: inactive
  Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
  Default Runtime: runc
  Init Binary: docker-init
  containerd version: e25210fe30a0a703442421b0f60afac609f950a3
  runc version: v1.0.1-0-g4144b63
  init version: de40ad0
  Security Options:
   apparmor
   seccomp
    Profile: default
  Kernel Version: 5.4.0-1057-aws
  Operating System: Ubuntu 18.04.6 LTS
  OSType: linux
  Architecture: x86_64
  CPUs: 2
  Total Memory: 7.487GiB
  Name: ip-10-100-50-184
  ID: 57XX:XYXI:POD2:3T6U:6STX:HEQ4:FLYX:4CFS:5EAQ:GF3U:DBKY:RKIJ
  Docker Root Dir: /var/lib/docker
  Debug Mode: false
  Registry: https://index.docker.io/v1/
  Labels:
  Experimental: true
  Insecure Registries:
   127.0.0.0/8
  Live Restore Enabled: true

 WARNING: No swap limit support
 + echo check-config.sh version: 2b0755b936416834e14208c6c37b36977e67ea35
 check-config.sh version: 2b0755b936416834e14208c6c37b36977e67ea35
 + curl -fsSL -o /home/ubuntu/workspace/moby_PR-42890/check-config.sh https://raw.githubusercontent.com/moby/moby/2b0755b936416834e14208c6c37b36977e67ea35/contrib/check-config.sh
 + bash /home/ubuntu/workspace/moby_PR-42890/check-config.sh
 warning: /proc/config.gz does not exist, searching other paths for kernel config ...
 info: reading kernel config from /boot/config-5.4.0-1057-aws ...

 Generally Necessary:
 - cgroup hierarchy: properly mounted [/sys/fs/cgroup]
 - apparmor: enabled and tools installed
 - CONFIG_NAMESPACES: enabled
 - CONFIG_NET_NS: enabled
 - CONFIG_PID_NS: enabled
 - CONFIG_IPC_NS: enabled
 - CONFIG_UTS_NS: enabled
 - CONFIG_CGROUPS: enabled
 - CONFIG_CGROUP_CPUACCT: enabled
 - CONFIG_CGROUP_DEVICE: enabled
 - CONFIG_CGROUP_FREEZER: enabled
 - CONFIG_CGROUP_SCHED: enabled
 - CONFIG_CPUSETS: enabled
 - CONFIG_MEMCG: enabled
 - CONFIG_KEYS: enabled
 - CONFIG_VETH: enabled (as module)
 - CONFIG_BRIDGE: enabled (as module)
 - CONFIG_BRIDGE_NETFILTER: enabled (as module)
 - CONFIG_IP_NF_FILTER: enabled (as module)
 - CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
 - CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled (as module)
 - CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
 - CONFIG_NETFILTER_XT_MATCH_IPVS: enabled (as module)
 - CONFIG_NETFILTER_XT_MARK: enabled (as module)
 - CONFIG_IP_NF_NAT: enabled (as module)
 - CONFIG_NF_NAT: enabled (as module)
 - CONFIG_POSIX_MQUEUE: enabled

 Optional Features:
 - CONFIG_USER_NS: enabled
 - CONFIG_SECCOMP: enabled
 - CONFIG_SECCOMP_FILTER: enabled
 - CONFIG_CGROUP_PIDS: enabled
 - CONFIG_MEMCG_SWAP: enabled
 - CONFIG_MEMCG_SWAP_ENABLED: missing
     (cgroup swap accounting is currently not enabled, you can enable it by setting boot option "swapaccount=1")
 - CONFIG_BLK_CGROUP: enabled
 - CONFIG_BLK_DEV_THROTTLING: enabled
 - CONFIG_CGROUP_PERF: enabled
 - CONFIG_CGROUP_HUGETLB: enabled
 - CONFIG_NET_CLS_CGROUP: enabled (as module)
 - CONFIG_CGROUP_NET_PRIO: enabled
 - CONFIG_CFS_BANDWIDTH: enabled
 - CONFIG_FAIR_GROUP_SCHED: enabled
 - CONFIG_RT_GROUP_SCHED: missing
 - CONFIG_IP_NF_TARGET_REDIRECT: enabled (as module)
 - CONFIG_IP_VS: enabled (as module)
 - CONFIG_IP_VS_NFCT: enabled
 - CONFIG_IP_VS_PROTO_TCP: enabled
 - CONFIG_IP_VS_PROTO_UDP: enabled
 - CONFIG_IP_VS_RR: enabled (as module)
 - CONFIG_SECURITY_SELINUX: enabled
 - CONFIG_SECURITY_APPARMOR: enabled
 - CONFIG_EXT4_FS: enabled
 - CONFIG_EXT4_FS_POSIX_ACL: enabled
 - CONFIG_EXT4_FS_SECURITY: enabled
 - Network Drivers:
   - "overlay":
     - CONFIG_VXLAN: enabled (as module)
     - CONFIG_BRIDGE_VLAN_FILTERING: enabled
       Optional (for encrypted networks):
       - CONFIG_CRYPTO: enabled
       - CONFIG_CRYPTO_AEAD: enabled
       - CONFIG_CRYPTO_GCM: enabled
       - CONFIG_CRYPTO_SEQIV: enabled
       - CONFIG_CRYPTO_GHASH: enabled
       - CONFIG_XFRM: enabled
       - CONFIG_XFRM_USER: enabled (as module)
       - CONFIG_XFRM_ALGO: enabled (as module)
       - CONFIG_INET_ESP: enabled (as module)
   - "ipvlan":
     - CONFIG_IPVLAN: enabled (as module)
   - "macvlan":
     - CONFIG_MACVLAN: enabled (as module)
     - CONFIG_DUMMY: enabled (as module)
   - "ftp,tftp client in container":
     - CONFIG_NF_NAT_FTP: enabled (as module)
     - CONFIG_NF_CONNTRACK_FTP: enabled (as module)
     - CONFIG_NF_NAT_TFTP: enabled (as module)
     - CONFIG_NF_CONNTRACK_TFTP: enabled (as module)
 - Storage Drivers:
   - "aufs":
     - CONFIG_AUFS_FS: enabled (as module)
   - "btrfs":
     - CONFIG_BTRFS_FS: enabled (as module)
     - CONFIG_BTRFS_FS_POSIX_ACL: enabled
   - "devicemapper":
     - CONFIG_BLK_DEV_DM: enabled
     - CONFIG_DM_THIN_PROVISIONING: enabled (as module)
   - "overlay":
     - CONFIG_OVERLAY_FS: enabled (as module)
   - "zfs":
     - /dev/zfs: present
     - zfs command: missing
     - zpool command: missing

 Limits:
 - /proc/sys/kernel/keys/root_maxkeys: 1000000

 + true

Also failing (cgroup v2, kernel 5.11):

 + docker version
 Client: Docker Engine - Community
  Version:           20.10.8
  API version:       1.41
  Go version:        go1.16.6
  Git commit:        3967b7d
  Built:             Fri Jul 30 19:54:27 2021
  OS/Arch:           linux/amd64
  Context:           default
  Experimental:      true

 Server: Docker Engine - Community
  Engine:
   Version:          20.10.8
   API version:      1.41 (minimum version 1.12)
   Go version:       go1.16.6
   Git commit:       75249d8
   Built:            Fri Jul 30 19:52:33 2021
   OS/Arch:          linux/amd64
   Experimental:     true
  containerd:
   Version:          1.4.9
   GitCommit:        e25210fe30a0a703442421b0f60afac609f950a3
  runc:
   Version:          1.0.1
   GitCommit:        v1.0.1-0-g4144b63
  docker-init:
   Version:          0.19.0
   GitCommit:        de40ad0
 + docker info
 Client:
  Context:    default
  Debug Mode: false
  Plugins:
   app: Docker App (Docker Inc., v0.9.1-beta3)
   buildx: Build with BuildKit (Docker Inc., v0.6.1-docker)
   scan: Docker Scan (Docker Inc., v0.8.0)
 
 Server:
  Containers: 0
   Running: 0
   Paused: 0
   Stopped: 0
  Images: 0
  Server Version: 20.10.8
  Storage Driver: overlay2
   Backing Filesystem: extfs
   Supports d_type: true
   Native Overlay Diff: true
   userxattr: false
  Logging Driver: json-file
  Cgroup Driver: systemd
  Cgroup Version: 2
  Plugins:
   Volume: local
   Network: bridge host ipvlan macvlan null overlay
   Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
  Swarm: inactive
  Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
  Default Runtime: runc
  Init Binary: docker-init
  containerd version: e25210fe30a0a703442421b0f60afac609f950a3
  runc version: v1.0.1-0-g4144b63
  init version: de40ad0
  Security Options:
   apparmor
   seccomp
    Profile: default
   cgroupns
  Kernel Version: 5.11.0-1017-aws
  Operating System: Ubuntu 20.04.3 LTS
  OSType: linux
  Architecture: x86_64
  CPUs: 2
  Total Memory: 7.564GiB
  Name: ip-10-100-94-68
  ID: VD2N:KFQR:TIUN:GCTK:Z7MW:SXOL:X7RA:5WVK:DVKW:4H4I:D5PW:2WY5
  Docker Root Dir: /var/lib/docker
  Debug Mode: false
  Registry: https://index.docker.io/v1/
  Labels:
  Experimental: true
  Insecure Registries:
   127.0.0.0/8
  Live Restore Enabled: true

@thaJeztah
Member Author

thaJeztah commented Sep 28, 2021

Looks like it may be related to Ubuntu 20.04 / kernel 5.11.

Diff between "passing" and "arm64";

diff --git a/./test-pass.txt b/./test-fail.txt
index 0712495..6ea9559 100644
--- a/./test-pass.txt
+++ b/./test-fail.txt
@@ -8,7 +8,7 @@
   Go version:        go1.16.6
   Git commit:        3967b7d
   Built:             REDACTED
-  OS/Arch:           linux/amd64
+  OS/Arch:           linux/arm64
   Context:           default
   Experimental:      true
 
@@ -19,7 +19,7 @@
    Go version:       go1.16.6
    Git commit:       75249d8
    Built:            REDACTED
-   OS/Arch:          linux/amd64
+   OS/Arch:          linux/arm64
    Experimental:     true
   containerd:
    Version:          1.4.9
@@ -40,7 +40,6 @@
   Plugins:
    app: Docker App (Docker Inc., v0.9.1-beta3)
    buildx: Build with BuildKit (Docker Inc., v0.6.1-docker)
-   scan: Docker Scan (Docker Inc., v0.8.0)
 
  Server:
   Containers: 0
@@ -72,11 +71,11 @@
    apparmor
    seccomp
     Profile: default
-  Kernel Version: 5.4.0-1057-aws
-  Operating System: Ubuntu 18.04.6 LTS
+  Kernel Version: 5.11.0-1017-aws
+  Operating System: Ubuntu 20.04.3 LTS
   OSType: linux
-  Architecture: x86_64
-  CPUs: 2
+  Architecture: aarch64
+  CPUs: 4
   Total Memory: 7.488GiB
   Name: REDACTED
   ID: REDACTED
@@ -89,7 +88,6 @@
    127.0.0.0/8
   Live Restore Enabled: true
 
- WARNING: No swap limit support

@@ -98,7 +96,7 @@
  + curl -fsSL -o /home/ubuntu/workspace/moby_PR-42890/check-config.sh https://raw.githubusercontent.com/moby/moby/2b0755b936416834e14208c6c37b36977e67ea35/contrib/check-config.sh
  + bash /home/ubuntu/workspace/moby_PR-42890/check-config.sh
  warning: /proc/config.gz does not exist, searching other paths for kernel config ...
- info: reading kernel config from /boot/config-5.4.0-1057-aws ...
+ info: reading kernel config from /boot/config-5.11.0-1017-aws ...
 
  Generally Necessary:
  - cgroup hierarchy: properly mounted [/sys/fs/cgroup]
@@ -135,8 +133,7 @@
  - CONFIG_SECCOMP_FILTER: enabled
  - CONFIG_CGROUP_PIDS: enabled
  - CONFIG_MEMCG_SWAP: enabled
- - CONFIG_MEMCG_SWAP_ENABLED: missing
-     (cgroup swap accounting is currently not enabled, you can enable it by setting boot option "swapaccount=1")
+     (cgroup swap accounting is currently enabled)
  - CONFIG_BLK_CGROUP: enabled
  - CONFIG_BLK_DEV_THROTTLING: enabled
  - CONFIG_CGROUP_PERF: enabled
@@ -183,7 +180,7 @@
      - CONFIG_NF_CONNTRACK_TFTP: enabled (as module)
  - Storage Drivers:
    - "aufs":
-     - CONFIG_AUFS_FS: enabled (as module)
+     - CONFIG_AUFS_FS: missing
    - "btrfs":
      - CONFIG_BTRFS_FS: enabled (as module)
      - CONFIG_BTRFS_FS_POSIX_ACL: enabled

Diff between "passing" and "cgroup v2" (with the "check-config.sh" output removed, as that's not run on cgroup v2 in Jenkins):

diff --git a/./test-pass.txt b/./test-fail2.txt
index 0712495..c2d3191 100644
--- a/./test-pass.txt
+++ b/./test-fail2.txt
@@ -1,3 +1,5 @@
+Also failing (cgroupv2);
+
 <details>
 
@@ -55,8 +57,8 @@
    Native Overlay Diff: true
    userxattr: false
   Logging Driver: json-file
-  Cgroup Driver: cgroupfs
-  Cgroup Version: 1
+  Cgroup Driver: systemd
+  Cgroup Version: 2
   Plugins:
    Volume: local
    Network: bridge host ipvlan macvlan null overlay
@@ -72,8 +74,9 @@
    apparmor
    seccomp
     Profile: default
-  Kernel Version: 5.4.0-1057-aws
-  Operating System: Ubuntu 18.04.6 LTS
+   cgroupns
+  Kernel Version: 5.11.0-1017-aws
+  Operating System: Ubuntu 20.04.3 LTS
   OSType: linux
   Architecture: x86_64
   CPUs: 2
@@ -89,118 +92,6 @@
    127.0.0.0/8
   Live Restore Enabled: true
 
- WARNING: No swap limit support

@thaJeztah
Member Author

I was wondering if it would be related to #42736, but that has been merged for some time, and CI passed on that PR (running on ubuntu 20.04).

However, it looks like:

  • at the time, the Ubuntu 20.04 machines were on kernel 5.4 (so it looks like they were upgraded to kernel 5.11?)
  • at the time, CI was running Docker 20.10.6 with runc v1.0.0-rc95; wondering if we're hitting a regression in runc (there were some fixes in v1.0.2)

Info from a "pass" on that PR:

  • docker 20.10.6
  • containerd 1.4.6
  • runc v1.0.0-rc95

OS:

  • kernel 5.4.0-1048-aws
  • Ubuntu 20.04.2 LTS (failures above are on 20.04.3)

 + docker version
 Client: Docker Engine - Community
  Version:           20.10.6
  API version:       1.41
  Go version:        go1.13.15
  Git commit:        370c289
  Built:             Fri Apr  9 22:45:59 2021
  OS/Arch:           linux/arm64
  Context:           default
  Experimental:      true

 Server: Docker Engine - Community
  Engine:
   Version:          20.10.6
   API version:      1.41 (minimum version 1.12)
   Go version:       go1.13.15
   Git commit:       8728dd2
   Built:            Fri Apr  9 22:44:09 2021
   OS/Arch:          linux/arm64
   Experimental:     true
  containerd:
   Version:          1.4.6
   GitCommit:        d71fcd7d8303cbf684402823e425e9dd2e99285d
  runc:
   Version:          1.0.0-rc95
   GitCommit:        b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
  docker-init:
   Version:          0.19.0
   GitCommit:        de40ad0
 + docker info
 Client:
  Context:    default
  Debug Mode: false
  Plugins:
   app: Docker App (Docker Inc., v0.9.1-beta3)
   buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)

 Server:
  Containers: 0
   Running: 0
   Paused: 0
   Stopped: 0
  Images: 0
  Server Version: 20.10.6
  Storage Driver: overlay2
   Backing Filesystem: extfs
   Supports d_type: true
   Native Overlay Diff: true
   userxattr: false
  Logging Driver: json-file
  Cgroup Driver: cgroupfs
  Cgroup Version: 1
  Plugins:
   Volume: local
   Network: bridge host ipvlan macvlan null overlay
   Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
  Swarm: inactive
  Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
  Default Runtime: runc
  Init Binary: docker-init
  containerd version: d71fcd7d8303cbf684402823e425e9dd2e99285d
  runc version: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
  init version: de40ad0
  Security Options:
   apparmor
   seccomp
    Profile: default
  Kernel Version: 5.4.0-1048-aws
  Operating System: Ubuntu 20.04.2 LTS
  OSType: linux
  Architecture: aarch64
  CPUs: 4
  Total Memory: 7.493GiB
  Name: ip-10-100-48-136
  ID: MGZL:OOTV:SN5F:X6EL:EAFI:KBFD:Y3ES:GQ7F:YIFA:P3HD:KTGV:V6GG
  Docker Root Dir: /var/lib/docker
  Debug Mode: false
  Registry: https://index.docker.io/v1/
  Labels:
  Experimental: true
  Insecure Registries:
   127.0.0.0/8
  Live Restore Enabled: true

 WARNING: No swap limit support

@thaJeztah
Member Author

So the error is emitted here; https://github.com/opencontainers/runc/blob/v1.0.1/libcontainer/init_linux.go#L195-L197

	if err := w.ApplyCaps(); err != nil {
		return errors.Wrap(err, "apply caps")
	}

And comes from; https://github.com/opencontainers/runc/blob/v1.0.1/libcontainer/capabilities/capabilities.go#L104-L111

// Apply sets all the capabilities for the current process in the config.
func (c *Caps) ApplyCaps() error {
	c.pid.Clear(allCapabilityTypes)
	for _, g := range capTypes {
		c.pid.Set(g, c.caps[g]...)
	}
	return c.pid.Apply(allCapabilityTypes)
}
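
For reference, the message seen in the failing tests is consistent with an EPERM being wrapped by the errors.Wrap(err, "apply caps") call above. A minimal, self-contained sketch of that wrapping follows. It is only an illustration: applyCaps is a hypothetical stand-in that always fails with syscall.EPERM, isPermissionDenied is a helper invented for this example, and the standard library's %w verb is used in place of github.com/pkg/errors. The real failure originates in the capability syscall, not in code like this.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// applyCaps is a hypothetical stand-in for (*Caps).ApplyCaps. It always
// fails with EPERM, the errno behind "operation not permitted" on Linux.
func applyCaps() error {
	return syscall.EPERM
}

// finalize mimics the wrapping done in init_linux.go, using fmt.Errorf
// with %w instead of errors.Wrap from github.com/pkg/errors.
func finalize() error {
	if err := applyCaps(); err != nil {
		return fmt.Errorf("apply caps: %w", err)
	}
	return nil
}

// isPermissionDenied (illustrative helper) reports whether EPERM is
// somewhere in the error chain, regardless of how many layers wrap it.
func isPermissionDenied(err error) bool {
	return errors.Is(err, syscall.EPERM)
}

func main() {
	err := finalize()
	fmt.Println(err)
	fmt.Println(isPermissionDenied(err))
}
```

Each layer (runc, the shim, the daemon) prepends its own context in the same way, which is why the test output shows a long colon-separated chain ending in "apply caps: operation not permitted".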

@thaJeztah
Member Author

@AkihiroSuda @kolyshkin any ideas? Is this something you've seen before? Seems like it's related either to privileged mode, or to a combination of privileged and "private" cgroup NS mode.

@thaJeztah
Member Author

thaJeztah commented Sep 28, 2021

Also curious where the 10-second timeout in the TestHealthKillContainer test comes from, because the test sets 30 seconds;

ctxPoll, cancel = context.WithTimeout(ctx, 30*time.Second)
defer cancel()
poll.WaitOn(t, pollForHealthStatus(ctxPoll, client, id, "healthy"), poll.WithDelay(100*time.Millisecond))

Ah; looks like that's the default config of poll.WaitOn(), which applies because the test only passes poll.WithDelay();

func defaultConfig() *Settings {
	return &Settings{Timeout: 10 * time.Second, Delay: 100 * time.Millisecond}
}
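
To make the interaction concrete: the 30-second context only bounds the check function, while poll.WaitOn() gives up after its own timeout, which stays at the 10-second default unless an option overrides it. The sketch below models this with a simplified, self-contained version of the functional-options pattern. Settings and defaultConfig mirror the snippet above; SettingOp, WithTimeout, WithDelay and resolve are illustrative stand-ins written for this example, not copies of the gotest.tools source.

```go
package main

import (
	"fmt"
	"time"
)

// Settings mirrors the struct returned by defaultConfig in the snippet above.
type Settings struct {
	Timeout time.Duration
	Delay   time.Duration
}

// SettingOp is an illustrative functional option that mutates Settings.
type SettingOp func(*Settings)

func defaultConfig() *Settings {
	return &Settings{Timeout: 10 * time.Second, Delay: 100 * time.Millisecond}
}

// WithTimeout and WithDelay are stand-ins for the poll package's options.
func WithTimeout(d time.Duration) SettingOp { return func(s *Settings) { s.Timeout = d } }
func WithDelay(d time.Duration) SettingOp   { return func(s *Settings) { s.Delay = d } }

// resolve applies the options on top of the defaults, the way a
// WaitOn-style function would before it starts polling.
func resolve(ops ...SettingOp) *Settings {
	cfg := defaultConfig()
	for _, op := range ops {
		op(cfg)
	}
	return cfg
}

func main() {
	// Only the delay is set, as in TestHealthKillContainer, so the
	// timeout stays at the default.
	fmt.Println(resolve(WithDelay(100 * time.Millisecond)).Timeout) // prints 10s

	// Overriding the timeout explicitly brings it in line with the 30s context.
	fmt.Println(resolve(WithDelay(100*time.Millisecond), WithTimeout(30*time.Second)).Timeout) // prints 30s
}
```

If the intent is for the poll to run as long as the context, passing an explicit timeout option to poll.WaitOn() next to poll.WithDelay() (gotest.tools provides poll.WithTimeout, as far as I'm aware) would be one way to align the two.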
