Netlink invalid argument error on VF VLAN configuration #303

Closed
jqueuniet opened this issue Jul 17, 2024 · 13 comments · Fixed by #309

@jqueuniet

What happened?

The pod fails to start and reports an error related to VF VLAN configuration:

error adding container to network "sriovnet1": SRIOV-CNI failed to configure VF "failed to set vf 4 vlan configuration - id 100, qos 0 and proto 802.1q: invalid argument"

What did you expect to happen?

Pod starts with VF attached

What are the minimal steps needed to reproduce the bug?

  1. Create SR-IOV policy
  2. Create SR-IOV network
  3. Create pod

Anything else we need to know?

Setup done using the SR-IOV operator

Reading the code of the netlink Go library, I gathered that the failing call is equivalent to an ip link CLI command. I tried to reproduce the failure with that command, but it succeeded and the VLAN was properly set afterward.

# ip link show enp129s0f0np0
4: enp129s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 94:6d:ae:8c:87:80 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 06:24:7b:de:48:61 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 3a:4e:66:07:97:fe brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 2     link/ether ae:51:fe:0d:d7:d2 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 3     link/ether 32:61:97:93:b3:92 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 4     link/ether fa:33:12:40:60:c6 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 5     link/ether fe:7d:74:65:93:a6 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 6     link/ether a6:12:6d:9d:bd:0a brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 7     link/ether 7e:14:18:08:b0:45 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
# ip link set enp129s0f0np0 vf 4 vlan 100 qos 0 proto 802.1q
# ip link show enp129s0f0np0
4: enp129s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 94:6d:ae:8c:87:80 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 06:24:7b:de:48:61 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 3a:4e:66:07:97:fe brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 2     link/ether ae:51:fe:0d:d7:d2 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 3     link/ether 32:61:97:93:b3:92 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 4     link/ether fa:33:12:40:60:c6 brd ff:ff:ff:ff:ff:ff, vlan 100, spoof checking off, link-state auto, trust off, query_rss off
    vf 5     link/ether fe:7d:74:65:93:a6 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 6     link/ether a6:12:6d:9d:bd:0a brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 7     link/ether 7e:14:18:08:b0:45 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off

Component Versions

Please fill in the below table with the version numbers of applicable components used.

Component                        Version
SR-IOV CNI Plugin                v2.8.0
Multus                           v4.0.2
SR-IOV Network Device Plugin     v3.7.0
Kubernetes                       1.29.2
OS                               Fedora CoreOS 40.20240616.3.0 - kernel 6.8.11-300.fc40.x86_64

Hardware

# lspci -nnk | grep Ethernet
41:00.0 Ethernet controller [0200]: Mellanox Technologies MT2892 Family [ConnectX-6 Dx] [15b3:101d]
41:00.1 Ethernet controller [0200]: Mellanox Technologies MT2892 Family [ConnectX-6 Dx] [15b3:101d]
81:00.0 Ethernet controller [0200]: Mellanox Technologies MT2892 Family [ConnectX-6 Dx] [15b3:101d]
81:00.1 Ethernet controller [0200]: Mellanox Technologies MT2892 Family [ConnectX-6 Dx] [15b3:101d]
81:00.2 Ethernet controller [0200]: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function [15b3:101e]
81:00.3 Ethernet controller [0200]: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function [15b3:101e]
81:00.4 Ethernet controller [0200]: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function [15b3:101e]
81:00.5 Ethernet controller [0200]: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function [15b3:101e]
81:00.6 Ethernet controller [0200]: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function [15b3:101e]
81:00.7 Ethernet controller [0200]: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function [15b3:101e]
81:01.0 Ethernet controller [0200]: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function [15b3:101e]
81:01.1 Ethernet controller [0200]: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function [15b3:101e]
c6:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller [14e4:16d8] (rev 01)
	DeviceName: Broadcom 10G Ethernet #1
c6:00.1 Ethernet controller [0200]: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller [14e4:16d8] (rev 01)
	DeviceName: Broadcom 10G Ethernet #2
# mstflint -d 81:00.0 query full
Image type:            FS4
FW Version:            22.39.3560
FW Release Date:       24.6.2024
Part Number:           MCX623106AC-CDA_Ax
Description:           ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0 x16; Crypto and Secure Boot
Product Version:       22.39.3560
Rom Info:              type=UEFI version=14.32.17 cpu=AMD64,AARCH64
                       type=PXE version=3.7.300 cpu=AMD64
Description:           UID                GuidsNumber
Base GUID:             946dae03008c8780        4
Base MAC:              946dae8c8780            4
Image VSD:             N/A
Device VSD:            N/A
PSID:                  MT_0000000436
Security Attributes:   secure-fw
Default Update Method: fw_ctrl
Life cycle:            GA SECURED
Secure Boot Capable:   Enabled
EFUSE Security Ver:    0
Image Security Ver:    0
Security Ver Program:  Manually ; Disabled

Only enp129s0f0np0/81:00.0 is currently configured for VF, to facilitate debugging.

Config Files

Config file locations may be config dependent.

Pod manifest
apiVersion: v1
kind: Pod
metadata:
  name: samplepod1
  namespace: sriov-network-operator
  annotations:
    k8s.v1.cni.cncf.io/networks: sriovnet1
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  containers:
  - name: samplepod
    image: centos/tools
    imagePullPolicy: IfNotPresent
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 300000; done;" ]
    resources:
      requests:
        openshift.io/mlxnics: '1'
      limits:
        openshift.io/mlxnics: '1'
CNI config (Try '/etc/cni/net.d/')
{
  "cniVersion": "0.3.1",
  "name": "cilium",
  "plugins": [
    {
       "type": "cilium-cni",
       "enable-debug": false,
       "log-file": "/var/run/cilium/cilium-cni.log"
    }
  ]
}
Device pool config file location (Try '/etc/pcidp/config.json')
Multus config (Try '/etc/cni/multus/net.d')
{"cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","clusterNetwork":"/host/etc/cni/net.d/05-cilium.conflist","type":"multus-shim"}
Kubernetes deployment type ( Bare Metal, Kubeadm etc.)

bare-metal custom deployment with v1.29.2 rpm kubelet

SR-IOV Network Custom Resource Definition
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-1
  namespace: sriov-network-operator
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  resourceName: mlxnics
  priority: 99
  mtu: 9000
  numVfs: 8
  nicSelector:
      deviceID: "101d"
      rootDevices:
      - "0000:81:00.0"
      #- "0000:81:00.1"
      #- "0000:41:00.0"
      #- "0000:41:00.1"
      vendor: "15b3"
  deviceType: netdevice
--- 
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: sriovnet1
  namespace: sriov-network-operator
spec:
  ipam: |
    {
      "type": "host-local",
      "ranges": [{
        "subnet": "172.18.201.0/24",
        "rangeStart": "172.18.201.8",
        "rangeEnd": "172.18.201.12",
        "gateway": "172.18.201.254"
      }],
      "routes": [{
        "dst": "0.0.0.0/0"
      }]
    }
  resourceName: mlxnics
  vlan: 100

Logs

SR-IOV Network Device Plugin Logs (use kubectl logs $PODNAME)

None, pod does not start

Multus logs (If enabled. Try '/var/log/multus.log' )
Kubelet logs (journalctl -u kubelet)
Jul 17 10:54:41 node1 kubelet[2680]: E0717 10:54:41.613192    2680 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err=<
Jul 17 10:54:41 node1 kubelet[2680]:         rpc error: code = Unknown desc = failed to setup network for sandbox "e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4" Netns:"/var/run/netns/cni-fb7ece69-e25b-e798-5e2e-cefc3f83f139" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=sriov-network-operator;K8S_POD_NAME=samplepod1;K8S_POD_INFRA_CONTAINER_ID=e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4;K8S_POD_UID=758278fb-3cac-4b02-b2e0-367b94517b5a" Path:"" ERRORED: error configuring pod [sriov-network-operator/samplepod1] networking: [sriov-network-operator/samplepod1/758278fb-3cac-4b02-b2e0-367b94517b5a:sriovnet1]: error adding container to network "sriovnet1": SRIOV-CNI failed to configure VF "failed to set vf 4 vlan configuration - id 100, qos 0 and proto 802.1q: invalid argument"
Jul 17 10:54:41 node1 kubelet[2680]:         ': StdinData: {"clusterNetwork":"/host/etc/cni/net.d/05-cilium.conflist","cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","type":"multus-shim"}
Jul 17 10:54:41 node1 kubelet[2680]:  >
Jul 17 10:54:41 node1 kubelet[2680]: E0717 10:54:41.613278    2680 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err=<
Jul 17 10:54:41 node1 kubelet[2680]:         rpc error: code = Unknown desc = failed to setup network for sandbox "e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4" Netns:"/var/run/netns/cni-fb7ece69-e25b-e798-5e2e-cefc3f83f139" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=sriov-network-operator;K8S_POD_NAME=samplepod1;K8S_POD_INFRA_CONTAINER_ID=e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4;K8S_POD_UID=758278fb-3cac-4b02-b2e0-367b94517b5a" Path:"" ERRORED: error configuring pod [sriov-network-operator/samplepod1] networking: [sriov-network-operator/samplepod1/758278fb-3cac-4b02-b2e0-367b94517b5a:sriovnet1]: error adding container to network "sriovnet1": SRIOV-CNI failed to configure VF "failed to set vf 4 vlan configuration - id 100, qos 0 and proto 802.1q: invalid argument"
Jul 17 10:54:41 node1 kubelet[2680]:         ': StdinData: {"clusterNetwork":"/host/etc/cni/net.d/05-cilium.conflist","cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","type":"multus-shim"}
Jul 17 10:54:41 node1 kubelet[2680]:  > pod="sriov-network-operator/samplepod1"
Jul 17 10:54:41 node1 kubelet[2680]: E0717 10:54:41.613317    2680 kuberuntime_manager.go:1172] "CreatePodSandbox for pod failed" err=<
Jul 17 10:54:41 node1 kubelet[2680]:         rpc error: code = Unknown desc = failed to setup network for sandbox "e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4" Netns:"/var/run/netns/cni-fb7ece69-e25b-e798-5e2e-cefc3f83f139" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=sriov-network-operator;K8S_POD_NAME=samplepod1;K8S_POD_INFRA_CONTAINER_ID=e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4;K8S_POD_UID=758278fb-3cac-4b02-b2e0-367b94517b5a" Path:"" ERRORED: error configuring pod [sriov-network-operator/samplepod1] networking: [sriov-network-operator/samplepod1/758278fb-3cac-4b02-b2e0-367b94517b5a:sriovnet1]: error adding container to network "sriovnet1": SRIOV-CNI failed to configure VF "failed to set vf 4 vlan configuration - id 100, qos 0 and proto 802.1q: invalid argument"
Jul 17 10:54:41 node1 kubelet[2680]:         ': StdinData: {"clusterNetwork":"/host/etc/cni/net.d/05-cilium.conflist","cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","type":"multus-shim"}
Jul 17 10:54:41 node1 kubelet[2680]:  > pod="sriov-network-operator/samplepod1"
Jul 17 10:54:41 node1 kubelet[2680]: E0717 10:54:41.613456    2680 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"samplepod1_sriov-network-operator(758278fb-3cac-4b02-b2e0-367b94517b5a)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"samplepod1_sriov-network-operator(758278fb-3cac-4b02-b2e0-367b94517b5a)\\\": rpc error: code = Unknown desc = failed to setup network for sandbox \\\"e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4\\\": plugin type=\\\"multus-shim\\\" name=\\\"multus-cni-network\\\" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:\\\"e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4\\\" Netns:\\\"/var/run/netns/cni-fb7ece69-e25b-e798-5e2e-cefc3f83f139\\\" IfName:\\\"eth0\\\" Args:\\\"IgnoreUnknown=1;K8S_POD_NAMESPACE=sriov-network-operator;K8S_POD_NAME=samplepod1;K8S_POD_INFRA_CONTAINER_ID=e0677533c600df92694f6f415fe410de34f291b70244bb5d51367adbeaa190d4;K8S_POD_UID=758278fb-3cac-4b02-b2e0-367b94517b5a\\\" Path:\\\"\\\" ERRORED: error configuring pod [sriov-network-operator/samplepod1] networking: [sriov-network-operator/samplepod1/758278fb-3cac-4b02-b2e0-367b94517b5a:sriovnet1]: error adding container to network \\\"sriovnet1\\\": SRIOV-CNI failed to configure VF \\\"failed to set vf 4 vlan configuration - id 100, qos 0 and proto 802.1q: invalid argument\\\"\\n': StdinData: {\\\"clusterNetwork\\\":\\\"/host/etc/cni/net.d/05-cilium.conflist\\\",\\\"cniVersion\\\":\\\"0.3.1\\\",\\\"logLevel\\\":\\\"verbose\\\",\\\"logToStderr\\\":true,\\\"name\\\":\\\"multus-cni-network\\\",\\\"type\\\":\\\"multus-shim\\\"}\"" pod="sriov-network-operator/samplepod1" podUID="758278fb-3cac-4b02-b2e0-367b94517b5a"
@SchSeba
Collaborator

SchSeba commented Jul 17, 2024

Hi @jqueuniet, this sounds like an issue with the driver.

Can you please try running just this command:

ip link set <pf-name> vf 4 vlan 100 qos 0 proto 802.1q

If that fails, check dmesg for any logs from the kernel.
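
For anyone who wants to repeat this check from a script, here is a minimal Python sketch. The helper names `vf_vlan_cmd` and `try_set_vf_vlan` are hypothetical and not part of sriov-cni; the sketch only wraps the `ip link set` command quoted above and `dmesg`, and actually running it needs root on a host where the VF exists.

```python
import shlex
import subprocess


def vf_vlan_cmd(pf, vf, vlan, qos=0, proto="802.1q"):
    """Build the argv for the manual VF VLAN test (hypothetical helper)."""
    return ["ip", "link", "set", pf, "vf", str(vf),
            "vlan", str(vlan), "qos", str(qos), "proto", proto]


def try_set_vf_vlan(pf, vf, vlan, qos=0, proto="802.1q"):
    """Run the command; return None on success, else stderr plus the dmesg tail."""
    res = subprocess.run(vf_vlan_cmd(pf, vf, vlan, qos, proto),
                         capture_output=True, text=True)
    if res.returncode == 0:
        return None
    # On failure, collect the last kernel log lines for the report.
    kernel_log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    return res.stderr + "\n".join(kernel_log.splitlines()[-20:])


# Only print the command here; call try_set_vf_vlan() on the affected node.
print(shlex.join(vf_vlan_cmd("enp129s0f0np0", 4, 100)))
# ip link set enp129s0f0np0 vf 4 vlan 100 qos 0 proto 802.1q
```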

@jqueuniet
Author

Hey, thanks for your answer. I had already tried that, since I found this kind of feedback in similar issues like #285, and mentioned it in the initial report. Here is the CLI output:

# ip link show enp129s0f0np0
4: enp129s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 94:6d:ae:8c:87:80 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 06:24:7b:de:48:61 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 3a:4e:66:07:97:fe brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 2     link/ether ae:51:fe:0d:d7:d2 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 3     link/ether 32:61:97:93:b3:92 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 4     link/ether fa:33:12:40:60:c6 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 5     link/ether fe:7d:74:65:93:a6 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 6     link/ether a6:12:6d:9d:bd:0a brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 7     link/ether 7e:14:18:08:b0:45 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
# ip link set enp129s0f0np0 vf 4 vlan 100 qos 0 proto 802.1q
# ip link show enp129s0f0np0
4: enp129s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 94:6d:ae:8c:87:80 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 06:24:7b:de:48:61 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 3a:4e:66:07:97:fe brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 2     link/ether ae:51:fe:0d:d7:d2 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 3     link/ether 32:61:97:93:b3:92 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 4     link/ether fa:33:12:40:60:c6 brd ff:ff:ff:ff:ff:ff, vlan 100, spoof checking off, link-state auto, trust off, query_rss off
    vf 5     link/ether fe:7d:74:65:93:a6 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 6     link/ether a6:12:6d:9d:bd:0a brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 7     link/ether 7e:14:18:08:b0:45 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off

So, the VLAN is successfully set using the CLI, and no error is returned.
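
The before/after comparison above can also be checked programmatically. The following is a rough Python sketch, assuming the `ip link show` output keeps the per-VF line format shown in this thread; the function name `vf_vlans` is made up for illustration.

```python
import re


def vf_vlans(ip_link_output):
    """Map VF index -> VLAN id, or None when the VF line shows no VLAN."""
    result = {}
    for line in ip_link_output.splitlines():
        vf = re.match(r"\s*vf (\d+)\s", line)
        if not vf:
            continue  # not a per-VF line
        vlan = re.search(r"\bvlan (\d+)", line)
        result[int(vf.group(1))] = int(vlan.group(1)) if vlan else None
    return result


# Two VF lines copied from the output above (after the ip link set call).
sample = """\
    vf 3     link/ether 32:61:97:93:b3:92 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 4     link/ether fa:33:12:40:60:c6 brd ff:ff:ff:ff:ff:ff, vlan 100, spoof checking off, link-state auto, trust off, query_rss off
"""
print(vf_vlans(sample))  # {3: None, 4: 100}
```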

@jqueuniet
Author

Just in case this is useful, here is the iproute package version:

# ip -V
ip utility, iproute2-6.7.0, libbpf 1.2.0
# rpm -qv iproute
iproute-6.7.0-2.fc40.x86_64

@jqueuniet
Author

I tried with a less bleeding-edge distribution, Flatcar stable with a 6.1 kernel and iproute2 6.5.0. I still get the same error and the same symptoms: sriov-cni can't set the VLAN, but I can using iproute2. Nothing seems out of place in dmesg either.

@ianb-mp

ianb-mp commented Aug 5, 2024

I have two worker nodes with the same NIC hardware (Intel XL710), yet the error occurs on one but not the other. The main difference I can see is that one is running Debian 12 (kernel 6.1.0-23-amd64) and the other is running Rocky Linux 9.4 (kernel 5.14.0-427.24.1.el9_4.x86_64). Debian consistently fails, Rocky consistently works.

Other components (same for both nodes):

Component                        Version
SR-IOV CNI Plugin                v2.8.0
Multus                           v4.0.2
SR-IOV Network Device Plugin     v3.7.0
Kubernetes                       1.30.2 (v1.30.2+k0s)

EDIT: I can also set VF config manually from the Debian host using ip link commands.

@mega-alex

We've been able to replicate the error consistently with the github.com/vishvananda/netlink library; there seems to be a problem specifically with setting the VLAN protocol. See the issue here.

@mega-alex

Hi there, we did some more digging, and it appears that the issue is related to extra validation added to address this CVE. Adjusting the size of the attribute to a multiple of 4 bytes seems to correct the issue. We are doing more testing with the patched version today to see if the problem is fully addressed.
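
To illustrate the size problem, here is a rough Python sketch. It assumes the VLAN info payload follows the layout of the kernel's `struct ifla_vf_vlan_info` (three 32-bit fields followed by a 16-bit protocol field), which is not spelled out in this thread, and it is only an illustration of 4-byte padding, not the actual patch to the netlink library. The function names are hypothetical.

```python
import struct

NLA_ALIGNTO = 4          # netlink attributes are aligned to 4-byte boundaries
ETH_P_8021Q = 0x8100     # EtherType for 802.1Q


def nla_align(length):
    """Round a length up to the next multiple of 4."""
    return (length + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)


def pack_vf_vlan_info(vf, vlan, qos, proto, pad=True):
    """Serialize vf, vlan, qos (u32 each) and proto (big-endian u16).

    Assumed layout; without padding the result is 14 bytes, while a C
    compiler would pad the equivalent struct to 16 bytes.
    """
    payload = struct.pack("=III", vf, vlan, qos) + struct.pack(">H", proto)
    if pad:
        payload += b"\x00" * (nla_align(len(payload)) - len(payload))
    return payload


unpadded = pack_vf_vlan_info(4, 100, 0, ETH_P_8021Q, pad=False)
padded = pack_vf_vlan_info(4, 100, 0, ETH_P_8021Q)
print(len(unpadded), len(padded))  # 14 16
```

Under that assumption, a sender that emits the raw 14 bytes produces an attribute shorter than the size a strict length check would expect, which would be consistent with the "invalid argument" error reported here.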

@SchSeba
Collaborator

SchSeba commented Aug 5, 2024

Hi @mega-alex, great work on the debugging!
If you can open a PR against the netlink library, that would be great, so we can review it.

@SchSeba
Collaborator

SchSeba commented Aug 6, 2024

Hi @mega-alex, thanks for taking care of the netlink side.
This is the PR for sriov-cni if you want to give it a try: #309

@ianb-mp

ianb-mp commented Aug 7, 2024

I can confirm the fix has solved my issue 👍

@ianb-mp

ianb-mp commented Aug 9, 2024

@SchSeba, can you please tag a new release to make this fix available to upstream projects, e.g. sriov-network-operator?

@SchSeba
Collaborator

SchSeba commented Aug 19, 2024

@adrianchiris, can I ask for your help with creating a new release for this and the sriov operator? :)

@mchiappero

Please release at least a v2.8.1 with this fix ASAP. Anything in production cannot receive kernel updates otherwise. Thank you!

starlingx-github pushed a commit to starlingx/ansible-playbooks that referenced this issue Aug 22, 2024
After the kernel was upgraded to 6.6.0, the sriov-cni plugin started failing
when creating VLANs over VF interfaces. This is described in:
k8snetworkplumbingwg/sriov-cni#303

The fix was released in v2.8.1, pull request #309:
https://github.com/k8snetworkplumbingwg/sriov-cni/releases/tag/v2.8.1

Test Plan:
PASS: start the sriov pod with VLAN configured

Story: 2011124
Task: 50894

Change-Id: I09191a71574cf4f0073c1a40226d5cd679d3e857
Signed-off-by: Caio Bruchert <caio.bruchert@windriver.com>
e0ne added a commit to e0ne/network-operator that referenced this issue Aug 23, 2024
We need to update SR-IOV CNI to have it working with the latest
Linux kernels and have [1] fix included.

[1] k8snetworkplumbingwg/sriov-cni#303

Signed-off-by: Ivan Kolodiazhnyi <ikolodiazhny@nvidia.com>
e0ne added a commit to e0ne/network-operator that referenced this issue Aug 23, 2024
We need to update SR-IOV CNI to have it working with the latest
Linux kernels and have [1] fix included.

[1] k8snetworkplumbingwg/sriov-cni#303

Signed-off-by: Ivan Kolodiazhnyi <ikolodiazhny@nvidia.com>
(cherry picked from commit 8a5f320)
e0ne added a commit to Mellanox/network-operator that referenced this issue Aug 23, 2024
We need to update SR-IOV CNI to have it working with the latest Linux
kernels and have [1] fix included.

[1] k8snetworkplumbingwg/sriov-cni#303
e0ne added a commit to Mellanox/network-operator that referenced this issue Aug 23, 2024
We need to update SR-IOV CNI to have it working with the latest Linux
kernels and have [1] fix included.

[1] k8snetworkplumbingwg/sriov-cni#303

Signed-off-by: Ivan Kolodiazhnyi <ikolodiazhny@nvidia.com>
(cherry picked from commit 8a5f320)