-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not try to set VF MAC non-administratively #232
Do not try to set VF MAC non-administratively #232
Conversation
to change mac address for SR-IOV you either:
we are doing the latter. AFAIK this procedure is suppose to work. |
@adrianchiris you are right we are doing the second option but it's not right. I know we put the same mac address but for some drivers it still looks like a try to change the mac address on the VF when the VF doesn't have a trust flag on and that is blocked for security reasons |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
small comments :)
pkg/sriov/sriov.go
Outdated
return "", fmt.Errorf("failed to set netlink MAC address to %s: %v", hwaddr, err) | ||
} | ||
macAddress = conf.MAC | ||
return fmt.Errorf("error setting temp IF name %s for %s", tempName, linkName) | ||
} | ||
|
||
// 4. Change netns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please update the number here :)
pkg/utils/netlink_manager.go
Outdated
@@ -14,7 +14,6 @@ type NetlinkManager interface { | |||
LinkSetVfVlan(netlink.Link, int, int) error | |||
LinkSetVfVlanQos(netlink.Link, int, int, int) error | |||
LinkSetVfHardwareAddr(netlink.Link, int, net.HardwareAddr) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please re-generate the mocks so the function will not be there
pkg/types/types.go
Outdated
LinkState uint32 | ||
HostIFName string | ||
SpoofChk bool | ||
AdminMAC string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe we can change this to just mac now that we only handle the mac address via the PF
so with trust on, its ok to change mac addresses right ? we can configure trust via cni config. perhaps we should fail CNI call if trust is off but mac address was specified ? |
Hi @adrianchiris thanks for the comment let me try to answer That configuration was implemented 2 years ago together with the mac address change from the PF configuration. So this means that from day 1 it was redundant we should not failed the CNI if someone want to configure mac address but trust is off. The PF driver will handle the propagation of the mac address to the vf netdevice mac address. also for example in case of vfio-pci (DPDK) we only configure the mac via the pf administrative configuration that is fine. |
The PR as-is actually does not work as expected for Mellanox (though I stated it does per testing, my bad). Setting the VF MAC via PF appears to work but in fact it is not propagated down to the VF (e.g. eno1v0):
Running a Kubernetes cluster with this PR, pod VF interfaces will still be configured with the original MAC address while the PF reports the VF has the desired MAC address:
I don't know if this is by design or a bug in the Mellanox kernel driver. I looked up for official documentation on how to set VF MAC and all I could find is only via PF ( |
yes, you need to either change MAC on VF netdev (bring interface down, change mac, bring interface up). for Intel NICs you are seeing the MAC address is updated on the fly for VF netdev when setting their administrative MAC ? |
Yes. |
i wonder if this is a change in behaviour for intel NIC or was this always failing. i see support was added around 4 years ago[1] @cgoncalves i wonder, when the VF interface is up and Admin MAC is changed. what happens to the VF netdev ? [1] b4a9571 |
It is changed on the fly:
On Intel SR-IOV NICs, setting the VF netdev MAC address to the same already configured administratively will just be a no-op in the Linux kernel. Setting to a different MAC address, it will fail with
|
The next question will be if now we need a split in the code between mlx and intel, or we can introduce a rebind the driver in the sriov-cni the rebind can have multiple implications like renaming, udev rules, networkManager update, etc so I am not too much in favor of that |
Given we have now confirmed that Mellanox VFs behave that way by design, we might have to yes. The issue that led to this PR is: |
Following @SchSeba's suggestion, I could update this PR to add that new logic.
@adrianchiris WDYT? |
Pull Request Test Coverage Report for Build 3531491065Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
Id prefer to avoid checking vendor driver if possible. see below some additional thoughts :) Seems like there are inconsistencies between vendors WRT admin MAC and also spoofchk trust and vlan. @cgoncalves, after setting admin MAC address does the effective mac of the VF netdev change immediately ? or its an async operation ? perhaps @Eoghan1232 has some more info on this. if it happens immediately (or we can do some sensible polling) then we could read MAC address of the VF and if it differs from desired MAC the set it. in Intel case it would be the same so we will do a no-op. alternatively is to always try to set it. if it fails with a permission denied, then we could read VF effective MAC if it equals our desired MAC then continue, else fail CNI call. (sensible polling may be needed here as well in case operation is async in kernel). for cleanup we could use a similar logic. @cgoncalves what is the cni config you are using ? (are you using trust/spoofchk ?) [1] https://github.com/Netronome/nic-firmware/blob/master/docs/notes/sriov-trust.rst |
Also @cgoncalves i see the following from you:
so the issue is actually only on cleanup ? because on CMD ADD we would
so it should work according to this. is this the case ? on cleanup we do the reverse
here according to what you say, will fail. then we just need to swap between ReleaseVF and ResetVFConfig on cleanup |
I tested this on a PF/VF with i40e and iavf
The mac address got updated without needing a restart of the VF.
I can see in dmesg log that the mac gets assigned and it schedules a reset task. |
VF state machine triggering in the kernel on a mac address change is convoluted and racy. |
The latest change now instead adds a retry should the driver still be working on propagating the MAC address change down to the VF. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
just a comment request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks good to me.
once the comments are added I am happy with it
Some NIC drivers (i.e. i40e/iavf) set their VF MAC addressing asynchronously when set administratively. This means that while the PF could already show the VF with the desired MAC address, the netdev VF may still have the original one. If in this window we issue a netdev VF MAC address set, the driver will return an error and the pod will fail to create. One way to fix this issue would be to not try to set the netdev VF MAC address, rather simply rely on the MAC address set administratively already in place. However, other NIC drivers (i.e. mlx5_core) do not propagate the MAC address down to the netdev VF so for those drivers we have to continue setting the VF MAC address the same way (via PF and netdev VF). This commit addresses the issue with a retry where it waits up to 1 second (5 retries * 200 millisecond sleep) in case driver is still working on propagating the MAC address change down to the VF. ResetVFConfig resets a VF administratively. We must run ResetVFConfig before ReleaseVF because some drivers will error out if we try to reset netdev VF with trust off. So, reset VF MAC address via PF first. Signed-off-by: Carlos Goncalves <cgoncalves@redhat.com>
Approved by 2 maintainers - merging this PR now. |
The VF MAC address is, when requested, already set administratively via PF and works regardless whether VF trust is on or off.
Trying to set VF MAC address non-administratively without trust on is expected to gracefully fail. For example, here's the dmesg output example:
kernel: i40e 0000:3b:00.0: VF attempting to override administratively set MAC address, bring down and up the VF interface to resume normal operation
kernel: i40e 0000:3b:00.0: VF 0 failed opcode 10, retval: -1
kernel: iavf 0000:3b:02.0: Failed to add MAC filter, error IAVF_ERR_NVM
The same applies also when releasing the VF.
Tested on Intel XXV710 (i40e/iavf), Intel E810 (ice/iavf) and Mellanox ConnectX-4 (mlx5_core).
Signed-off-by: Carlos Goncalves cgoncalves@redhat.com