-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pci address lock for allocated devices #220
Add pci address lock for allocated devices #220
Conversation
Pull Request Test Coverage Report for Build 3037867348Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
d9b7567
to
de043e3
Compare
Hi @SchSeba, what will happen in case the ADD command fails because of the new lock? |
Hi @orelmisan nope. the pod will stay in example:
|
Thank you for the answer @SchSeba. |
de043e3
to
60b047a
Compare
cmd/sriov/main.go
Outdated
|
||
// Check if the device is already allocated. | ||
// This is to prevent issues where kubelet request to delete a pod and in the same time a new pod using the same | ||
// vf is started. we can have an issue where the cmdDel of the old pod us called AFTER the cmdAdd of the new one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// vf is started. we can have an issue where the cmdDel of the old pod us called AFTER the cmdAdd of the new one | |
// vf is started. we can have an issue where the cmdDel of the old pod is called AFTER the cmdAdd of the new one |
pkg/utils/utils.go
Outdated
|
||
path := filepath.Join(allocatedPCIPath, pciAddress) | ||
|
||
err := os.WriteFile(path, []byte(ns), 0700) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to grant this file only read and write permissions.
err := os.WriteFile(path, []byte(ns), 0700) | |
err := os.WriteFile(path, []byte(ns), 0600) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SchSeba does it make sense to check if a VF is allocated at cmdCheck?
60b047a
to
eb95f56
Compare
Hi @ormergi I am not sure if that function is even implemented today via multus |
eb95f56
to
e27e0ad
Compare
return n, err | ||
} | ||
|
||
if isAllocated { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the verification from the main to here so we produce a better error when a PCI address is still assign.
instead of this error
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_client_seba_65fe3641-ebd8-4f8b-9b17-785f07b3ad3b_0(0140bb9890307e2f3f8804a06dd0f70a0970f7b8ee1254227eb5c0043ddf2f77): error adding pod seba_client to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [seba/client/65fe3641-ebd8-4f8b-9b17-785f07b3ad3b:sriov-network-2]: error adding container to network "sriov-network-2": SRIOV-CNI failed to load netconf: LoadConf(): the VF 0000:d8:0b.0 does not have a interface name or a dpdk driver
we will return error that the PCI is still allocated
e27e0ad
to
c0728f0
Compare
@SchSeba The code looks like it does, check this out https://github.com/k8snetworkplumbingwg/multus-cni/blob/v3.7/cmd/main.go#L56 EDIT: if it a lot of work or out of this PR scope, we can do it on a follow-up PR if it makes sense. |
@ormergi I think it's I can open an issue so we track the request to implement the check function from the CNI API definition |
Hi @SchSeba Does the new locking mechanism supports the following scenarios:
|
There is also, reboot of node then force delete of pod from k8s API, in this case, when node comes up i dont think cni will be called |
Hi @orelmisan @adrianchiris I test the 3 cases 1 . A graceful reboot of the node. - With a drain then pod moves to another node or to pending after the node is back the new pod start again
|
Thank you for your answer @SchSeba. |
they may be files overtime but I don't think that can impact a system if you ask me. |
Will it work if these files will be written to |
we can do it. but I am scared that some auto clean in the node if there is storage pressure, will remove the files and then we can again have the same issue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the changes @SchSeba
pkg/utils/utils.go
Outdated
|
||
dat, err := os.ReadFile(path) | ||
if err != nil { | ||
return false, "", fmt.Errorf("failed to read for pci address file for %s: %v", pciAddress, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, add the whole path to the error. It makes it easier to debug in case of errors.
return false, "", fmt.Errorf("failed to read for pci address file for %s: %v", pciAddress, err) | |
return false, "", fmt.Errorf("failed to read PCI address lock file %s: %w", path, err) |
pkg/utils/utils.go
Outdated
func CleanAllocatedPCI(pciAddress, dataDir string) error { | ||
path := filepath.Join(dataDir, "pci", pciAddress) | ||
if err := os.Remove(path); err != nil { | ||
return fmt.Errorf("error removing PCI address lock file %s: %q", path, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why using %q
for printing errors?
return fmt.Errorf("error removing PCI address lock file %s: %q", path, err) | |
return fmt.Errorf("error removing PCI address lock file %s: %w", path, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was a copy paste from CleanCachedNetConf
I agree we should switch to %w
but all the code use %v
so I will use V and in the future we can open a PR to switch all of them together
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall, lgtm.
c0728f0
to
a0e4c88
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SchSeba Thanks for fixing it!
pkg/config/config.go
Outdated
@@ -36,6 +37,38 @@ func LoadConf(bytes []byte) (*sriovtypes.NetConf, error) { | |||
return nil, fmt.Errorf("LoadConf(): VF pci addr is required") | |||
} | |||
|
|||
// Get the vf pci address | |||
vfPciAddress, err := utils.GetPciAddress(n.Master, n.VFID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
n.DeviceID already holds the VF PCI address no ? why do we need this here ?
pkg/types/types.go
Outdated
@@ -40,6 +40,7 @@ type NetConf struct { | |||
VlanQoS *int `json:"vlanQoS"` | |||
DeviceID string `json:"deviceID"` // PCI address of a VF in valid sysfs format | |||
VFID int | |||
VFPciAddress string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DeviceID is the VF PCI address AFAIK. did i miss anything ?
pkg/utils/utils.go
Outdated
} | ||
return nil | ||
} | ||
|
||
func SaveAllocatedPCI(pciAddress, ns, dataDir string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we wrap these three methods in an interface ?
i know cache is not implemented this way, but i do not think we need to take the same approach here.
it would be great if this would be its own package with interface definition, impl and UT.
or if you prefer, part of util package but a separate file e.g pci_allocator.go
WDYT ?
the dataDir (currently config.DefaultCNIDir
) should be provided when creating the interface implementation (i.e provided in the "New" function)
WDYT ?
later on we can move the caching methods to their own interface as well.
pkg/utils/utils.go
Outdated
} | ||
return nil | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add docstring to the methods below ?
pkg/utils/utils.go
Outdated
return nil | ||
} | ||
|
||
func IsAllocated(pciAddress, dataDir string) (bool, string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when you add docstring here, please add some info about the returned string.
cmd/sriov/main.go
Outdated
@@ -207,10 +212,20 @@ func cmdDel(args *skel.CmdArgs) error { | |||
} | |||
} | |||
|
|||
vfPciAddress, err := utils.GetPciAddress(netConf.Master, netConf.VFID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
netConf.DeviceID is not saved ?
pkg/config/config.go
Outdated
// This is to prevent issues where kubelet request to delete a pod and in the same time a new pod using the same | ||
// vf is started. we can have an issue where the cmdDel of the old pod is called AFTER the cmdAdd of the new one | ||
// This will block the new pod creation until the cmdDel is done. | ||
isAllocated, allocatedNs, err := utils.IsAllocated(vfPciAddress, DefaultCNIDir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could roll the logic below into IsAllocated
and simplify this method.
a VF is allocated if file exists and the namespace stored within it exists else it is not allocated.
if it is not allocated then IsAllocated can delete the stale file (as best effort operation) WDYT ?
also it would be great if you could add more info in commit message about the issue and the changes introduced and even better ( : - D ) if you can split PR to 2 commits (adding the locking mechanism and the other, integrating it in CNI code) with info on each would be even better :) the latter is not a must ofc, depends on you. |
a0e4c88
to
a3b8961
Compare
@adrianchiris Thanks for the comments! I make the changes let me know how it looks now :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so minor nits.
it would be great to add some UT of the added functionality.
pkg/utils/pci_allocator.go
Outdated
dataDir string | ||
} | ||
|
||
func NewPCIAllocator(dataDir string) PCIAllocation { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m(iiinor)nit: docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method should IMO return *PCIAllocator
the caller should store it in a PCIAllocation interface
a3b8961
to
802b9ad
Compare
This is needed to lock the allocation of the same PCI address until the cmdDel is called or the kernel remove the network namespace. Signed-off-by: Sebastian Sch <sebassch@gmail.com>
802b9ad
to
1ec4511
Compare
Hi @adrianchiris thanks for the great comments! when you have time please have a look and let me know if you have any other comments |
1ec4511
to
19e50d9
Compare
…i addresses Fixes: k8snetworkplumbingwg#219 Signed-off-by: Sebastian Sch <sebassch@gmail.com>
19e50d9
to
1ae1fea
Compare
Thanks for addressing my comments ! LGTM |
Signed-off-by: Or Mergi <ormergi@redhat.com>
Place a delay between tests to assure resources (VF/s) are fully released before reused again on a new VMI. (results show that waiting for VMI/s to disappear is not enough) Ref: k8snetworkplumbingwg/sriov-cni#219 This workaround should be temporary until the fix [1] can be consumed. [1] k8snetworkplumbingwg/sriov-cni#220 Signed-off-by: Edward Haas <edwardh@redhat.com>
Place a delay between tests to assure resources (VF/s) are fully released before reused again on a new VMI. (results show that waiting for VMI/s to disappear is not enough) Ref: k8snetworkplumbingwg/sriov-cni#219 This workaround should be temporary until the fix [1] can be consumed. [1] k8snetworkplumbingwg/sriov-cni#220 Signed-off-by: Edward Haas <edwardh@redhat.com>
Following the flakes we have all over SR-IOV lanes due to [1] and [2], bump sriov-cni to v2.7.0 in order to consume the fix [3]. [1] kubevirt/kubevirt#6776 [2] k8snetworkplumbingwg/sriov-cni#219 [3] k8snetworkplumbingwg/sriov-cni#220 Signed-off-by: Or Mergi <ormergi@redhat.com>
Following the flakes we have all over SR-IOV lanes due to [1] and [2], bump sriov-cni to v2.7.0 in order to consume the fix [3]. [1] kubevirt/kubevirt#6776 [2] k8snetworkplumbingwg/sriov-cni#219 [3] k8snetworkplumbingwg/sriov-cni#220 Signed-off-by: Or Mergi <ormergi@redhat.com> Signed-off-by: Or Mergi <ormergi@redhat.com>
* Create allocation interface and implementation. This is needed to lock the allocation of the same PCI address until the cmdDel is called or the kernel remove the network namespace. Signed-off-by: Sebastian Sch <sebassch@gmail.com> * Use the allocator interface to prevent allocation of still in used pci addresses Fixes: k8snetworkplumbingwg#219 Signed-off-by: Sebastian Sch <sebassch@gmail.com> Signed-off-by: Sebastian Sch <sebassch@gmail.com>
This commit add a type of lock to allocated pci address
Fixes: #219
Signed-off-by: Sebastian Sch sebassch@gmail.com