-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reset the VF if failure occurs before the netconf is cached #222
Reset the VF if failure occurs before the netconf is cached #222
Conversation
/cc @adrianchiris |
cmd/sriov/main.go
Outdated
@@ -94,6 +94,8 @@ func cmdAdd(args *skel.CmdArgs) error { | |||
}) | |||
if err == nil { | |||
_ = sm.ReleaseVF(netConf, args.IfName, args.ContainerID, netns) | |||
// Reset the VF if failure occurs before the netconf is cached | |||
_ = sm.ResetVFConfig(netConf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not the right place.
we need to run the reset not only for non dpdk mode.
I will like to see a defer function going before
if err := sm.ApplyVFConfig(netConf); err != nil {
but for this we need to split that function to the point where we field the origin info.
can you take this part to another function?
pfLink, err := s.nLink.LinkByName(conf.Master)
if err != nil {
return fmt.Errorf("failed to lookup master %q: %v", conf.Master, err)
}
// Save current the VF state before modifying it
vfState := getVfInfo(pfLink, conf.VFID)
if vfState == nil {
return fmt.Errorf("failed to find vf %d", conf.VFID)
}
conf.OrigVfState.FillFromVfInfo(vfState)
and then you can have something like
sm := sriov.NewSriovManager()
(new filed origin function)
(defer here with the reset function inside)
if err := sm.ApplyVFConfig(netConf); err != nil {
return fmt.Errorf("SRIOV-CNI failed to configure VF %q", err)
}
also another change is we need a global err that we can use so if there is an error down in the stack the LIFO method of the defer function will call the release in the right order
- ipam release
- release vf
- resetVFconfig
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the requested changes kindly check.
fbc2673
to
f5e9e56
Compare
Pull Request Test Coverage Report for Build 3412540026Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there , thx for contributing !
a few comments.
once reworked it would be great if you squash to a single commit and add a descriptive commit message.
this brings #115 one step closer to completion :)
cmd/sriov/main.go
Outdated
if err == nil { | ||
_ = sm.ReleaseVF(netConf, args.IfName, args.ContainerID, netns) | ||
// Reset the VF if failure occurs before the netconf is cached | ||
_ = sm.ResetVFConfig(netConf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should always do this IMO, regardless of the fact that netdev is present in container namespace.
currently if ApplyVFConfig fails it will not call ResetVFConfig here.
so lets call it after L91
pkg/sriov/sriov.go
Outdated
@@ -126,7 +126,8 @@ type Manager interface { | |||
SetupVF(conf *sriovtypes.NetConf, podifName string, cid string, netns ns.NetNS) (string, error) | |||
ReleaseVF(conf *sriovtypes.NetConf, podifName string, cid string, netns ns.NetNS) error | |||
ResetVFConfig(conf *sriovtypes.NetConf) error | |||
ApplyVFConfig(conf *sriovtypes.NetConf) error | |||
ApplyVFConfig(conf *sriovtypes.NetConf, pfLink netlink.Link) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you change this interface method ? just to avoid getting netlink handle twice ?
I think it should be kept as is for the sake of simplicity as before
pkg/sriov/sriov.go
Outdated
@@ -126,7 +126,8 @@ type Manager interface { | |||
SetupVF(conf *sriovtypes.NetConf, podifName string, cid string, netns ns.NetNS) (string, error) | |||
ReleaseVF(conf *sriovtypes.NetConf, podifName string, cid string, netns ns.NetNS) error | |||
ResetVFConfig(conf *sriovtypes.NetConf) error | |||
ApplyVFConfig(conf *sriovtypes.NetConf) error | |||
ApplyVFConfig(conf *sriovtypes.NetConf, pfLink netlink.Link) error | |||
FillOriginalVfInfo(conf *sriovtypes.NetConf) (netlink.Link, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name of the method and what it returns is a bit confusing.
Id prefer that it will just do one thing and fill the original VF info.
pkg/sriov/sriov.go
Outdated
vfState := getVfInfo(pfLink, conf.VFID) | ||
if vfState == nil { | ||
return pfLink, fmt.Errorf("failed to find vf %d", conf.VFID) | ||
//return fmt.Errorf("failed to find vf %d", conf.VFID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove line
pkg/sriov/sriov.go
Outdated
@@ -374,6 +364,22 @@ func (s *sriovManager) ApplyVFConfig(conf *sriovtypes.NetConf) error { | |||
return nil | |||
} | |||
|
|||
// Fills original VF state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit add golang docs string
cmd/sriov/main.go
Outdated
if err := sm.ApplyVFConfig(netConf); err != nil { | ||
pfLink, err := sm.FillOriginalVfInfo(netConf) | ||
if err != nil { | ||
return fmt.Errorf("failed to lookup master %q: %v", netConf.Master, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error string should be: "failed to get original vf information: %v"
the needed info is already part of err
pkg/sriov/sriov.go
Outdated
@@ -374,6 +364,22 @@ func (s *sriovManager) ApplyVFConfig(conf *sriovtypes.NetConf) error { | |||
return nil | |||
} | |||
|
|||
// Fills original VF state | |||
func (s *sriovManager) FillOriginalVfInfo(conf *sriovtypes.NetConf) (netlink.Link, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add unit tests for this new method ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made the changes that you have asked, kindly check it.
f5e9e56
to
a9aa458
Compare
return fmt.Errorf("failed to get original vf information: %v", err) | ||
} | ||
defer func() { | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a call here in this function:
if len(newResult.IPs) == 0 {
return errors.New("IPAM plugin returned missing IP config")
}
Where err wouldn't be != nil. Should we also handle that as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I might be misunderstanding,if len(newResult.IPs) == 0
we are returning the error and it will not be equal to nil, defer function will release ipam and vf, and resetVFConfig. Is it wrong flow? Can you elaborate bit more what is the issue here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The defer function checks that "if err != nil" and in that specific case err will equal nil since it is a length check. So should we actually do something like this?
if len(newResult.IPs) == 0 {
err = ...
return err
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I missed that. Now its fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good now, thanks!
I would like to know how can I run these PR tests myself if I want to? I have seen /retest in one of the PR, wasn't able find any other. Thanks. |
a9aa458
to
39f61af
Compare
39f61af
to
79d287f
Compare
79d287f
to
1eb89e7
Compare
@SchSeba kindly review and trigger tests here. |
1eb89e7
to
bcf7aaf
Compare
/test-all |
PTAL :) |
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This will not work as expected from here
you can't allocate the new err variable or it will not get caught by the defer function. |
bcf7aaf
to
e92d077
Compare
@SchSeba Fixed, there was another place with similar allocation that is also fixed. kindly check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @adilGhaffarDev it looks good now
/lgtm
This PR fixes following issue:
#212