-
Notifications
You must be signed in to change notification settings - Fork 339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use csiNode informer to prevent hitting the API server for all time #327
Use csiNode informer to prevent hitting the API server for all time #327
Conversation
Welcome @mucahitkurt! |
Hi @mucahitkurt. Thanks for your PR. I'm waiting for a kubernetes-csi or kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
727cb01
to
ae4316b
Compare
/ok-to-test |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
Was the Gopkg.lock change intentional? There isn't any Gopkg.toml changes.
Did you get a chance to test the provisioner end to end manually?
pkg/controller/topology.go
Outdated
// TODO (#144): use informers | ||
selectedNodeInfo, err := kubeClient.StorageV1beta1().CSINodes().Get(selectedNode.Name, metav1.GetOptions{}) | ||
if err != nil { | ||
key := fmt.Sprintf("%s/%s", selectedNode.Namespace, selectedNode.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Node
objects are non-namespaced. Is this the right key format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's not right, it should be only name
pkg/controller/topology.go
Outdated
|
||
rand.Shuffle(len(nodeInfos.Items), func(i, j int) { | ||
nodeInfos.Items[i], nodeInfos.Items[j] = nodeInfos.Items[j], nodeInfos.Items[i] | ||
nodeInfos := informer.GetIndexer().List() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rename all occurrences of nodeInfo
to csiNode
I know you didn't make the name but it'd be super helpful if you could change it while you are at it, but totally optional
pkg/controller/topology_test.go
Outdated
preBetaNodeVersion = "1.13.0" | ||
testDriverName = "com.example.csi/test-driver" | ||
preBetaNodeVersion = "1.13.0" | ||
interval = 100 * time.Millisecond |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use a more descriptive name - what interval is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll delete this after HasSynced method usage
pkg/controller/topology_test.go
Outdated
Labels: l, | ||
Name: fmt.Sprintf("node-%d", i), | ||
Labels: l, | ||
Namespace: "testnamespace", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Node objects shouldn't have namespaces, so this isn't realistic
pkg/controller/topology_test.go
Outdated
@@ -1528,3 +1550,20 @@ func requisiteEqual(t1, t2 []*csi.Topology) bool { | |||
|
|||
return unchecked.Len() == 0 | |||
} | |||
|
|||
func runCsiNodeInformer(kubeClient *fakeclientset.Clientset, t *testing.T, csiNodeNum int) (cache.SharedIndexInformer, chan struct{}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: csiNodeNum
-> csiNodeCount
. "Num" could be misinterpreted as index, for example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll delete it after HasSynced method usage
@@ -179,9 +181,27 @@ func main() { | |||
provisionerOptions = append(provisionerOptions, controller.AdditionalProvisionerNames([]string{supportsMigrationFromInTreePluginName})) | |||
} | |||
|
|||
// Create informer to prevent hit the API server for all resource request | |||
csiNodeInformer := v1beta1.NewCSINodeInformer(clientset, 1*time.Hour, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull 1 * time.Hour into a named const
pkg/controller/topology_test.go
Outdated
|
||
func waitToObserveCsiNodes(t *testing.T, csiNodeInformer cache.SharedIndexInformer, csiNodeNum int) { | ||
if err := wait.PollImmediate(interval, informerWaitTimeout, func() (bool, error) { | ||
objects := csiNodeInformer.GetIndexer().List() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this just make sure the informer is synced? Could you use informer.HasSynced() instead?
pkg/controller/controller.go
Outdated
@@ -252,6 +255,7 @@ func NewCSIProvisioner(client kubernetes.Interface, | |||
controllerCapabilities: controllerCapabilities, | |||
supportsMigrationFromInTreePluginName: supportsMigrationFromInTreePluginName, | |||
strictTopology: strictTopology, | |||
informer: informer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to wait for the informer to sync.
IIRC other CSI controllers have examples of using informers and they should have sync wait logic
@@ -179,9 +181,27 @@ func main() { | |||
provisionerOptions = append(provisionerOptions, controller.AdditionalProvisionerNames([]string{supportsMigrationFromInTreePluginName})) | |||
} | |||
|
|||
// Create informer to prevent hit the API server for all resource request | |||
csiNodeInformer := v1beta1.NewCSINodeInformer(clientset, 1*time.Hour, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc}) | |||
go csiNodeInformer.Run(context.TODO().Done()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO context.Background() is more appropriate
@@ -179,9 +181,27 @@ func main() { | |||
provisionerOptions = append(provisionerOptions, controller.AdditionalProvisionerNames([]string{supportsMigrationFromInTreePluginName})) | |||
} | |||
|
|||
// Create informer to prevent hit the API server for all resource request | |||
csiNodeInformer := v1beta1.NewCSINodeInformer(clientset, 1*time.Hour, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change 1*time.Hour
into a named const. Ditto for all the other time constants
Thanks for the comments, I'll try to fix them. Yes, Gopkg.lock change is intentional, Gopkg.toml is not changed because I did not add a new library dependency, I just added some package dependency under client-go. No, I didn't make e2e manual test, TBH I'm not very familiar with the all environment of external provisioner but I'll try to run e2e test for the external provisioner. |
ae4316b
to
0e58fc7
Compare
It might be worth implementing topology first in the hostpath csi driver: kubernetes-csi/csi-driver-host-path#54. Then, we have automated pull/ci jobs that can test the topology path. (We'll also need to update those pull jobs to bring up a multi-node cluster) I am also working on adding additional topology tests as part of kubernetes/kubernetes#71289 |
/test pull-kubernetes-csi-external-provisioner-1-13-on-kubernetes-1-13 |
/retest |
@verult Do you have any idea why |
Topology was an alpha feature in 1.13, so the beta object doesn't exist. To avoid this, we should only create the informer if the topology feature gate in the external-provisioner is enabled. |
0e58fc7
to
3614485
Compare
/test pull-kubernetes-csi-external-provisioner-1-15-on-kubernetes-1-15 |
var csiNodeInformer cache.SharedIndexInformer | ||
if utilfeature.DefaultFeatureGate.Enabled(features.Topology) { | ||
// Create informer to prevent hit the API server for all resource request | ||
csiNodeInformer := v1beta1.NewCSINodeInformer(clientset, ctrl.ResyncPeriodOfCsiNodeInformer, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure indexing by namespace makes sense because CSINode is a non-namespaced object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change to cache.Indexers{}
pkg/controller/controller.go
Outdated
@@ -235,7 +239,8 @@ func NewCSIProvisioner(client kubernetes.Interface, | |||
pluginCapabilities connection.PluginCapabilitySet, | |||
controllerCapabilities connection.ControllerCapabilitySet, | |||
supportsMigrationFromInTreePluginName string, | |||
strictTopology bool) controller.Provisioner { | |||
strictTopology bool, | |||
informer cache.SharedIndexInformer) controller.Provisioner { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of passing in the entire informer to the provisioner, how about only passing in the Lister(), since all we need is Get() and List()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change to informer.GetStore()
3614485
to
3ecf51a
Compare
/retest |
1 similar comment
/retest |
7bdb6d2
to
b961349
Compare
var factory informers.SharedInformerFactory | ||
if utilfeature.DefaultFeatureGate.Enabled(features.Topology) { | ||
// Create informer to prevent hit the API server for all resource request | ||
factory = informers.NewSharedInformerFactory(clientset, ctrl.ResyncPeriodOfCsiNodeInformer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the lister is only necessary when topology feature is used, should it use supportsTopology()
instead to capture all conditions?
func (p *csiProvisioner) supportsTopology() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean to change the condition utilfeature.DefaultFeatureGate.Enabled(features.Topology)
with pluginCapabilities[csi.PluginCapability_Service_VOLUME_ACCESSIBILITY_CONSTRAINTS] && utilfeature.DefaultFeatureGate.Enabled(features.Topology)
or do you mean to use csiNodeLister
as initiated inside the csiProvisioner
with the supportsTopology()
check instead of a external provided property of the struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added additional check for plugin capabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup this is what I meant, though in its current form, I'm worried that someone later on will only change one place but not the other. It would be good to have a shared utility function instead in a different file (maybe package/util)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thanks. Putting the function inside topology.go
instead of util package makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about the delay, just a few more last comments and everything else looks good!
pkg/controller/topology.go
Outdated
// error with the API server. | ||
return nil, fmt.Errorf("error listing CSINodes: %v", err) | ||
klog.Warningf("No csi nodes found") | ||
return nil, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come it's OK to not return an error now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't remember any valid case for this change, I think I changed it in a wrong way, I'll change it to return error.
b961349
to
704a706
Compare
704a706
to
71473f7
Compare
71473f7
to
70c2877
Compare
70c2877
to
6061117
Compare
6061117
to
14e1e38
Compare
@verult PTAL |
14e1e38
to
d5b98b8
Compare
/lgtm Thanks @mucahitkurt ! |
@@ -168,10 +168,6 @@ k8s.io/api/apps/v1 | |||
k8s.io/api/apps/v1beta1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is updating this file necessary? it looks like just some lines got moved around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes content is the same, just some order changes, but I'm getting error from CI like below when I don't change this file with the below command;
ERROR: vendor directory *not* up-to-date, it did get modified by 'GO111MODULE=on go mod vendor':
d5b98b8
to
b1bccd2
Compare
…source request Signed-off-by: Mucahit Kurt <mucahitkurt@gmail.com>
b1bccd2
to
17fa2a8
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: msau42, mucahitkurt The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…master master: update release-tools
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Use csiNode shared index informer instead of hitting the API server for all csiNode resource request.
Which issue(s) this PR fixes:
Fixes #144
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
cc @msau42