-
Notifications
You must be signed in to change notification settings - Fork 55
Conversation
b4096b7
to
1b8bb71
Compare
788a921
to
eaa0e65
Compare
'Late binding unsets unsuitable selected node' test fails on both 1.19 and 1.20: |
I could reproduce that locally. After the raw conversion test ran and (temporarily) had the CSI driver running on the master node, kubelet keeps the driver listed in CSINode for that node. Further investigation showed that this is because we leave the registration socket in I'm checking with upstream (#csi channel, in case you want to follow) how this is supposed to be done. |
Updating node-driver-registrar fixes that. I also added an E2E test for it. Or rather, I first added the test, it failed, and then I found the solution... |
The "better volume leak check" worked as intended and points towards the LVM CSINode test as culprit: https://cloudnative-k8sci.southcentralus.cloudapp.azure.com/blue/organizations/jenkins/pmem-csi/detail/PR-919/10/tests |
9333845
to
1e64e0a
Compare
It's the same version also used by current Kubernetes and we want to stay up-to-date regarding the runtime.
Node controller secrets are no more used by the driver, so it is unnecessary to generate them. The only secret/certificates the controller driver required is the webhook secrets. This shouldn't be a problem for running version skew tests as those tests uses the version specific deployment script. So downgrading the driver version to v0.8 requires those secrets and they gets generated by the setup scripts of that revision of code.
With the recent(a519ba3) change now gotests only run device-manager and imagefile tests. So passing of certificates and CRDs no more required to run these tests inside the pod.
Moving to version skew testing to base release 0.9. We no need to carry 0.8 specific workarounds anymore. Also fixed controller skew test that was using wrong statefulset name while retrieving controller.
…ller' Using 'pmem-registry' as prefix for controller certificate is not relevant anymore. 'pmem-controller' is more appropriate.
Commit 40ba355 intentionally captured stderr in the output because earlier it was hard to understand why LVM commands failed. However, we now learned the some error messages can be safely ignored and not doing so broke LVM mode: when /dev/sda is unreadable, "vgs" prints a line about it on stderr, but the output on stdout is fine. Now we capture both output streams separately, log both, but only return stdout to the caller. The TestResult call is unrelated to that fix and will be used for additional debugging in another PR. It gets added here because it can use the same test cases and is related.
This addresses some nits found during review of intel#921
This has different goals: - by introducing abstract interfaces it will be possible to mock the library during unit testing - via type aliases it becomes possible to drop several explicit type conversions and extra variables - with a String() method it is no longer necessary to offer non-standard MarshalJSON methods - by moving helper code that only needs the abstrace interface into stand-alone functions those functions can also be used together with a mock language binding It is unclear why, but because of those changes, k8s.io/component-base/logs gets pulled into the test binary for pkg/scheduler which caused the klog flags to be initialized twice.
Reconfiguring an existing namespace needs the same alignment checks and PFN support as creating a namespace. It also needs the ability to disable the namespace.
The new functionality gets added to the existing binary to minimize overall image size. Unit testing with a fake ndctl implementation is not very realistic because it does not mimick all constraints of the actual library, but it's still better than none and can be used at least to verify the behavior when invoked in different hardware environments, something that would be difficult in a E2E testing.
e37ffe8
to
410731f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@avalluri: can you review?
Other than the documentation and updating daemonset fields, the rest looks good to me.
docs/install.md
Outdated
only the central controller pod will run. | ||
1. For each node that has one or more raw namespaces that all need | ||
to be converted, set the `<driver name>/convert-raw-namespaces` label (usually | ||
`pmem-csi.intel.com/convert-raw-namespaces` to `force`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing closing ')'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
conversion. | ||
|
||
The output of a successful conversion will look like this: | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, as there is no easy way for the user to check the logs, keeping this log output is not required in the installation documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log of a successful conversion is useful when comparing against an unsuccessful one because it shows which messages are normal and which aren't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reworded the section about how to see these logs.
ds.Labels["app.kubernetes.io/component"] = "node-setup" | ||
ds.Labels["app.kubernetes.io/instance"] = d.Name | ||
|
||
ds.Spec = appsv1.DaemonSetSpec{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid unnecessarily object patching, can you please change this such that it set's only interested daemonset fields and void unsettling the default fields as in #931.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
410731f
to
8e8ded8
Compare
@avalluri okay now? |
// Node setup must run as root user | ||
RunAsUser: &root, | ||
}, | ||
TerminationMessagePath: "/tmp/termination-log", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please also set containers TerminationMessagePolicy
to the default value of corev1.TerminationMessageReadFile
. Otherwise, it ends up patching the deamonset on every reconcile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really is impossible to get right without a test.
Anyway, added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really is impossible to get right without a test.
Yes, true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we fix the new daemonset container's TerminationMessagePolicy
, then looks good to me.
This creates the DaemonSet which runs on nodes with the "<driver name>/convert-raw-namespaces: force" label. It's inactive if that label is never set by a user. Node pods are now allowed to run on the master node. This is usually not necessary and won't be triggered because the required labels won't be set for that node, but if someone wants that, we should allow it, i.e. tolerate the node taint. The immediate usage for this will be testing of raw namespace conversion on the master node of the QEMU cluster.
QEMU substracts space for the labels from the backing store file. To ensure that the size inside the VM nicely aligns, we have to compensate for that.
Will be usful for testing raw namespace conversion.
Namespaces on legacy PMEM do not support UUID, size and most importantly, the alt name. The conversion code now skips setting those just like the ndctl binary does. The effect should be that conversion works, but then further work will be needed to get LVM up and running on that namespace.
There were still issues with the Go implementation ("failed to run driver: set fsdax mode: Failed to set enforce mode: No such device or address", intel#884 (comment)). Trying to get it to emulate all quirks and workarounds of the ndctl binaries seems (in retrospect...) unnecessarily complex. We already include the binary in our images and likely always will for debugging, so we might as well just call the binary. The output is easily parsable after all.
If something goes wrong, then ndctl and vgdisplay output is useful to determine what the state of the node is.
For legacy PMEM we cannot rely on the alt name. When conversion is explicitly requested, we can and have to create the expected volume group directly, then the normal driver will use it without checking the alt name. For the sake of consistency with the normal mode, only active regions are considered. This made no difference so far, but if we ever really encounter an inactive region, then we probably cannot use it anyway because our code does not activate it.
When a user explicitly asks for conversion, the expected outcome is that afterwards the PMEM-CSI driver can run in LVM mode. By checking that explicitly and throwing an error if it is not the case instead of silently not starting the LVM driver the user can investigate by reading the logs of the failed conversion pod.
Without that information it is hard to tell what is being done based on the logs.
This is useful for users. One common problem is that a node is not prepared properly and then PMEM-CSI runs with zero PMEM, which was not obvious from the log output. It's also going to be useful for testing of raw namespace conversion.
This ensures that: - pod scheduling really works as planned - ndctl invocations correctly transform the raw namespace into fsdax
The registration socket must be removed by node-driver-registrar when it terminates, otherwise kubelet will not update CSINode. The newer v2.1.0 node-driver-register does that. The corresponding E2E test failed before that image update and thus covers the issue.
The previous approach tried to reduce the output of the underlying commands. For direct mode, that involved sorting by keyword, which made error messages almost impossible to read ("not meant to be pretty-printed for human"). The problem is that these error messages *do* get read by humans when a volume leaks. The new approach simply splits output into lines with a prefix that gives enough context (host + mode) for the volume and then uses stretchr/testify/assert to print a diff. That's the part that then is human-readable, with full list of lines provided as additional context. Examples below. In addition, the master node also gets checked. This is important because some tests now run there (raw namespace conversion). Also, LVM and direct mode both get checked for more tests. We can no longer assume that a test only touches volumes of one kind because the driver has become more flexible than that and even before might have had bugs that caused other changes. • Failure in Spec Teardown (AfterEach) [5.560 seconds] lvm-testing /nvme/gopath/src/github.com/intel/pmem-csi/test/e2e/deploy/deploy.go:1304 sanity [AfterEach] /nvme/gopath/src/github.com/intel/pmem-csi/test/e2e/deploy/deploy.go:1289 Node Service /nvme/gopath/pkg/mod/github.com/kubernetes-csi/csi-test/v3@v3.1.1/pkg/sanity/tests.go:44 should work /nvme/gopath/pkg/mod/github.com/kubernetes-csi/csi-test/v3@v3.1.1/pkg/sanity/node.go:779 Error Trace: volumeleaks.go:78 deploy.go:990 runner.go:113 runner.go:64 setup_nodes.go:15 spec.go:180 spec.go:218 spec.go:138 spec_runner.go:200 spec_runner.go:170 spec_runner.go:66 suite.go:79 ginkgo_dsl.go:229 ginkgo_dsl.go:217 e2e.go:169 e2e_test.go:71 Error: Not equal: expected: deploy.Volumes{"host #0, LVM: ", "host #0, direct: ", "host intel#1, LVM: sanity-09f3bdaa1cf7ba19a84b3d98f13566f15b5f7fa243af965cc3f3f72c ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-4c41bdae3a73008815097201e98b73cd93c7651d2f500c056a841a9b ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-6db7a339756c86301de2b2017abf4cc2ca66a1d52982f66e7ac0299d ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: ", "host intel#1, direct: mode:fsdax,size:32212254720,uuid:6c79e657-9b9d-11eb-a703-0e16888583db,", "host intel#2, LVM: ", "host intel#2, direct: mode:fsdax,size:32212254720,uuid:6bfc18d7-9b9d-11eb-bd85-7e089b645596,", "host intel#3, LVM: ", "host intel#3, direct: mode:fsdax,size:32212254720,uuid:6c007833-9b9d-11eb-9e8f-4e1e819e52d7,"} actual : deploy.Volumes{"host #0, LVM: ", "host #0, direct: ", "host intel#1, LVM: sanity-09f3bdaa1cf7ba19a84b3d98f13566f15b5f7fa243af965cc3f3f72c ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-4c41bdae3a73008815097201e98b73cd93c7651d2f500c056a841a9b ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-6db7a339756c86301de2b2017abf4cc2ca66a1d52982f66e7ac0299d ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-fcd36a1f738ddb2a778fc0c2db31a15d03734daf8967525aa6698efd ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: ", "host intel#1, direct: mode:fsdax,size:32212254720,uuid:6c79e657-9b9d-11eb-a703-0e16888583db,", "host intel#2, LVM: ", "host intel#2, direct: mode:fsdax,size:32212254720,uuid:6bfc18d7-9b9d-11eb-bd85-7e089b645596,", "host intel#3, LVM: ", "host intel#3, direct: mode:fsdax,size:32212254720,uuid:6c007833-9b9d-11eb-9e8f-4e1e819e52d7,"} Diff: --- Expected +++ Actual @@ -1,2 +1,2 @@ -(deploy.Volumes) (len=11) { +(deploy.Volumes) (len=12) { (string) (len=14) "host #0, LVM: ", @@ -6,2 +6,3 @@ (string) (len=113) "host intel#1, LVM: sanity-6db7a339756c86301de2b2017abf4cc2ca66a1d52982f66e7ac0299d ndbus0region0fsdax -wi-a----- 4.00m", + (string) (len=113) "host intel#1, LVM: sanity-fcd36a1f738ddb2a778fc0c2db31a15d03734daf8967525aa6698efd ndbus0region0fsdax -wi-a----- 4.00m", (string) (len=14) "host intel#1, LVM: ", Test: lvm-testing sanity Node Service should work /nvme/gopath/src/github.com/intel/pmem-csi/test/e2e/deploy/volumeleaks.go:78 • Failure in Spec Teardown (AfterEach) [5.575 seconds] direct-testing /nvme/gopath/src/github.com/intel/pmem-csi/test/e2e/deploy/deploy.go:1304 sanity [AfterEach] /nvme/gopath/src/github.com/intel/pmem-csi/test/e2e/deploy/deploy.go:1289 Node Service /nvme/gopath/pkg/mod/github.com/kubernetes-csi/csi-test/v3@v3.1.1/pkg/sanity/tests.go:44 should work /nvme/gopath/pkg/mod/github.com/kubernetes-csi/csi-test/v3@v3.1.1/pkg/sanity/node.go:779 Error Trace: volumeleaks.go:74 deploy.go:990 runner.go:113 runner.go:64 setup_nodes.go:15 spec.go:180 spec.go:218 spec.go:138 spec_runner.go:200 spec_runner.go:170 spec_runner.go:66 suite.go:79 ginkgo_dsl.go:229 ginkgo_dsl.go:217 e2e.go:169 e2e_test.go:71 Error: Not equal: expected: deploy.Volumes{"host #0, LVM: ", "host #0, direct: ", "host intel#1, LVM: sanity-09f3bdaa1cf7ba19a84b3d98f13566f15b5f7fa243af965cc3f3f72c ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-4c41bdae3a73008815097201e98b73cd93c7651d2f500c056a841a9b ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-6db7a339756c86301de2b2017abf4cc2ca66a1d52982f66e7ac0299d ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-fcd36a1f738ddb2a778fc0c2db31a15d03734daf8967525aa6698efd ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: ", "host intel#1, direct: [", "host intel#1, direct: {", "host intel#1, direct: \"dev\":\"namespace0.1\",", "host intel#1, direct: \"mode\":\"fsdax\",", "host intel#1, direct: \"map\":\"dev\",", "host intel#1, direct: \"size\":1073741824,", "host intel#1, direct: \"uuid\":\"518a57c1-9ba0-11eb-9a15-86510ed27694\",", "host intel#1, direct: \"sector_size\":512,", "host intel#1, direct: \"align\":1073741824,", "host intel#1, direct: \"blockdev\":\"pmem0.1\",", "host intel#1, direct: \"name\":\"sanity-0512c2daa4f8cd859c714b167ee80ed890e4208fb04471e4fe205451\"", "host intel#1, direct: },", "host intel#1, direct: {", "host intel#1, direct: \"dev\":\"namespace0.0\",", "host intel#1, direct: \"mode\":\"fsdax\",", "host intel#1, direct: \"map\":\"dev\",", "host intel#1, direct: \"size\":32212254720,", "host intel#1, direct: \"uuid\":\"6c79e657-9b9d-11eb-a703-0e16888583db\",", "host intel#1, direct: \"sector_size\":512,", "host intel#1, direct: \"align\":1073741824,", "host intel#1, direct: \"blockdev\":\"pmem0\",", "host intel#1, direct: \"name\":\"pmem-csi\"", "host intel#1, direct: }", "host intel#1, direct: ]", "host intel#1, direct: ", "host intel#2, LVM: ", "host intel#2, direct: [", "host intel#2, direct: {", "host intel#2, direct: \"dev\":\"namespace0.0\",", "host intel#2, direct: \"mode\":\"fsdax\",", "host intel#2, direct: \"map\":\"dev\",", "host intel#2, direct: \"size\":32212254720,", "host intel#2, direct: \"uuid\":\"6bfc18d7-9b9d-11eb-bd85-7e089b645596\",", "host intel#2, direct: \"sector_size\":512,", "host intel#2, direct: \"align\":1073741824,", "host intel#2, direct: \"blockdev\":\"pmem0\",", "host intel#2, direct: \"name\":\"pmem-csi\"", "host intel#2, direct: }", "host intel#2, direct: ]", "host intel#2, direct: ", "host intel#3, LVM: ", "host intel#3, direct: [", "host intel#3, direct: {", "host intel#3, direct: \"dev\":\"namespace0.0\",", "host intel#3, direct: \"mode\":\"fsdax\",", "host intel#3, direct: \"map\":\"dev\",", "host intel#3, direct: \"size\":32212254720,", "host intel#3, direct: \"uuid\":\"6c007833-9b9d-11eb-9e8f-4e1e819e52d7\",", "host intel#3, direct: \"sector_size\":512,", "host intel#3, direct: \"align\":1073741824,", "host intel#3, direct: \"blockdev\":\"pmem0\",", "host intel#3, direct: \"name\":\"pmem-csi\"", "host intel#3, direct: }", "host intel#3, direct: ]", "host intel#3, direct: "} actual : deploy.Volumes{"host #0, LVM: ", "host #0, direct: ", "host intel#1, LVM: sanity-09f3bdaa1cf7ba19a84b3d98f13566f15b5f7fa243af965cc3f3f72c ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-4c41bdae3a73008815097201e98b73cd93c7651d2f500c056a841a9b ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-6db7a339756c86301de2b2017abf4cc2ca66a1d52982f66e7ac0299d ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: sanity-fcd36a1f738ddb2a778fc0c2db31a15d03734daf8967525aa6698efd ndbus0region0fsdax -wi-a----- 4.00m", "host intel#1, LVM: ", "host intel#1, direct: [", "host intel#1, direct: {", "host intel#1, direct: \"dev\":\"namespace0.2\",", "host intel#1, direct: \"mode\":\"fsdax\",", "host intel#1, direct: \"map\":\"dev\",", "host intel#1, direct: \"size\":1073741824,", "host intel#1, direct: \"uuid\":\"3b665ace-9ba1-11eb-9a15-86510ed27694\",", "host intel#1, direct: \"sector_size\":512,", "host intel#1, direct: \"align\":1073741824,", "host intel#1, direct: \"blockdev\":\"pmem0.2\",", "host intel#1, direct: \"name\":\"sanity-b8f9f5e5110b6c7c2d7318c441a1fcde435bf48357f264cc2124267e\"", "host intel#1, direct: },", "host intel#1, direct: {", "host intel#1, direct: \"dev\":\"namespace0.1\",", "host intel#1, direct: \"mode\":\"fsdax\",", "host intel#1, direct: \"map\":\"dev\",", "host intel#1, direct: \"size\":1073741824,", "host intel#1, direct: \"uuid\":\"518a57c1-9ba0-11eb-9a15-86510ed27694\",", "host intel#1, direct: \"sector_size\":512,", "host intel#1, direct: \"align\":1073741824,", "host intel#1, direct: \"blockdev\":\"pmem0.1\",", "host intel#1, direct: \"name\":\"sanity-0512c2daa4f8cd859c714b167ee80ed890e4208fb04471e4fe205451\"", "host intel#1, direct: },", "host intel#1, direct: {", "host intel#1, direct: \"dev\":\"namespace0.0\",", "host intel#1, direct: \"mode\":\"fsdax\",", "host intel#1, direct: \"map\":\"dev\",", "host intel#1, direct: \"size\":32212254720,", "host intel#1, direct: \"uuid\":\"6c79e657-9b9d-11eb-a703-0e16888583db\",", "host intel#1, direct: \"sector_size\":512,", "host intel#1, direct: \"align\":1073741824,", "host intel#1, direct: \"blockdev\":\"pmem0\",", "host intel#1, direct: \"name\":\"pmem-csi\"", "host intel#1, direct: }", "host intel#1, direct: ]", "host intel#1, direct: ", "host intel#2, LVM: ", "host intel#2, direct: [", "host intel#2, direct: {", "host intel#2, direct: \"dev\":\"namespace0.0\",", "host intel#2, direct: \"mode\":\"fsdax\",", "host intel#2, direct: \"map\":\"dev\",", "host intel#2, direct: \"size\":32212254720,", "host intel#2, direct: \"uuid\":\"6bfc18d7-9b9d-11eb-bd85-7e089b645596\",", "host intel#2, direct: \"sector_size\":512,", "host intel#2, direct: \"align\":1073741824,", "host intel#2, direct: \"blockdev\":\"pmem0\",", "host intel#2, direct: \"name\":\"pmem-csi\"", "host intel#2, direct: }", "host intel#2, direct: ]", "host intel#2, direct: ", "host intel#3, LVM: ", "host intel#3, direct: [", "host intel#3, direct: {", "host intel#3, direct: \"dev\":\"namespace0.0\",", "host intel#3, direct: \"mode\":\"fsdax\",", "host intel#3, direct: \"map\":\"dev\",", "host intel#3, direct: \"size\":32212254720,", "host intel#3, direct: \"uuid\":\"6c007833-9b9d-11eb-9e8f-4e1e819e52d7\",", "host intel#3, direct: \"sector_size\":512,", "host intel#3, direct: \"align\":1073741824,", "host intel#3, direct: \"blockdev\":\"pmem0\",", "host intel#3, direct: \"name\":\"pmem-csi\"", "host intel#3, direct: }", "host intel#3, direct: ]", "host intel#3, direct: "} Diff: --- Expected +++ Actual @@ -1,2 +1,2 @@ -(deploy.Volumes) (len=62) { +(deploy.Volumes) (len=73) { (string) (len=14) "host #0, LVM: ", @@ -9,2 +9,13 @@ (string) (len=18) "host intel#1, direct: [", + (string) (len=18) "host intel#1, direct: {", + (string) (len=32) "host intel#1, direct: \"mode\":\"fsdax\",", + (string) (len=29) "host intel#1, direct: \"map\":\"dev\",", + (string) (len=35) "host intel#1, direct: \"size\":1073741824,", + (string) (len=63) "host intel#1, direct: \"uuid\":\"3b665ace-9ba1-11eb-9a15-86510ed27694\",", + (string) (len=35) "host intel#1, direct: \"sector_size\":512,", + (string) (len=36) "host intel#1, direct: \"align\":1073741824,", + (string) (len=89) "host intel#1, direct: \"name\":\"sanity-b8f9f5e5110b6c7c2d7318c441a1fcde435bf48357f264cc2124267e\"", + (string) (len=19) "host intel#1, direct: },", (string) (len=18) "host intel#1, direct: {", Test: direct-testing sanity Node Service should work /nvme/gopath/src/github.com/intel/pmem-csi/test/e2e/deploy/volumeleaks.go:78
8e8ded8
to
5473860
Compare
Changed, please merge. |
This automates the process of converting a raw namespace on VMware vSphere to fsdax mode.
Fixes: #884