-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sysinfo: Ignore "hidden" sysfs device entries #3260
Conversation
Hi @sagigrimberg. Thanks for your PR. I'm waiting for a google member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Thanks for the PR. What is the issue this is solving currently? Are these devices somehow reported currently? |
Kubelet fails to access these devices, preventing csi metrics collection. kubelet[4192]: E0102 10:01:46.083009 4192 info.go:99] Failed to get disk map: open /sys/block/nvme0c1n1/dev: no such file or directory |
Got it. Is the |
Yes, it is a common way. You can also refer to util-linux/lsblk.c that also does the same thing... Unfortunately, we didn't add a proper documentation entry for it in sysfs/block ABI documentation |
d1e25fc
to
38d4b11
Compare
ping? |
How can I trigger tests on this PR? |
/ok-to-test |
Can I get a review on this? |
ping? |
38d4b11
to
1dfb815
Compare
Any reason why this PR was not merged already? @bobbypage @iwankgb ? |
1dfb815
to
4149224
Compare
@bobbypage this is sitting here forever now, any reason why this is not taken? |
/retest |
retest |
@bobbypage can we get this in please? The failed test looks completely unrelated to the change? This has been sitting here forever and I don't understand why this should be the case TBH. This PR has been primarily ignored while fixing a real issue in cadvisor. |
ea1dac6
to
2393946
Compare
/retest |
@sagigrimberg, while I understand your sentiment regarding maintainers not reviewing code in timely manner, I have to say that majority of us are volunteers and we work on cAdvisor in our free time. |
@iwankgb I was not assigning blame. I was just stating a fact. This PR was primarily ignored. This PR stands since March 2'nd, which since then cadvisor accumulated a few dosands of of PRs merged, while this one has gone, presumably ready to merge since about then (outside of breaking lately and me fixing it after a rebase). If you don't have free time to work on the project, or you don't think that it is a suitable fix for the project, or if you think that there is risk associated it is fine, but it would be appreciated to get at least some feedback... Glad this at least got some attention though... |
utils/sysfs/sysfs.go
Outdated
// https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git | ||
// - c8487d854ba5 ("lsblk: Ignore hidden devices") | ||
hidden, err := os.ReadFile(path.Join(blockDir, name, "/hidden")) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-nil error that is not os.ErrNotExists must not be silently ignored. Returned wrapped error and handle it accordingly wherever this function is called, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense @iwankgb, fixing this now.
2393946
to
b7b05f8
Compare
utils/sysfs/sysfs.go
Outdated
return false, nil | ||
} | ||
if err != nil { | ||
return false, fmt.Errorf("failed to read %s: %s", devHiddenPath, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather wrap the error:
return false, fmt.Errorf("failed to read %s: %s", devHiddenPath, err) | |
return false, fmt.Errorf("failed to read %s: %w", devHiddenPath, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some devices are "hidden" from userspace due to various reasons. One reason is native nvme multipathing, where the path devices do not present a device node to the user but only exposes the stacking device as the only nvme device. Still there are sysfs attributes for these hidden devices, but we should ignore them altogether because these are not "real" devices at all. Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
b7b05f8
to
88bfa0d
Compare
Some devices are "hidden" from userspace due to various reasons. One reason is native nvme multipathing, where the path devices do not present a device node to the user but only exposes the stacking device as the only nvme device. Still there are sysfs attributes for these hidden devices, but we should ignore them altogether because these are not "real" devices at all.