Feature request: mdadm disk fail metric #261
Maybe instead of
May I ask if this issue is still being pursued, with PR #492 closed?
I didn't manage to find the corresponding PR and can't check on a real system right now. However, the issue looks addressed, so let's close it for now.
@hryamzik (and anyone else searching for why node_exporter doesn't have a metric for md software RAID disk states, e.g. failed, active, etc.): I found a few PRs that add the disk states, but none of them got merged. Seems like there's more debate than consensus in those PRs, and they get closed over time. See that PR and the updated md_info text collector here...
@mpursley You're right, the PRs weren't merged. As mentioned here: #648 (comment)
Yeah, makes sense. Another option people can use in the meantime is this (now merged) text_collector script (run via a cronjob as root): https://github.com/prometheus/node_exporter/blob/master/text_collector_examples/md_info_detail.sh Thanks @discordianfish
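For anyone new to the textfile-collector pattern mentioned above: the cronjob runs the script and drops its output into the directory node_exporter scrapes via `--collector.textfile.directory`. A minimal sketch, with assumptions: the directory path is made up, and an `echo` stands in for the real md_info_detail.sh output.

```shell
# Sketch of the textfile-collector cron pattern. TEXTFILE_DIR is an
# assumption -- point it at whatever --collector.textfile.directory is set to.
TEXTFILE_DIR="${TEXTFILE_DIR:-/tmp/textfile}"
mkdir -p "$TEXTFILE_DIR"

# Stand-in for `md_info_detail.sh`: emit one sample metric line.
# Write to a temp file first, then rename -- the rename is atomic, so
# node_exporter never scrapes a half-written .prom file.
echo 'node_md_info{md_device="md0"} 1' > "$TEXTFILE_DIR/md_info.prom.$$"
mv "$TEXTFILE_DIR/md_info.prom.$$" "$TEXTFILE_DIR/md_info.prom"
```

The temp-file-then-rename step matters because the textfile collector reads `.prom` files on every scrape.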
Hi everyone. I was planning to abstract away the stat-extraction complexity in
@You-NeverKnow Yes, sounds great, we've been moving all of the generic
I'm confused. I was under the impression that we were using the GET_ARRAY_INFO ioctl call to retrieve RAID array statuses, per this comment: #648 (comment). Would you rather use the mdstat parser in procfs instead?
#648 was never merged, so we're still just doing proc file parsing. We could go the syscall route or the parsing route. I haven't looked at the mdadm stuff recently, but AFAIK there's some information you can only get by parsing. Also, we need to make sure any syscalls are available as non-root; we don't allow code in node_exporter that requires root-level access, for safety.
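To illustrate the proc-parsing route: failed member devices show up in `/proc/mdstat` with an `(F)` marker after the device name. A minimal sketch that counts them per array; the sample input below is made up and stands in for the real `/proc/mdstat`.

```shell
# Fabricated /proc/mdstat sample: md0 is a degraded raid1 with sdb1 failed.
cat > /tmp/mdstat.sample <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdb1[1](F) sda1[0]
      8380416 blocks super 1.2 [2/1] [U_]
EOF

# gsub() returns the number of "(F)" markers replaced on each mdN line,
# which is the count of failed member devices in that array.
awk '/^md/ { failed = gsub(/\(F\)/, ""); print $1, "failed:", failed }' /tmp/mdstat.sample
```

This is only the easy part of parsing; the real collector also has to handle resync/recovery progress lines and the `[n/m]` disk counts, which is why the thread pushed that complexity down into the procfs library.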
* Closes issue #261 on node_exporter. Delegated mdstat parsing to the procfs project; mdadm_linux.go now only exports the metrics.
-> Added disk labels "fail", "spare", "active" to indicate disk status
-> Changed metric node_md_disks_total ==> node_md_disks_required
-> Removed test cases for mdadm_linux.go, as the functionality they tested has moved to the procfs project.
Signed-off-by: Advait Bhatwadekar <advait123@ymail.com>
Just came across this very useful feature request. Has anything changed since last year? For users:
Yes, I would say so. #1403 was merged, which included the refactoring into procfs and the addition of the state label. I think the merge of that PR was also supposed to close this issue? @SuperQ The change is part of v1.0.0-rc.0.
Awesome, I switched to the v1.0.0-rc.0 version and now get the metrics I wanted. I would also consider the ticket closed 😄 Thanks for the fast help!
Great, and thanks for confirming. Closing.
Node exporter doesn't report the number of failed disks in mdadm, which is probably the most useful metric for this collector.
mdstat:
current metrics:
P.S.: I do see the `node_md_disks - node_md_disks_active` calculation, but I'm not sure how it should work with hot spares.
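With the state label merged in #1403, the subtraction workaround (which miscounts when hot spares are present, since spares are neither active nor failed) is no longer needed. A hedged query sketch, assuming the post-#1403 label values on `node_md_disks`:

```
# Failed member disks per md array (state label values assumed to be
# "active", "failed", "spare" per PR #1403):
node_md_disks{state="failed"}
```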