PV monitoring proposal #1484
Conversation
@NickrenREN: GitHub didn't allow me to request PR reviews from the following users: ddysher. Note that only kubernetes members can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
(force-pushed from 4b7a149 to 484ad79)
Take local storage as an example; the implementation may look like this:
```
func (mon *monitor) CheckStatus(spec *v1.PersistentVolumeSpec) (string, error) {
    ...
```
This needs further discussion
It is not clear how apps can react to bad news and recover from failure. I'd be very hesitant to allow a StatefulSet to delete a PVC should a failure be detected:
### Monitoring controller:
Like the PV controller, the monitoring controller should check PVs' health condition periodically and taint them if they are unhealthy.
What happens when PVs are tainted? Currently we do not have the ability to taint PVs.
Yeah, in the first phase we do not need to change the PV struct; we will mark the PV by adding annotations instead.
We can separate the proposal into two parts:
* monitoring PVs and marking them if they have problems
What exactly will it monitor btw? Are you talking about bad blocks or fsck? Please add some examples.
It will monitor whether volumes still exist and whether their status is OK.
e.g.
For a local PV: if its path (directory) is deleted by mistake or the node breaks down, this causes data loss, and Kubernetes needs to know about it.
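For illustration, a minimal sketch of that existence check for a local PV, assuming the monitor runs on the node and has the `*v1.PersistentVolumeSpec` in hand (the helper name is hypothetical):
```go
package monitor

import (
	"fmt"
	"os"

	v1 "k8s.io/api/core/v1"
)

// checkLocalPathExists is a hypothetical helper: it reports whether the
// directory backing a local PV still exists on this node.
func checkLocalPathExists(spec *v1.PersistentVolumeSpec) (bool, error) {
	if spec.Local == nil {
		return false, fmt.Errorf("not a local volume")
	}
	if _, err := os.Stat(spec.Local.Path); err != nil {
		if os.IsNotExist(err) {
			// The path was deleted out of band: the data is gone and the
			// monitor should mark the PV as unhealthy.
			return false, nil
		}
		return false, err
	}
	return true, nil
}
```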
It would be good to explicitly write down some specific error conditions somewhere.
@redbaron agree. That is why we should clearly document here exactly what use cases we are trying to solve. This reaction part also needs to be opt-in by the application, so that only those that can recover from these scenarios can benefit. From a workload's point of view, we want to target distributed applications like Cassandra that replicate their data across multiple instances for redundancy. From a platform point of view, the infrastructure failures I'm thinking of are:
(force-pushed from a3b2291 to 908ad48)
@redbaron sorry for the delay. yeah, if we want to delete the PVC and pods directly and reschedule them to a new node, the application needs to have a data backup it can restore, or must be able to tolerate data loss.
(force-pushed from 269479f to 9393e1a)
cc @vkamra
return ...
}

// check volume health condition depending on device type
Can you outline here what kind of checks you want to do? What are some general checks, and what are some specific checks?
General checks: volume access check, volume existence check, ...
Specific checks: mount point check, local path check, ...
Will update.
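A sketch of how those two kinds of checks could look, assuming a local-storage monitor running on the node itself (Linux-only device comparison; helper names are illustrative, not taken from the demo PR):
```go
package monitor

import (
	"os"
	"path/filepath"
	"syscall"
)

// General check: the volume path still exists and is accessible.
func checkPathAccessible(path string) error {
	_, err := os.Stat(path)
	return err
}

// Specific check (Linux-only sketch): a directory is likely a mount point if
// it sits on a different device than its parent directory.
func isLikelyMountPoint(path string) (bool, error) {
	var st, parent syscall.Stat_t
	if err := syscall.Stat(path, &st); err != nil {
		return false, err
	}
	if err := syscall.Stat(filepath.Dir(path), &parent); err != nil {
		return false, err
	}
	return st.Dev != parent.Dev, nil
}
```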
In the first phase, we only consider the StatefulSet reaction.
StatefulSet reaction: check the annotation timestamp; if the PV can recover within the predefined time interval, we will do nothing,
How does this controller know which StatefulSets to monitor?
by monitoring the related PVs
How do we mark which PVs we should monitor? For example, not all StatefulSets may want this behavior. So we need a way to say which StatefulSets/PVs should be handled by the reaction controller.
BTW, @jingxu97 just showed me some cool work related to a Snapshot policy controller using the metacontroller framework. A CRD is defined that specifies:
- How often a snapshot should be taken
- Label selector to choose which PVCs should follow this policy.
I think we could do something similar for this controller, and have a CRD to define a policy with (a rough sketch follows this list):
- What conditions to react to
- Time thresholds before reacting
- What should the action be for each condition
- Label selector for which StatefulSets to monitor
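A rough sketch of what such a policy CRD could look like, written as Go API types; every type and field name below is hypothetical and only meant to make the idea concrete:
```go
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// PVReactionPolicy is a hypothetical CRD that tells the reaction controller
// how to respond when the monitor marks a PV as unhealthy.
type PVReactionPolicy struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec PVReactionPolicySpec `json:"spec"`
}

type PVReactionPolicySpec struct {
	// Conditions to react to, e.g. "LocalPathLost" or "NodeDown" (illustrative values).
	Conditions []string `json:"conditions"`
	// How long a condition must persist before the action is taken.
	Threshold metav1.Duration `json:"threshold"`
	// Action to take once the threshold is exceeded, e.g. "DeletePVC".
	Action string `json:"action"`
	// Label selector for the StatefulSets this policy applies to.
	Selector *metav1.LabelSelector `json:"selector,omitempty"`
}
```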
If the monitor finds that a PV is unhealthy, it will mark the PV by adding annotations, including a timestamp.
Can you define the annotation key/value?
yeah, sure, defined in demo PR: kubernetes-retired/external-storage#528
will update this proposal
Can you define the annotation format and syntax in this design doc, and also put some of the examples here too?
done
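The concrete annotation keys live in the demo PR linked above; purely as an illustration of the shape of the marking operation, a strategic-merge patch with placeholder keys might look like this (client-go pre-1.18 Patch signature):
```go
package monitor

import (
	"encoding/json"
	"time"

	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// Placeholder annotation keys; the real ones are defined in the demo PR.
const (
	annUnhealthy          = "example.kubernetes.io/pv-unhealthy"
	annUnhealthyTimestamp = "example.kubernetes.io/pv-unhealthy-timestamp"
)

// markPVUnhealthy annotates a PV with an unhealthy flag and the time the
// problem was first observed, so a reaction controller can apply its time
// threshold later.
func markPVUnhealthy(client kubernetes.Interface, pvName string) error {
	patch := map[string]interface{}{
		"metadata": map[string]interface{}{
			"annotations": map[string]string{
				annUnhealthy:          "true",
				annUnhealthyTimestamp: time.Now().UTC().Format(time.RFC3339),
			},
		},
	}
	data, err := json.Marshal(patch)
	if err != nil {
		return err
	}
	// Note: client-go 1.18+ additionally takes a context and PatchOptions.
	_, err = client.CoreV1().PersistentVolumes().Patch(pvName, types.StrategicMergePatchType, data)
	return err
}
```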
which is responsible for creating informers, watching Node and PV events, and calling each plugin's monitor functions.
And each volume plugin will create its own monitor to check its volumes' status.
#### For local storage:
I would be very interested in seeing an outline for monitoring node failures, which is a failure mode that local volumes are especially prone to compared to other volume types.
yeah, will do
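A minimal sketch of that node-failure piece, assuming client-go shared informers: the monitor watches Node deletions and hands the node name to a caller-supplied marking function (all names here are illustrative, not a committed design):
```go
package monitor

import (
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// runNodeWatch wires up a Node informer; when a node disappears, every local
// PV on that node should be marked unhealthy via the caller's markFunc.
func runNodeWatch(client kubernetes.Interface, markFunc func(nodeName string), stopCh <-chan struct{}) {
	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	nodeInformer := factory.Core().V1().Nodes().Informer()

	nodeInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		DeleteFunc: func(obj interface{}) {
			node, ok := obj.(*v1.Node)
			if !ok {
				// Could be a cache.DeletedFinalStateUnknown tombstone; a real
				// implementation would unwrap it instead of returning.
				return
			}
			// Node is gone: local PVs on this node are unreachable.
			markFunc(node.Name)
		},
	})

	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
}
```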
## User Experience
### Use Cases
* Users create many local PVs manually and want to monitor their health condition;
Since all of the use cases here are for local PVs, should we just have this document be specific to local PVs for now? We can consider extending this to other PV types in the future if the need arises.
yeah, we want to focus on local storage at the first stage, and then move to other storage drivers if needed.
* Fill PV cache from etcd;
* Watch for PV update events;
* Resync and populate periodically;
* Delete related PVC and pods if needed (just for StatefulSets, and reclaim the PV depending on its reclaim policy);
BTW, what application are you planning to use with this? It may be good to mention in the beginning use cases what kinds of applications this is suitable for.
if we want to delete the PV directly, the application should have a data backup it can restore, or be able to tolerate data loss.
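As a sketch of the timestamp check mentioned earlier for the StatefulSet reaction (the annotation key and threshold are placeholders, and the reaction itself was later deferred to a next stage):
```go
package monitor

import (
	"time"

	v1 "k8s.io/api/core/v1"
)

// shouldReact reports whether the reaction controller should act on a marked
// PV: only after the unhealthy condition has persisted longer than threshold.
// The annotation key is a placeholder for the one defined by the monitor.
func shouldReact(pv *v1.PersistentVolume, threshold time.Duration) bool {
	ts, ok := pv.Annotations["example.kubernetes.io/pv-unhealthy-timestamp"]
	if !ok {
		return false // PV was never marked, or has recovered and been unmarked.
	}
	marked, err := time.Parse(time.RFC3339, ts)
	if err != nil {
		return false
	}
	return time.Since(marked) > threshold
}
```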
(force-pushed from 9393e1a to d082a85)
@ddysher @msau42 updated the use cases here, and will modify the rest accordingly.
## User Experience
### Use Cases
* If the local PV path is deleted, users should know that and the local PV should be marked and deleted;
It seems like there are two different reaction controllers being proposed here?
- One for StatefulSets that are using these PVs
- One for local PVs
Could you try to clarify this in the doc and separate out which behaviors will be handled by which controllers? I think one of the confusing things here is that there are multiple levels/layers of reaction, and it's not clear how they will all interact with each other.
Or maybe this design should just focus on the local PV monitoring part. And we can leave StatefulSet handling to a different design.
yes, reaction needs more discussion, so I will move it to the next stage.
(force-pushed from d082a85 to 9c6eb40)
[APPROVALNOTIFIER] This PR is NOT APPROVED.
Needs approval from an approver in each of these files. The full list of commands accepted by this bot can be found here; the pull request process is described here.
(force-pushed from 9c6eb40 to 8e0a47a)
@NickrenREN @msau42 what's the status of the proposal :)
@ddysher I am prototyping the PV monitor here: https://github.com/caicloud/kube-storage-monitor
I probably won't have time to thoroughly review this until the Q3 timeframe. But it would help if, in this design proposal, you could summarize things into a table with the following columns:
Also it would be good to think about some of these questions:
I really like the idea of having CRDs where you can define your reaction policies, and then using something like the metacontroller to write a reaction controller. I would sync up with @jingxu97 on the work she did for snapshot policies.
I opened up kubernetes-retired/external-storage#817 to discuss ideas for handling the node deletion scenario.
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now, please do so. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle rotten
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Closing, in favor of kubernetes/enhancements#1077 |
Add a proposal for PV monitoring.
We may focus on local storage PV monitoring in the first phase.
/assign @msau42
/cc @ddysher @jingxu97
/sig storage