-
Notifications
You must be signed in to change notification settings - Fork 382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the race between PVC finalizer and snapshot finalizer removal #360
Fix the race between PVC finalizer and snapshot finalizer removal #360
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: xing-yang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold |
I've tested this, and it addresses the issue that I was mentioning in #354 . I am able to delete a VolumeSnapshot that has a source of a non-CSI PVC. |
Just tested this but was unable to delete the PVC nor VolumeSnapshot. This was the manifest that was used, which tries creates a VolumeSnapshot at the same time the PVC is created. |
@xing-yang are we targetting this for 2.2? |
Yes. |
d9e0e44
to
1eaa86f
Compare
@chrishenzie I updated the PR. Can you test again? |
@xing-yang Just tested using the same manifest linked above. All resources deleted successfully without any issues. Thank you! |
@chrishenzie Thank you for testing this! |
/hold cancel |
1eaa86f
to
521502e
Compare
@ggriffiths Addressed your comments. PTAL. Thanks. |
@@ -1303,6 +1285,16 @@ func (ctrl *csiSnapshotCommonController) removeSnapshotFinalizer(snapshot *crdv1 | |||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removeSourceFinalizer will always be true, let's just remove this parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You meant the caller always passes in removeSourceFinalizer as true here? removeSnapshotFinalizer is a utility function so I'd like to make it usable even if removeSourceFinalizer is false for the completeness.
// If we are here, it means we are going to remove finalizers from the snapshot so that the snapshot can be deleted. | ||
// We need to check if there is still PVC finalizer that needs to be removed before removing the snapshot finalizers. | ||
// Once snapshot is deleted, there won't be any snapshot update event that can trigger the PVC finalizer removal. | ||
if err := ctrl.checkandRemovePVCFinalizer(snapshot, true); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is the best place to put the fix.
checkandRemoveSnapshotFinalizersAndCheckandDeleteContent, which is the caller of this function, has quite some returning points, i.e., this:
if !utils.IsSnapshotDeletionCandidate(snapshot) { |
if any of those points is hit, it will still cause the PVC not deletable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little tricky. We have to make sure PVC finalizer is deleted before VolumeSnapshot is deleted. Once VolumeSnapshot is deleted, there's no sync routine that handles PVC finalizer deletion any more. We also can't remove PVC finalizer too early. PVC finalizer should stay on if a VolumeSnapshot is still using it.
I add the logic in the current place because this is the place where we are sure we want to delete VolumeSnapshot object.
@@ -170,17 +170,14 @@ func (ctrl *csiSnapshotCommonController) syncSnapshot(snapshot *crdv1.VolumeSnap | |||
klog.V(5).Infof("syncSnapshot [%s]: check if we should remove finalizer on snapshot PVC source and remove it if we can", utils.SnapshotKey(snapshot)) | |||
|
|||
// Check if we should remove finalizer on PVC and remove it if we can. | |||
if err := ctrl.checkandRemovePVCFinalizer(snapshot); err != nil { | |||
if err := ctrl.checkandRemovePVCFinalizer(snapshot, false); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, checkandRemovePVCFinalizer can take the responsibility to handle "snapshot.ObjectMeta.DeletionTimestamp != nil" case,
and the bool flag can be saved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is safer to have this bool flag passed in. Even when "snapshot.ObjectMeta.DeletionTimestamp != nil", it is possible that snapshot is also used as a source for another PVC. In that case, we still want to keep PVC finalizer on the snapshot's source if snapshot is not ready yet.
e077f67
to
505a0a4
Compare
/retest |
505a0a4
to
1ba5602
Compare
/lgtm |
Hi.
when list snapshot
When I try to delete everything is stuck
No any snapshot present in AWS
When I describe snapshot:
Log snapshot-controller:
Questins: How i can solve this issue ? please help |
What version of external-snapshotter are you using? |
I am using version:
aws-ebs-csi-driver
Additional test: In two different namespace I have two ready snapshots in each
When I delete volumesnapshot from near namespace result is successful :
But when I delete snapshot from geth namespace the command is stuck:
Both snapshot created with same way When I describe this snapshot I show followed events ErrorPVCFinalyzer:
Snapshotter LOG:
But if I execute follow command the snapshot delete successful:
|
Are you trying to modify VolumeSnapshot name? This field is immutable.
|
No I try to delete volumesnapshot.
LOG |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #349
Special notes for your reviewer:
Does this PR introduce a user-facing change?: