Skip to content

Commit

Permalink
vdev_open: clear async fault flag after reopen
Browse files Browse the repository at this point in the history
After c3f2f1a, vdev_fault_wanted is set on a vdev after a probe fails.
An end-of-txg async task is charged with actually faulting the vdev.

In a single-disk pool, the probe failure will degrade the last disk, and
then suspend the pool. However, vdev_fault_wanted is not cleared. After
the pool returns, the transaction finishes and the async task runs and
faults the vdev, which suspends the pool again.

The fix is simple: when reopening a vdev, clear the async fault flag. If
the vdev is still failed, the startup probe will quickly notice and
degrade/suspend it again. If not, all is well!

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Co-authored-by: Don Brady <don.brady@klarasystems.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Don Brady <don.brady@klarasystems.com>
  • Loading branch information
2 people authored and tonyhutter committed Jul 17, 2024
1 parent 393b7ad commit 5de3ac2
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions module/zfs/vdev.c
Original file line number Diff line number Diff line change
Expand Up @@ -2021,6 +2021,7 @@ vdev_open(vdev_t *vd)
vd->vdev_stat.vs_aux = VDEV_AUX_NONE;
vd->vdev_cant_read = B_FALSE;
vd->vdev_cant_write = B_FALSE;
vd->vdev_fault_wanted = B_FALSE;
vd->vdev_min_asize = vdev_get_min_asize(vd);

/*
Expand Down

0 comments on commit 5de3ac2

Please sign in to comment.