pool corrupted after hibernation #14118

a14a · 2022-10-31T19:13:43Z

System information

Type	Version/Name
Distribution Name	Gentoo
Distribution Version	-
Kernel Version	5.18.0
Architecture	x86_64
OpenZFS Version	zfs-2.1.99-694_g17b2ae0b2

Describe the problem you're observing

Zpool corrupted after resume from swap, hardware verified as healthy. Single ZFS disk, separate swap device, no zvols or luks.

This is a laptop with ZFS as root filesystem. The hardware is a single 1 TB NVMe drive with three GPT partitions for EFI, boot pool and root pool. No luks or dm-crypt involved. No separate log devices.

I use hibernation to a separate physical SSD block device, formatted as raw swap, no zvol. Resume is done by Gentoo's standard genkernel initrd built via genkernel initramfs --zfs, i don't expect initrd to be causing it.

The hardware is 1 year old, filesystem was 2 months old, with hibernation in daily use. The corruption occurred after several quick hibernate/resume cycles, i hibernated and resumed the machine maybe five or six times in a row.

# zpool import -R /mnt2 amber_rpool
cannot import 'amber_rpool': I/O error
        Destroy and re-create the pool from
        a backup source.
  
# zpool import -R /mnt2 -o readonly=on amber_rpool
cannot import 'amber_rpool': I/O error
        Destroy and re-create the pool from
        a backup source.

I attempted rewinding to an earlier transaction without success. Some zdb info below.

pool debug info

# zpool import -FfXm -T 3938230 -d /dev/nvme1n1p3
   pool: amber_rpool
     id: 13295185738152581673
  state: FAULTED
status: The pool metadata is corrupted.
 action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
 config:

        amber_rpool  FAULTED  corrupted data
          nvme1n1p3  ONLINE

# zdb -ei amber_rpool
Dataset mos [META], ID 0, cr_txg 4, 179M, 343 objects
Dataset amber_rpool/crypt/log@crash-recovery-2022-03-02 [ZPL], ID 562, cr_txg 18246, 434M, 228 objects
Dataset amber_rpool/crypt/log@manual-2022-04-08 [ZPL], ID 56483, cr_txg 1430821, 435M, 226 objects
Dataset amber_rpool/crypt/log [ZPL], ID 680, cr_txg 132, 435M, 225 objects

    ZIL header: claim_txg 3938577, claim_blk_seq 2, claim_lr_seq 0 replay_seq 0, flags 0x2
Dataset amber_rpool/crypt/home/n/t@crash-recovery-2022-03-02 [ZPL], ID 814, cr_txg 18246, 216K, 7 objects
Dataset amber_rpool/crypt/home/n/t@manual-2022-04-08 [ZPL], ID 56489, cr_txg 1430821, 929M, 415 objects
Dataset amber_rpool/crypt/home/n/t [ZPL], ID 77, cr_txg 114, 1.47G, 781 objects

    ZIL header: claim_txg 3938577, claim_blk_seq 1, claim_lr_seq 0 replay_seq 0, flags 0x2
Dataset amber_rpool/crypt/home/n@crash-recovery-2022-03-02 [ZPL], ID 812, cr_txg 18246, 62.8G, 438804 objects
Dataset amber_rpool/crypt/home/n@znap_frequent-2022-02-28-2000 [ZPL], ID 29, cr_txg 298, 200K, 7 objects
Dataset amber_rpool/crypt/home/n@znap_hourly-2022-02-28-2000 [ZPL], ID 273, cr_txg 299, 200K, 7 objects
Dataset amber_rpool/crypt/home/n@manual-2022-04-08 [ZPL], ID 56487, cr_txg 1430821, 61.9G, 532301 objects
Dataset amber_rpool/crypt/home/n [ZPL], ID 138, cr_txg 113, 88.1G, 538753 objects

    ZIL header: claim_txg 3938577, claim_blk_seq 825, claim_lr_seq 0 replay_seq 0, flags 0x2
Dataset amber_rpool/crypt/home@manual-2022-04-08 [ZPL], ID 56485, cr_txg 1430821, 13.9G, 166866 objects
Dataset amber_rpool/crypt/home@znap_frequent-2022-02-28-2000 [ZPL], ID 396, cr_txg 296, 200K, 7 objects
Dataset amber_rpool/crypt/home@znap_hourly-2022-02-28-2000 [ZPL], ID 25, cr_txg 297, 200K, 7 objects
Dataset amber_rpool/crypt/home@crash-recovery-2022-03-02 [ZPL], ID 810, cr_txg 18246, 13.9G, 166866 objects
Dataset amber_rpool/crypt/home [ZPL], ID 671, cr_txg 110, 13.9G, 166866 objects
Dataset amber_rpool/crypt/ROOT/gentoo/distfiles@manual-2022-04-08 [ZPL], ID 56495, cr_txg 1430821, 1.66G, 321 objects
Dataset amber_rpool/crypt/ROOT/gentoo/distfiles@crash-recovery-2022-03-02 [ZPL], ID 824, cr_txg 18246, 1.32G, 308 objects
failed to hold dataset 'amber_rpool/crypt/ROOT/gentoo/distfiles': Input/output error
Dataset amber_rpool/crypt/ROOT/gentoo/packages@manual-2022-04-08 [ZPL], ID 56497, cr_txg 1430821, 27.2G, 4486 objects
Dataset amber_rpool/crypt/ROOT/gentoo/packages@crash-recovery-2022-03-02 [ZPL], ID 826, cr_txg 18246, 24.1G, 4215 objects
Dataset amber_rpool/crypt/ROOT/gentoo/packages [ZPL], ID 663, cr_txg 102, 30.9G, 5039 objects
Dataset amber_rpool/crypt/ROOT/gentoo@znap_frequent-2022-02-28-2000 [ZPL], ID 153, cr_txg 294, 352K, 14 objects
Dataset amber_rpool/crypt/ROOT/gentoo@znap_hourly-2022-02-28-2000 [ZPL], ID 512, cr_txg 295, 352K, 14 objects
Dataset amber_rpool/crypt/ROOT/gentoo@crash-recovery-2022-03-02 [ZPL], ID 820, cr_txg 18246, 35.3G, 2193754 objects
Dataset amber_rpool/crypt/ROOT/gentoo@manual-2022-04-08 [ZPL], ID 56493, cr_txg 1430821, 36.5G, 2298111 objects
Dataset amber_rpool/crypt/ROOT/gentoo [ZPL], ID 655, cr_txg 85, 39.5G, 2545567 objects

    ZIL header: claim_txg 3938577, claim_blk_seq 292, claim_lr_seq 0 replay_seq 0, flags 0x2
Dataset amber_rpool/crypt/ROOT@znap_frequent-2022-02-28-2000 [ZPL], ID 21, cr_txg 292, 192K, 6 objects
Dataset amber_rpool/crypt/ROOT@manual-2022-04-08 [ZPL], ID 56491, cr_txg 1430821, 192K, 6 objects
Dataset amber_rpool/crypt/ROOT@crash-recovery-2022-03-02 [ZPL], ID 816, cr_txg 18246, 192K, 6 objects
Dataset amber_rpool/crypt/ROOT@znap_hourly-2022-02-28-2000 [ZPL], ID 149, cr_txg 293, 192K, 6 objects
Dataset amber_rpool/crypt/ROOT [ZPL], ID 772, cr_txg 83, 192K, 6 objects
Dataset amber_rpool/crypt@znap_frequent-2022-02-28-2000 [ZPL], ID 134, cr_txg 290, 192K, 6 objects
Dataset amber_rpool/crypt@crash-recovery-2022-03-02 [ZPL], ID 560, cr_txg 18246, 192K, 6 objects
Dataset amber_rpool/crypt@manual-2022-04-08 [ZPL], ID 56481, cr_txg 1430821, 192K, 6 objects
Dataset amber_rpool/crypt@znap_hourly-2022-02-28-2000 [ZPL], ID 145, cr_txg 291, 192K, 6 objects
Dataset amber_rpool/crypt [ZPL], ID 387, cr_txg 45, 192K, 6 objects
Dataset amber_rpool@znap_hourly-2022-02-28-2000 [ZPL], ID 392, cr_txg 289, 96K, 6 objects
Dataset amber_rpool@znap_frequent-2022-02-28-2015 [ZPL], ID 278, cr_txg 360, 96K, 6 objects
Dataset amber_rpool@manual-2022-04-08 [ZPL], ID 56479, cr_txg 1430821, 96K, 6 objects
Dataset amber_rpool@znap_frequent-2022-02-28-2000 [ZPL], ID 269, cr_txg 288, 96K, 6 objects
Dataset amber_rpool@crash-recovery-2022-03-02 [ZPL], ID 556, cr_txg 18246, 96K, 6 objects
Dataset amber_rpool [ZPL], ID 54, cr_txg 1, 96K, 6 objects
MOS object 2 (bpobj) leaked
MOS object 67 (zap) leaked
MOS object 68 (DSL props) leaked
MOS object 69 (DSL directory child map) leaked
MOS object 70 (zap) leaked
MOS object 71 (DSL dataset snap map) leaked
MOS object 73 (zap) leaked
MOS object 20691 (bpobj) leaked
MOS object 56496 (DSL deadlist map) leaked
Verified large_blocks feature refcount of 0 is correct
large_dnode feature refcount mismatch: 36 consumers != 37 refcount
Verified sha512 feature refcount of 0 is correct
Verified skein feature refcount of 0 is correct
Verified edonr feature refcount of 0 is correct
userobj_accounting feature refcount mismatch: 36 consumers != 37 refcount
encryption feature refcount mismatch: 36 consumers != 37 refcount
project_quota feature refcount mismatch: 36 consumers != 37 refcount
Verified redaction_bookmarks feature refcount of 0 is correct
Verified redacted_datasets feature refcount of 0 is correct
Verified bookmark_written feature refcount of 0 is correct
Verified livelist feature refcount of 0 is correct
Verified zstd_compress feature refcount of 0 is correct
root@nkarch ~ # zdb -ul /dev/nvme1n1p3
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'amber_rpool'
    state: 0
    txg: 3937721
    pool_guid: 13295185738152581673
    errata: 0
    hostname: '(none)'
    top_guid: 6749422823089631006
    guid: 6749422823089631006
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 6749422823089631006
        path: '/dev/nvme0n1p3'
        whole_disk: 0
        metaslab_array: 256
        metaslab_shift: 33
        ashift: 12
        asize: 1021519331328
        is_log: 0
        create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1 2 3
    Uberblock[0]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938528
        guid_sum = 1597864487532661063
        timestamp = 1655822150 UTC = Tue Jun 21 16:35:50 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[1]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938529
        guid_sum = 1597864487532661063
        timestamp = 1655822156 UTC = Tue Jun 21 16:35:56 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[2]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938466
        guid_sum = 1597864487532661063
        timestamp = 1655822076 UTC = Tue Jun 21 16:34:36 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[3]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938467
        guid_sum = 1597864487532661063
        timestamp = 1655822076 UTC = Tue Jun 21 16:34:36 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[4]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938468
        guid_sum = 1597864487532661063
        timestamp = 1655822076 UTC = Tue Jun 21 16:34:36 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[5]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938501
        guid_sum = 1597864487532661063
        timestamp = 1655822107 UTC = Tue Jun 21 16:35:07 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[6]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938566
        guid_sum = 1597864487532661063
        timestamp = 1655822194 UTC = Tue Jun 21 16:36:34 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[7]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938567
        guid_sum = 1597864487532661063
        timestamp = 1655822199 UTC = Tue Jun 21 16:36:39 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[8]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938568
        guid_sum = 1597864487532661063
        timestamp = 1655822204 UTC = Tue Jun 21 16:36:44 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[9]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938569
        guid_sum = 1597864487532661063
        timestamp = 1655822209 UTC = Tue Jun 21 16:36:49 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[10]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938570
        guid_sum = 1597864487532661063
        timestamp = 1655822215 UTC = Tue Jun 21 16:36:55 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[11]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938475
        guid_sum = 1597864487532661063
        timestamp = 1655822084 UTC = Tue Jun 21 16:34:44 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[12]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938540
        guid_sum = 1597864487532661063
        timestamp = 1655822171 UTC = Tue Jun 21 16:36:11 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[13]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938541
        guid_sum = 1597864487532661063
        timestamp = 1655822176 UTC = Tue Jun 21 16:36:16 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[14]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938542
        guid_sum = 1597864487532661063
        timestamp = 1655822181 UTC = Tue Jun 21 16:36:21 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[15]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938575
        guid_sum = 1597864487532661063
        timestamp = 1655822215 UTC = Tue Jun 21 16:36:55 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[16]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938576
        guid_sum = 1597864487532661063
        timestamp = 1655822215 UTC = Tue Jun 21 16:36:55 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[17]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938577
        guid_sum = 1597864487532661063
        timestamp = 1655857063 UTC = Wed Jun 22 02:17:43 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[18]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938578
        guid_sum = 1597864487532661063
        timestamp = 1655861150 UTC = Wed Jun 22 03:25:50 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[19]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938579
        guid_sum = 1597864487532661063
        timestamp = 1655879645 UTC = Wed Jun 22 08:34:05 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[20]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938580
        guid_sum = 1597864487532661063
        timestamp = 1656359461 UTC = Mon Jun 27 21:51:01 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[21]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938581
        guid_sum = 1597864487532661063
        timestamp = 1656364059 UTC = Mon Jun 27 23:07:39 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[22]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938582
        guid_sum = 1597864487532661063
        timestamp = 1656931598 UTC = Mon Jul  4 12:46:38 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[23]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938327
        guid_sum = 1597864487532661063
        timestamp = 1655821889 UTC = Tue Jun 21 16:31:29 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[24]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938552
        guid_sum = 1597864487532661063
        timestamp = 1655822194 UTC = Tue Jun 21 16:36:34 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[25]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938553
        guid_sum = 1597864487532661063
        timestamp = 1655822194 UTC = Tue Jun 21 16:36:34 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[26]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938394
        guid_sum = 1597864487532661063
        timestamp = 1655821988 UTC = Tue Jun 21 16:33:08 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[27]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938171
        guid_sum = 1597864487532661063
        timestamp = 1655821725 UTC = Tue Jun 21 16:28:45 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[28]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938460
        guid_sum = 1597864487532661063
        timestamp = 1655822071 UTC = Tue Jun 21 16:34:31 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[29]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938461
        guid_sum = 1597864487532661063
        timestamp = 1655822076 UTC = Tue Jun 21 16:34:36 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[30]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938462
        guid_sum = 1597864487532661063
        timestamp = 1655822076 UTC = Tue Jun 21 16:34:36 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3
    Uberblock[31]
        magic = 0000000000bab10c
        version = 5000
        txg = 3938463
        guid_sum = 1597864487532661063
        timestamp = 1655822076 UTC = Tue Jun 21 16:34:36 2022
        mmp_magic = 00000000a11cea11
        mmp_delay = 0
        mmp_valid = 0
        checkpoint_txg = 0
        labels = 0 1 2 3

Eventually i was able to override the errors by disabling spa_load_verify_data and spa_load_verify_metadata. After that, import would simply freeze indefinitely. However, read-only import works when used with those overrides.

trace

# echo 0 > /sys/module/zfs/parameters/spa_load_verify_metadata; echo 0 > /sys/module/zfs/parameters/spa_load_verify_data
# zpool import -R /mnt2 amber_rpool
(this will freeze)

# dmesg
...
[ 8154.710601] WARNING: can't open objset 70, error 5
[ 8154.735379] WARNING: can't open objset 70, error 5
[ 8161.259640] WARNING: can't open objset 70, error 5
[ 8161.284331] WARNING: can't open objset 70, error 5
[ 8167.034045] WARNING: can't open objset 70, error 5
[ 8167.035696] WARNING: can't open objset for 70, error 5
[ 8167.113492] WARNING: Pool 'amber_rpool' has encountered an uncorrectable I/O failure and has been suspended.

[ 8355.560061] INFO: task zpool:9424 blocked for more than 122 seconds.
[ 8355.560072]       Tainted: P           OE     5.18.8-arch1-1 #1
[ 8355.560076] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8355.560079] task:zpool           state:D stack:    0 pid: 9424 ppid:  1709 flags:0x00004006
[ 8355.560090] Call Trace:
[ 8355.560093]  <TASK>
[ 8355.560101]  __schedule+0x37c/0x11f0
[ 8355.560119]  ? autoremove_wake_function+0x2f/0x60
[ 8355.560134]  schedule+0x4f/0xb0
[ 8355.560144]  io_schedule+0x46/0x70
[ 8355.560156]  cv_wait_common+0xaa/0x130 [spl 2cf84daf96686f1282819fc77b136156523175de]
[ 8355.560180]  ? cpuacct_percpu_seq_show+0x20/0x20
[ 8355.560193]  txg_wait_synced_impl+0xc8/0x110 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.560536]  txg_wait_synced+0x10/0x40 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.560846]  spa_config_update+0x46/0x170 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.561145]  spa_import+0x60e/0x870 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.561443]  zfs_ioc_pool_import+0x158/0x180 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.561762]  zfsdev_ioctl_common+0x8d3/0xa00 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.562090]  zfsdev_ioctl+0x53/0xd0 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.562399]  __x64_sys_ioctl+0x8e/0xc0
[ 8355.562410]  do_syscall_64+0x5c/0x90
[ 8355.562420]  ? do_user_addr_fault+0x1db/0x680
[ 8355.562433]  ? exc_page_fault+0x74/0x170
[ 8355.562439]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 8355.562449] RIP: 0033:0x7f7d877077af
[ 8355.562456] RSP: 002b:00007fff312a4470 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 8355.562464] RAX: ffffffffffffffda RBX: 0000559051f93630 RCX: 00007f7d877077af
[ 8355.562468] RDX: 00007fff312a4540 RSI: 0000000000005a02 RDI: 0000000000000003
[ 8355.562473] RBP: 00007fff312a8430 R08: 000000000000b0b0 R09: 00007f7d877fe3d0
[ 8355.562476] R10: 00007f7d877fe3d0 R11: 0000000000000246 R12: 00007fff312a4540
[ 8355.562480] R13: 00007f7d700014f8 R14: 0000000000000000 R15: 0000559051fc4d90
[ 8355.562488]  </TASK>
[ 8355.562504] INFO: task txg_sync:9554 blocked for more than 122 seconds.
[ 8355.562509]       Tainted: P           OE     5.18.8-arch1-1 #1
[ 8355.562512] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8355.562514] task:txg_sync        state:D stack:    0 pid: 9554 ppid:     2 flags:0x00004000
[ 8355.562523] Call Trace:
[ 8355.562526]  <TASK>
[ 8355.562529]  __schedule+0x37c/0x11f0
[ 8355.562543]  schedule+0x4f/0xb0
[ 8355.562552]  schedule_timeout+0x98/0x150
[ 8355.562559]  ? __bpf_trace_tick_stop+0x10/0x10
[ 8355.562566]  io_schedule_timeout+0x50/0x80
[ 8355.562576]  __cv_timedwait_common+0x12c/0x170 [spl 2cf84daf96686f1282819fc77b136156523175de]
[ 8355.562598]  ? cpuacct_percpu_seq_show+0x20/0x20
[ 8355.562610]  __cv_timedwait_io+0x19/0x20 [spl 2cf84daf96686f1282819fc77b136156523175de]
[ 8355.562629]  zio_wait+0x13a/0x2b0 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.562953]  dmu_buf_will_dirty_impl+0xa6/0x150 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.563189]  dmu_write_impl+0x3f/0xd0 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.563404]  dmu_write+0xcc/0x190 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.563617]  spa_history_write+0x18f/0x1d0 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.563916]  spa_history_log_sync+0x19d/0x7c0 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.564213]  dsl_sync_task_sync+0xa6/0xf0 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.564481]  dsl_pool_sync+0x3f6/0x510 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.564607]  spa_sync+0x553/0xf70 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.564697]  ? spa_txg_history_init_io+0x114/0x120 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.564791]  txg_sync_thread+0x279/0x480 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.564885]  ? txg_fini+0x270/0x270 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8355.564978]  ? __thread_exit+0x20/0x20 [spl 2cf84daf96686f1282819fc77b136156523175de]
[ 8355.564987]  thread_generic_wrapper+0x5b/0x70 [spl 2cf84daf96686f1282819fc77b136156523175de]
[ 8355.564995]  kthread+0xdb/0x110
[ 8355.564999]  ? kthread_complete_and_exit+0x20/0x20
[ 8355.565003]  ret_from_fork+0x1f/0x30
[ 8355.565007]  </TASK>

[ 8478.439751] INFO: task zpool:9424 blocked for more than 245 seconds.
[ 8478.439763]       Tainted: P           OE     5.18.8-arch1-1 #1
[ 8478.439766] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8478.439769] task:zpool           state:D stack:    0 pid: 9424 ppid:  1709 flags:0x00004006
[ 8478.439780] Call Trace:
[ 8478.439783]  <TASK>
[ 8478.439790]  __schedule+0x37c/0x11f0
[ 8478.439809]  ? autoremove_wake_function+0x2f/0x60
[ 8478.439823]  schedule+0x4f/0xb0
[ 8478.439832]  io_schedule+0x46/0x70
[ 8478.439844]  cv_wait_common+0xaa/0x130 [spl 2cf84daf96686f1282819fc77b136156523175de]
[ 8478.439867]  ? cpuacct_percpu_seq_show+0x20/0x20
[ 8478.439879]  txg_wait_synced_impl+0xc8/0x110 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8478.440191]  txg_wait_synced+0x10/0x40 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8478.440476]  spa_config_update+0x46/0x170 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8478.440752]  spa_import+0x60e/0x870 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8478.441028]  zfs_ioc_pool_import+0x158/0x180 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8478.441324]  zfsdev_ioctl_common+0x8d3/0xa00 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8478.441634]  zfsdev_ioctl+0x53/0xd0 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8478.441919]  __x64_sys_ioctl+0x8e/0xc0
[ 8478.441930]  do_syscall_64+0x5c/0x90
[ 8478.441941]  ? do_user_addr_fault+0x1db/0x680
[ 8478.441953]  ? exc_page_fault+0x74/0x170
[ 8478.441960]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 8478.441969] RIP: 0033:0x7f7d877077af
[ 8478.441977] RSP: 002b:00007fff312a4470 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 8478.441985] RAX: ffffffffffffffda RBX: 0000559051f93630 RCX: 00007f7d877077af
[ 8478.441989] RDX: 00007fff312a4540 RSI: 0000000000005a02 RDI: 0000000000000003
[ 8478.441993] RBP: 00007fff312a8430 R08: 000000000000b0b0 R09: 00007f7d877fe3d0
[ 8478.441997] R10: 00007f7d877fe3d0 R11: 0000000000000246 R12: 00007fff312a4540
[ 8478.442001] R13: 00007f7d700014f8 R14: 0000000000000000 R15: 0000559051fc4d90
[ 8478.442009]  </TASK>
[ 8601.319173] INFO: task zpool:9424 blocked for more than 368 seconds.
[ 8601.319185]       Tainted: P           OE     5.18.8-arch1-1 #1
[ 8601.319190] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8601.319192] task:zpool           state:D stack:    0 pid: 9424 ppid:  1709 flags:0x00004006
[ 8601.319203] Call Trace:
[ 8601.319206]  <TASK>
[ 8601.319213]  __schedule+0x37c/0x11f0
[ 8601.319230]  ? autoremove_wake_function+0x2f/0x60
[ 8601.319244]  schedule+0x4f/0xb0
[ 8601.319253]  io_schedule+0x46/0x70
[ 8601.319265]  cv_wait_common+0xaa/0x130 [spl 2cf84daf96686f1282819fc77b136156523175de]
[ 8601.319288]  ? cpuacct_percpu_seq_show+0x20/0x20
[ 8601.319301]  txg_wait_synced_impl+0xc8/0x110 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8601.319612]  txg_wait_synced+0x10/0x40 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8601.319901]  spa_config_update+0x46/0x170 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8601.320176]  spa_import+0x60e/0x870 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8601.320448]  zfs_ioc_pool_import+0x158/0x180 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8601.320743]  zfsdev_ioctl_common+0x8d3/0xa00 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8601.321046]  zfsdev_ioctl+0x53/0xd0 [zfs 8aa59c09c28351724b1ba2221e0acd9070264074]
[ 8601.321331]  __x64_sys_ioctl+0x8e/0xc0
[ 8601.321342]  do_syscall_64+0x5c/0x90
[ 8601.321352]  ? do_user_addr_fault+0x1db/0x680
[ 8601.321365]  ? exc_page_fault+0x74/0x170
[ 8601.321371]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 8601.321380] RIP: 0033:0x7f7d877077af
[ 8601.321387] RSP: 002b:00007fff312a4470 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 8601.321395] RAX: ffffffffffffffda RBX: 0000559051f93630 RCX: 00007f7d877077af
[ 8601.321399] RDX: 00007fff312a4540 RSI: 0000000000005a02 RDI: 0000000000000003
[ 8601.321404] RBP: 00007fff312a8430 R08: 000000000000b0b0 R09: 00007f7d877fe3d0
[ 8601.321408] R10: 00007f7d877fe3d0 R11: 0000000000000246 R12: 00007fff312a4540
[ 8601.321411] R13: 00007f7d700014f8 R14: 0000000000000000 R15: 0000559051fc4d90
[ 8601.321419]  </TASK>

As long as i import read-only, the pool seems to be okay, the failures mentioned in zdb dump only seem to have broken one disposable dataset.

# echo 0 > /sys/module/zfs/parameters/spa_load_verify_metadata; echo 0 > /sys/module/zfs/parameters/spa_load_verify_data
# zpool import -R /mnt2 -o readonly=on amber_rpool
# zfs list
cannot iterate filesystems: I/O error
NAME                                USED  AVAIL     REFER  MOUNTPOINT
amber_rpool                         217G   697G       96K  /mnt2
amber_rpool/crypt                   217G   697G      192K  none
amber_rpool/crypt/ROOT             91.8G   697G      192K  none
amber_rpool/crypt/ROOT/gentoo      91.8G   697G     39.5G  /mnt2
amber_rpool/crypt/home              125G   697G     13.9G  /mnt2/home
amber_rpool/crypt/home/n            111G   697G     88.1G  /mnt2/home/n
amber_rpool/crypt/home/n/t         1.47G   697G     1.47G  /mnt2/home/n/t
amber_rpool/crypt/log               437M   697G      435M  /mnt2/var/log

# zpool status -v
  pool: amber_rpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
config:

        NAME         STATE     READ WRITE CKSUM
        amber_rpool  ONLINE       0     0     0
          nvme1n1p3  ONLINE       0     0    24

errors: Permanent errors have been detected in the following files:

        amber_rpool/crypt/ROOT/gentoo/distfiles:<0x0>

With read-only access i was able to salvage all my data and migrate the system to a different device. However the original device containing the crashed pool seems to be in perfect health. I have successfully copied the entire device with no errors, also i did not observe any block device error messages.

block device test

# dd if=/dev/disk/by-id/nvme-KXG6AZNV1T02_TOSHIBA_31DF31IHFDS3 bs=128M status=progress | gzip | ssh n@d dd bs=128M of=downloads/toshiba-dd.gz
1024209543168 bytes (1.0 TB, 954 GiB) copied, 12904.8 s, 79.4 MB/s
0+7765443 records in
0+7765443 records out
255021655697 bytes (255 GB, 238 GiB) copied, 12904.7 s, 19.8 MB/s
# md5sum /dev/disk/by-id/nvme-KXG6AZNV1T02_TOSHIBA_31DF31IHFDS3
255183a6436e6607ebe6707e41a213ee  /dev/disk/by-id/nvme-KXG6AZNV1T02_TOSHIBA_31DF31IHFDS3
# ssh n@d "zcat downloads/toshiba-dd.gz | md5sum"
255183a6436e6607ebe6707e41a213ee  -

I still have the disk and the crashed pool on it available for tests. I already salvaged all data but this seems like a perfect test bed to hunt a possible bug? To the best of my knowledge the initrd image doesn't attempt mounting the pool like in NixOS/nixpkgs#106093 and i don't think i have any scripts that would modify the pool outside the intended resume sequence. What would be the steps to identify the specific cause of this failure?

The text was updated successfully, but these errors were encountered:

rincebrain · 2022-11-03T02:31:27Z

Speaking only for myself, my advice was, is, and remains don't hibernate with ZFS. There's, AFAIK, nothing in the test suite that tests that it works, and it's very easy for the resume process to do the wrong thing and eat your data.

Other people disagree with me, but I've had too many people reporting that their data got eaten after hibernation, and as nobody seems to be actively working on making it less of a footgun, I just advise people not to play with fire for now.

cf. #12842, this comment.

a14a · 2022-11-04T13:47:08Z

I definitely agree with the recommendation, but despite the brittleness this is already "sort of" supported by distributions, I guess the best we can do is document the possible scenarios. As for me personally, S3 state overheats the laptop's RAM in my backpack and i can afford risking data loss so i'll keep trying.

Resume is done by Gentoo's standard genkernel initrd built via genkernel initramfs --zfs, i don't expect initrd to be causing it.

This turned out false, genkernel's initrd does import my root pool before resume in https://gitweb.gentoo.org/proj/genkernel.git/tree/defaults/linuxrc#n657 , i.e. it runs start_volumes before do_resume, exactly as in https://bugs.gentoo.org/827281 . I reversed that order in my initrd and it seems to be working so far.

If i were to run two drives in a mirror configuration for the root pool, would that make things even worse for hibernation, would the extra device make the pool more likely to fail upon resume, or not? (obviously raid1 is preferential in many cases but what about the specific resume case?)

bghira · 2022-11-04T19:08:57Z

if it is supported by your distribution, you should open a downstream ticket with them to have it resolved. because the OpenZFS project does not yet support that use case.

a14a added the Type: Defect Incorrect behavior (e.g. crash, hang) label Oct 31, 2022

nh2 mentioned this issue Nov 28, 2022

make Linux hibernation (suspend-to-disk) more robust #12842

Open

edanaher mentioned this issue Jan 30, 2023

Allow management of faulted zpools #14444

Open

GregorKopka mentioned this issue Mar 8, 2023

ability to supend (and resume) all zpool IO (zpool suspend|resume) #12843

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pool corrupted after hibernation #14118

pool corrupted after hibernation #14118

a14a commented Oct 31, 2022

rincebrain commented Nov 3, 2022

a14a commented Nov 4, 2022 •

edited

Loading

bghira commented Nov 4, 2022

pool corrupted after hibernation #14118

pool corrupted after hibernation #14118

Comments

a14a commented Oct 31, 2022

System information

Describe the problem you're observing

rincebrain commented Nov 3, 2022

a14a commented Nov 4, 2022 • edited Loading

bghira commented Nov 4, 2022

a14a commented Nov 4, 2022 •

edited

Loading