
kernel NULL pointer dereference, address: 0000000000000058 #15171

Open
edward-shen opened this issue Aug 13, 2023 · 15 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments


edward-shen commented Aug 13, 2023

System information

Type Version/Name
Distribution Name Arch Linux
Distribution Version Rolling release
Kernel Version 6.2.12-zen1-1-zen
Architecture x86_64
OpenZFS Version 2.1.11

Describe the problem you're observing

I've recently noticed that the zfs kernel module occasionally triggers a kernel oops. This leaves the system in a semi-broken state where the filesystem still appears functional (e.g. touch foo on the dataset succeeds), but zpool status hangs, zfs list hangs, etc. Based on the call trace, anything that touches snapshots is affected.

This also stops me from updating the kernel or shutting down the system.
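A quick way to tell which admin commands are wedged without tying up a terminal for each one (a rough sketch; the 10-second cutoff and the command list are arbitrary choices, not anything ZFS-specific):

```shell
#!/bin/sh
# Probe each admin command with a deadline instead of letting it block
# forever; timeout(1) exits 124 when the command had to be killed.
for cmd in "zpool status" "zfs list"; do
    timeout 10 $cmd >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "$cmd: timed out (likely blocked behind the dead txg_sync)"
    else
        echo "$cmd: exited with status $rc"
    fi
done
```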

Describe how to reproduce the problem

I'm not sure exactly how to reproduce it, but I know of only two operations going on:

  1. Snapshots being created/sent via zrepl.
  2. A scrub running.

Unfortunately, it is quite reproducible: I run into it less than 5 minutes after every boot.

Include any warning/errors/backtraces from the system logs

[  242.696596] BUG: kernel NULL pointer dereference, address: 0000000000000058
[  242.696601] #PF: supervisor read access in kernel mode
[  242.696602] #PF: error_code(0x0000) - not-present page
[  242.696603] PGD 0 P4D 0
[  242.696605] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  242.696607] CPU: 6 PID: 11527 Comm: txg_sync Tainted: P           OE      6.2.12-zen1-1-zen #1 1faa92b78502bd84fae46f72b6f278b9e8b79472
[  242.696609] Hardware name: System manufacturer System Product Name/ROG STRIX B450-I GAMING, BIOS 5003 02/03/2023
[  242.696610] RIP: 0010:dsl_deadlist_remove_key+0xce/0x1b0 [zfs]
[  242.696713] Code: ff ff 31 d2 4c 89 e6 4c 89 ef 48 89 5c 24 30 e8 88 05 d5 ff 31 d2 4c 89 ef 48 89 c6 49 89 c4 e8 68 04 d5 ff 49 8b 3f 48 89 c6 <49> 8b 44 24 58 48 8b 56 58 48 89 74 24 10 48 89 04 24 48 89 54 24
[  242.696714] RSP: 0018:ffffbb739d447848 EFLAGS: 00010246
[  242.696716] RAX: 0000000000000000 RBX: 000000000179684d RCX: 0000000000000001
[  242.696717] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b21ce78d800
[  242.696718] RBP: ffff9b22e005d440 R08: 0000000000000000 R09: 0000000000000000
[  242.696719] R10: 2000000000000000 R11: 0000008000000000 R12: 0000000000000000
[  242.696720] R13: ffff9b21f2dc91a8 R14: ffff9b21f2dc9210 R15: ffff9b21f2dc9198
[  242.696721] FS:  0000000000000000(0000) GS:ffff9b28de980000(0000) knlGS:0000000000000000
[  242.696722] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  242.696723] CR2: 0000000000000058 CR3: 000000020c8fe000 CR4: 0000000000350ee0
[  242.696725] Call Trace:
[  242.696728]  <TASK>
[  242.696730]  dsl_destroy_snapshot_sync_impl+0xb0c/0xdb0 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.696815]  dsl_destroy_snapshot_sync+0x69/0xb0 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.696897]  zcp_synctask_destroy+0xb1/0x1d0 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.696992]  zcp_synctask_wrapper+0xa3/0x170 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.697081]  ? __pfx_zcp_synctask_wrapper+0x10/0x10 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.697168]  luaD_precall+0x233/0x550 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697179]  luaV_execute+0x1058/0x2180 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697189]  ? luaM_realloc_+0x3a/0xa0 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697201]  ? luaE_extendCI+0x1b/0x40 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697209]  ? luaD_precall+0x494/0x550 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697217]  luaD_call+0x12b/0x140 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697225]  ? luaH_newkey+0x31d/0x370 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697234]  luaD_rawrunprotected+0x83/0xc0 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697242]  ? __pfx_f_call+0x10/0x10 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697250]  ? luaD_rawrunprotected+0x68/0xc0 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697258]  luaD_pcall+0x38/0x160 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697266]  lua_pcallk+0xa8/0x1d0 [zlua 6411112d9667d42cb66a075480012db78fa67471]
[  242.697275]  zcp_eval_impl+0xc5/0x470 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.697365]  dsl_sync_task_sync+0xac/0xf0 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.697457]  dsl_pool_sync+0x41c/0x5d0 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.697545]  spa_sync+0x56c/0xf90 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.697642]  ? spa_txg_history_init_io+0x193/0x1c0 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.697737]  txg_sync_thread+0x22b/0x3f0 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.697830]  ? __pfx_txg_sync_thread+0x10/0x10 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[  242.697920]  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl 9a231b3cca7f56f59d3d0383366e226e8262190c]
[  242.697928]  thread_generic_wrapper+0x5e/0x70 [spl 9a231b3cca7f56f59d3d0383366e226e8262190c]
[  242.697936]  kthread+0xde/0x110
[  242.697939]  ? __pfx_kthread+0x10/0x10
[  242.697940]  ret_from_fork+0x2c/0x50
[  242.697944]  </TASK>
[  242.697944] Modules linked in: xt_MASQUERADE xt_mark snd_seq_dummy snd_hrtimer snd_seq tun wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel ip6table_nat ip6table_filter ip6_tables iptable_nat iptable_filter ext4 intel_rapl_msr intel_rapl_common edac_mce_amd mbcache kvm_amd snd_hda_codec_realtek rtw88_8822be jbd2 snd_hda_codec_generic snd_hda_codec_hdmi rtw88_8822b amdgpu rtw88_pci kvm rtw88_core snd_hda_intel irqbypass snd_intel_dspcfg crct10dif_pclmul snd_usb_audio btusb crc32_pclmul eeepc_wmi snd_intel_sdw_acpi polyval_clmulni snd_usbmidi_lib snd_hda_codec mac80211 asus_wmi polyval_generic btrtl snd_rawmidi drm_buddy btbcm gf128mul uvcvideo snd_hda_core ledtrig_audio ghash_clmulni_intel btintel videobuf2_vmalloc gpu_sched snd_seq_device snd_hwdep sha512_ssse3 videobuf2_memops libarc4 sparse_keymap btmtk aesni_intel videobuf2_v4l2 drm_ttm_helper platform_profile vfat ttm crypto_simd fat video
[  242.697978]  wmi_bmof asus_wmi_sensors sp5100_tco cryptd snd_pcm videodev bluetooth i2c_piix4 pcspkr cfg80211 drm_display_helper igb videobuf2_common snd_timer rapl zenpower(OE) ecdh_generic snd cec mc dca soundcore rfkill ccp crc16 gpio_amdpt wmi gpio_generic acpi_cpufreq mousedev joydev mac_hid nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_masq nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c crc32c_generic crc32c_intel nfnetlink usbip_host usbip_core dm_multipath sg crypto_user fuse dm_mod loop zram bpf_preload ip_tables x_tables usbhid uas usb_storage zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) nvme nvme_core xhci_pci xhci_pci_renesas nvme_common
[  242.698010] CR2: 0000000000000058
[  242.698011] ---[ end trace 0000000000000000 ]---
[  242.698012] RIP: 0010:dsl_deadlist_remove_key+0xce/0x1b0 [zfs]
[  242.698098] Code: ff ff 31 d2 4c 89 e6 4c 89 ef 48 89 5c 24 30 e8 88 05 d5 ff 31 d2 4c 89 ef 48 89 c6 49 89 c4 e8 68 04 d5 ff 49 8b 3f 48 89 c6 <49> 8b 44 24 58 48 8b 56 58 48 89 74 24 10 48 89 04 24 48 89 54 24
[  242.698100] RSP: 0018:ffffbb739d447848 EFLAGS: 00010246
[  242.698101] RAX: 0000000000000000 RBX: 000000000179684d RCX: 0000000000000001
[  242.698102] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b21ce78d800
[  242.698103] RBP: ffff9b22e005d440 R08: 0000000000000000 R09: 0000000000000000
[  242.698104] R10: 2000000000000000 R11: 0000008000000000 R12: 0000000000000000
[  242.698105] R13: ffff9b21f2dc91a8 R14: ffff9b21f2dc9210 R15: ffff9b21f2dc9198
[  242.698106] FS:  0000000000000000(0000) GS:ffff9b28de980000(0000) knlGS:0000000000000000
[  242.698107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  242.698108] CR2: 0000000000000058 CR3: 000000020c8fe000 CR4: 0000000000350ee0
[  242.698109] note: txg_sync[11527] exited with irqs disabled
@edward-shen edward-shen added the Type: Defect Incorrect behavior (e.g. crash, hang) label Aug 13, 2023
@edward-shen
Author

I've updated to Linux 6.4.9 and zfs 2.1.12 and I've narrowed it down to deleting a snapshot. When zrepl prunes snapshots, it seems to trigger the issue.

The kernel oops is a little different, so I'll provide it as well:

[  285.939477] BUG: kernel NULL pointer dereference, address: 0000000000000058
[  285.939482] #PF: supervisor read access in kernel mode
[  285.939484] #PF: error_code(0x0000) - not-present page
[  285.939486] PGD 0 P4D 0
[  285.939489] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  285.939492] CPU: 8 PID: 9947 Comm: txg_sync Tainted: P           OE      6.4.9-zen1-1-zen #1 ca877e4dbf13ba6c89be10931a05c2536cbe2867
[  285.939496] Hardware name: System manufacturer System Product Name/ROG STRIX B450-I GAMING, BIOS 5003 02/03/2023
[  285.939498] RIP: 0010:dsl_deadlist_remove_key+0xce/0x1b0 [zfs]
[  285.939616] Code: ff ff 31 d2 4c 89 e6 4c 89 ef 48 89 5c 24 30 e8 48 03 ca ff 31 d2 4c 89 ef 48 89 c6 49 89 c4 e8 28 02 ca ff 49 8b 3f 48 89 c6 <49> 8b 44 24 58 48 8b 56 58 48 89 74 24 10 48 89 04 24 48 89 54 24
[  285.939619] RSP: 0018:ffffb9f81d857860 EFLAGS: 00010282
[  285.939622] RAX: 0000000000000000 RBX: 000000000179684d RCX: 0000000000000001
[  285.939625] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff902e8c73d800
[  285.939627] RBP: ffff902e9a15a240 R08: 0000000000000000 R09: 0000000000000000
[  285.939629] R10: 2000000000000000 R11: 0000008000000000 R12: 0000000000000000
[  285.939631] R13: ffff902eb15eb1a8 R14: ffff902eb15eb210 R15: ffff902eb15eb198
[  285.939633] FS:  0000000000000000(0000) GS:ffff90359ec00000(0000) knlGS:0000000000000000
[  285.939636] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  285.939638] CR2: 0000000000000058 CR3: 000000010f820000 CR4: 0000000000350ee0
[  285.939640] Call Trace:
[  285.939644]  <TASK>
[  285.939647]  ? __die+0x10f/0x120
[  285.939653]  ? page_fault_oops+0x171/0x4e0
[  285.939659]  ? exc_page_fault+0x7f/0x180
[  285.939664]  ? asm_exc_page_fault+0x26/0x30
[  285.939670]  ? dsl_deadlist_remove_key+0xce/0x1b0 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.939771]  dsl_destroy_snapshot_sync_impl+0xb3f/0xda0 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.939872]  dsl_destroy_snapshot_sync+0x69/0xb0 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.939970]  zcp_synctask_destroy+0xb1/0x1d0 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.940083]  zcp_synctask_wrapper+0xa3/0x170 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.940189]  ? __pfx_zcp_synctask_wrapper+0x10/0x10 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.940292]  luaD_precall+0x233/0x550 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940305]  luaV_execute+0x997/0x2130 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940317]  ? __x86_return_thunk+0x9/0x10
[  285.940320]  ? luaE_extendCI+0x1b/0x40 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940330]  ? __x86_return_thunk+0x9/0x10
[  285.940333]  ? luaD_precall+0x496/0x550 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940344]  luaD_call+0x12b/0x140 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940354]  ? luaH_newkey+0x31c/0x370 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940365]  luaD_rawrunprotected+0x83/0xc0 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940376]  ? __pfx_f_call+0x10/0x10 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940387]  ? luaD_rawrunprotected+0x68/0xc0 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940397]  luaD_pcall+0x38/0x160 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940407]  ? __x86_return_thunk+0x9/0x10
[  285.940410]  lua_pcallk+0xa8/0x1d0 [zlua 7fc0c622338354adc04520f51f4507f37d620181]
[  285.940422]  zcp_eval_impl+0xc5/0x470 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.940528]  dsl_sync_task_sync+0xac/0xf0 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.940635]  dsl_pool_sync+0x420/0x5c0 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.940737]  spa_sync+0x57e/0x1020 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.940846]  ? __x86_return_thunk+0x9/0x10
[  285.940850]  ? spa_txg_history_init_io+0x193/0x1c0 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.940967]  txg_sync_thread+0x22b/0x3f0 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.941078]  ? __pfx_txg_sync_thread+0x10/0x10 [zfs aa1eb18e4c61ca7eb3090da8843216a35f3c1d18]
[  285.941185]  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl 5d6b0e8be260c74211ac3afa8dcf41589ddd9ae9]
[  285.941195]  thread_generic_wrapper+0x5e/0x70 [spl 5d6b0e8be260c74211ac3afa8dcf41589ddd9ae9]
[  285.941205]  kthread+0xe8/0x120
[  285.941208]  ? __pfx_kthread+0x10/0x10
[  285.941211]  ret_from_fork+0x2c/0x50
[  285.941218]  </TASK>
[  285.941219] Modules linked in: xt_MASQUERADE xt_mark snd_seq_dummy snd_hrtimer snd_seq tun ip6table_nat ip6table_filter ip6_tables iptable_nat iptable_filter wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel intel_rapl_msr intel_rapl_common edac_mce_amd amdgpu kvm_amd rtw88_8822be vfat rtw88_8822b fat snd_hda_codec_realtek kvm ext4 rtw88_pci snd_hda_codec_generic snd_hda_codec_hdmi rtw88_core snd_hda_intel mbcache irqbypass drm_buddy jbd2 crct10dif_pclmul gpu_sched snd_intel_dspcfg crc32_pclmul drm_suballoc_helper polyval_clmulni mac80211 drm_ttm_helper uvcvideo snd_intel_sdw_acpi polyval_generic btusb eeepc_wmi snd_usb_audio gf128mul snd_hda_codec asus_wmi btrtl btbcm ttm videobuf2_vmalloc ghash_clmulni_intel snd_usbmidi_lib uvc libarc4 sha512_ssse3 btintel snd_hda_core ledtrig_audio snd_rawmidi btmtk aesni_intel snd_hwdep snd_seq_device sparse_keymap videobuf2_memops crypto_simd platform_profile asus_wmi_sensors wmi_bmof snd_pcm
[  285.941285]  videobuf2_v4l2 cryptd drm_display_helper igb bluetooth cfg80211 rapl videodev snd_timer cec sp5100_tco videobuf2_common i2c_algo_bit snd video ecdh_generic mc mousedev pcspkr joydev acpi_cpufreq dca rfkill soundcore ccp crc16 i2c_piix4 zenpower(OE) wmi gpio_amdpt nft_reject_inet gpio_generic nf_reject_ipv4 nf_reject_ipv6 mac_hid nft_reject nft_masq nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c crc32c_generic crc32c_intel nfnetlink usbip_host usbip_core dm_multipath sg crypto_user fuse dm_mod loop zram bpf_preload ip_tables x_tables usbhid zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) nvme nvme_core xhci_pci xhci_pci_renesas nvme_common
[  285.941344] CR2: 0000000000000058
[  285.941347] ---[ end trace 0000000000000000 ]---
[  285.941348] RIP: 0010:dsl_deadlist_remove_key+0xce/0x1b0 [zfs]
[  285.941452] Code: ff ff 31 d2 4c 89 e6 4c 89 ef 48 89 5c 24 30 e8 48 03 ca ff 31 d2 4c 89 ef 48 89 c6 49 89 c4 e8 28 02 ca ff 49 8b 3f 48 89 c6 <49> 8b 44 24 58 48 8b 56 58 48 89 74 24 10 48 89 04 24 48 89 54 24
[  285.941454] RSP: 0018:ffffb9f81d857860 EFLAGS: 00010282
[  285.941457] RAX: 0000000000000000 RBX: 000000000179684d RCX: 0000000000000001
[  285.941458] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff902e8c73d800
[  285.941460] RBP: ffff902e9a15a240 R08: 0000000000000000 R09: 0000000000000000
[  285.941462] R10: 2000000000000000 R11: 0000008000000000 R12: 0000000000000000
[  285.941463] R13: ffff902eb15eb1a8 R14: ffff902eb15eb210 R15: ffff902eb15eb198
[  285.941465] FS:  0000000000000000(0000) GS:ffff90359ec00000(0000) knlGS:0000000000000000
[  285.941467] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  285.941469] CR2: 0000000000000058 CR3: 000000010f820000 CR4: 0000000000350ee0
[  285.941471] note: txg_sync[9947] exited with irqs disabled

@edward-shen
Author

Okay, I've found the specific snapshot that I can't destroy. Running the command zfs destroy zroot/ROOT/default@zrepl_20230812_063637_000 consistently results in the null deref.
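For anyone else who hits this: zfs destroy supports a dry run, which may be worth checking before risking the panic (a sketch; note the -n dry run only exercises open-context checks, so it likely will not reproduce the sync-task oops, but it confirms the target and what would be reclaimed):

```shell
#!/bin/sh
# Sketch: dry-run the destroy before triggering the real sync task.
# -n: no-op dry run; -v: print what would be destroyed.
snap="zroot/ROOT/default@zrepl_20230812_063637_000"
if command -v zfs >/dev/null 2>&1; then
    zfs destroy -n -v "$snap"
else
    echo "zfs not installed; would run: zfs destroy -n -v $snap"
fi
```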

@edward-shen
Author

Unfortunately, scrub does not consider this corruption. zpool status reports 0 errors; zpool events -v reports the destroy request, but it never completes:

Aug 13 2023 14:23:00.329902736 sysevent.fs.zfs.history_event
        version = 0x0
        class = "sysevent.fs.zfs.history_event"
        pool = "zroot"
        pool_guid = 0x6365c74923a255d
        pool_state = 0x0
        pool_context = 0x0
        history_hostname = "kurante"
        history_internal_str = "func=1 mintxg=0 maxtxg=24747947"
        history_internal_name = "scan setup"
        history_txg = 0x1799fab
        history_time = 0x64d949b4
        time = 0x64d949b4 0x13a9ea90
        eid = 0xe

Aug 13 2023 14:30:36.739901684 sysevent.fs.zfs.history_event
        version = 0x0
        class = "sysevent.fs.zfs.history_event"
        pool = "zroot"
        pool_guid = 0x6365c74923a255d
        pool_state = 0x0
        pool_context = 0x0
        history_hostname = "kurante"
        history_internal_str = "errors=0"
        history_internal_name = "scan done"
        history_txg = 0x179a09a
        history_time = 0x64d94b7c
        time = 0x64d94b7c 0x2c1a00f4
        eid = 0xf

Aug 13 2023 14:30:36.740901685 sysevent.fs.zfs.scrub_finish
        version = 0x0
        class = "sysevent.fs.zfs.scrub_finish"
        pool = "zroot"
        pool_guid = 0x6365c74923a255d
        pool_state = 0x0
        pool_context = 0x0
        time = 0x64d94b7c 0x2c294335
        eid = 0x10

Aug 13 2023 14:31:41.913178342 sysevent.fs.zfs.history_event
        version = 0x0
        class = "sysevent.fs.zfs.history_event"
        pool = "zroot"
        pool_guid = 0x6365c74923a255d
        pool_state = 0x0
        pool_context = 0x0
        history_hostname = "kurante"
        history_dsname = "zroot/ROOT/default@zrepl_20230812_063637_000"
        history_internal_str = " "
        history_internal_name = "destroy"
        history_dsid = 0x9658
        history_txg = 0x179a0d7
        history_time = 0x64d94bbd
        time = 0x64d94bbd 0x366dfee6
        eid = 0x11

@edward-shen
Author

Thankfully, as long as I don't try to destroy this snapshot, I don't think I'll trigger the bug. Unfortunately, I can't figure out how to delete the snapshot without going through this codepath.

I could send this snapshot around, but I'm not sure if I should -- I'm not sure how to debug this issue further (or whether it's a module bug or a corrupted snapshot).

@rincebrain
Contributor

I'd probably run zdb -AAA -G -dddddd [path to that snapshot] and see if it prints anything really angry.

I also wouldn't suggest running any version of OpenZFS on 6.3/6.4 right now - cf. #15140.

@edward-shen
Author

@rincebrain, thanks for the reply.

zdb -G seems to note 2 data errors:

kurante# zdb -G zroot/ROOT/default@zrepl_20230812_063637_000
Dataset zroot/ROOT/default@zrepl_20230812_063637_000 [ZPL], ID 38488, cr_txg 24733773, 130G, 1002344 objects

ZFS_DBGMSG(zdb) START:
spa.c:5294:spa_open_common(): spa_open_common: opening zroot/ROOT/default@zrepl_20230812_063637_000
spa_misc.c:417:spa_load_note(): spa_load(zroot, config trusted): LOADING
vdev.c:161:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/nvme-Samsung_SSD_970_EVO_500GB_S466NX0K506705N-part2': best uberblock found for spa zroot. txg 24756646
spa_misc.c:417:spa_load_note(): spa_load(zroot, config untrusted): using uberblock with txg=24756646
spa_misc.c:417:spa_load_note(): spa_load(zroot, config trusted): spa_load_verify found 0 metadata errors and 2 data errors
spa.c:8514:spa_async_request(): spa=zroot async request task=2048
spa_misc.c:417:spa_load_note(): spa_load(zroot, config trusted): LOADED
ZFS_DBGMSG(zdb) END

The number of data errors seems to vary; on another attempt it printed only one data error.

Adding -dddddd produced a whopping 723 MB, 12,966,171-line file to read through:

kurante# zdb -G -dddddd zroot/ROOT/default@zrepl_20230812_063637_000 > /tmp/logs
kurante# wc -l /tmp/logs
12966171 /tmp/logs
kurante# tail /tmp/logs -n20
Indirect blocks:
               0 L0 0:643524e000:1000 400L/400P F=1 B=24702407/24702407 cksum=00141f3d44bd50a4:0508b0f9917ae224:036ba4c3b70e5a79:afaddc411b79afc5

		segment [0000000000000000, 0000000000000400) size    1K

    Dnode slots:
	Total used:       1002344
	Max used:         1632127
	Percent empty:  38.586642


ZFS_DBGMSG(zdb) START:
spa.c:5294:spa_open_common(): spa_open_common: opening zroot/ROOT/default@zrepl_20230812_063637_000
spa_misc.c:417:spa_load_note(): spa_load(zroot, config trusted): LOADING
vdev.c:161:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/nvme-Samsung_SSD_970_EVO_500GB_S466NX0K506705N-part2': best uberblock found for spa zroot. txg 24757546
spa_misc.c:417:spa_load_note(): spa_load(zroot, config untrusted): using uberblock with txg=24757546
spa_misc.c:417:spa_load_note(): spa_load(zroot, config trusted): spa_load_verify found 0 metadata errors and 1 data errors
spa.c:8514:spa_async_request(): spa=zroot async request task=2048
spa_misc.c:417:spa_load_note(): spa_load(zroot, config trusted): LOADED
ZFS_DBGMSG(zdb) END

But running it without -AAA made no difference in the output, which I hope implies that nothing angry is being printed.

A very brief visual skim also didn't turn up anything unusual, though of course I may have missed something. Are there any keywords I should search for?
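One low-tech approach to a dump this size is to grep it for the words that usually accompany damage (a sketch; the keyword list here is a guess, not an exhaustive set of zdb's error strings):

```shell
#!/bin/sh
# Sketch: scan the zdb -dddddd dump for likely trouble markers rather
# than reading 13M lines; cap the output so a noisy match stays readable.
log=/tmp/logs
if [ -r "$log" ]; then
    grep -nEi 'error|invalid|corrupt|leaked|bad cksum|failed' "$log" | head -n 40
else
    echo "no dump at $log"
fi
```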

I also wouldn't suggest running any version of OpenZFS on 6.3/6.4 right now

Thanks for the heads-up; I'll revert back to 6.2.

@rincebrain
Contributor

Yeah, it might find some if you run it while the pool's imported, since it's reading the pool independently of the running system, so it might change underneath zdb.

Mostly, the reason to run it with all those -d flags was not the individual output but to see whether it barfed anywhere while running. Hmmm.

@edward-shen
Author

edward-shen commented Aug 17, 2023

Okay, building on cae502c, loading zfs.ko, and reproducing the bug again:

...
[68291.032126] RIP: 0010:dsl_deadlist_remove_key+0xce/0x1b0 [zfs]
... snip ...
[68291.032233] Call Trace:
[68291.032235]  <TASK>
[68291.032237]  dsl_destroy_snapshot_sync_impl+0xb0c/0xdb0 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[68291.032324]  dsl_destroy_snapshot_sync+0x69/0xb0 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
[68291.032407]  zcp_synctask_destroy+0xb1/0x1d0 [zfs c4187efd0e8626b7a62e598db8552909369c5258]
...

This time I've loaded gdb, and that seems to provide some hints. Unfortunately, I don't know which is the actual faulting line here (it also looks like some frames are missing from the call trace), so I'm assuming it's the one at RIP?

(gdb) list *(dsl_deadlist_remove_key+0xce)
0x1959ae is in dsl_deadlist_remove_key (/home/edward/repos/zfs/module/zfs/dsl_deadlist.c:559).
554             dle = avl_find(&dl->dl_tree, &dle_tofind, NULL);
555             ASSERT3P(dle, !=, NULL);
556             dle_prev = AVL_PREV(&dl->dl_tree, dle);
557             ASSERT3P(dle_prev, !=, NULL);
558
559             dle_enqueue_subobj(dl, dle_prev, dle->dle_bpobj.bpo_object, tx);
560
561             avl_remove(&dl->dl_tree, dle);
562             bpobj_close(&dle->dle_bpobj);
563             kmem_free(dle, sizeof (*dle));

(gdb) list *(dsl_destroy_snapshot_sync_impl+0xb0c)
0x19b33c is in dsl_destroy_snapshot_sync_impl (/home/edward/repos/zfs/module/zfs/dsl_destroy.c:466).
461                         dsl_dataset_phys(ds)->ds_prev_snap_txg,
462                         dsl_dataset_phys(ds)->ds_creation_txg,
463                         &used, &comp, &uncomp);
464                     dsl_dataset_phys(ds_next)->ds_unique_bytes += used;
465                     dsl_dataset_rele(ds_nextnext, FTAG);
466                     ASSERT3P(ds_next->ds_prev, ==, NULL);
467
468                     /* Collapse range in this head. */
469                     dsl_dataset_t *hds;
470                     VERIFY0(dsl_dataset_hold_obj(dp,

As much as I would love to figure out how to debug and fix this myself, I get terrified every time I mess around with my live machine, especially since this bug has stopped snapshot management for me.

Just now I had to drop into an archiso to repair and recover my machine, since I apparently broke something by running sudo make install without first running sudo make uninstall while the module was still live. I chalk this up to inexperience with kernel modules.
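As an aside, the same RIP-to-source mapping that gdb's list gives can also be had from the kernel tree's scripts/faddr2line helper, which accepts the symbol+offset/length string straight from the oops (a sketch; the repo paths are assumptions modeled on the paths in the gdb output, and the .ko must carry debug info):

```shell
#!/bin/sh
# Sketch: resolve the faulting instruction to a source line with
# faddr2line, using the func+offset/length printed in the oops RIP line.
func="dsl_deadlist_remove_key+0xce/0x1b0"
if [ -x "$HOME/repos/linux/scripts/faddr2line" ] && [ -f "$HOME/repos/zfs/module/zfs.ko" ]; then
    "$HOME/repos/linux/scripts/faddr2line" "$HOME/repos/zfs/module/zfs.ko" "$func"
else
    echo "would run: scripts/faddr2line zfs.ko $func"
fi
```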

@edward-shen
Author

I've also posted this to the zfs-discuss@list.zfsonlinux.org mailing list.

@edward-shen
Author

I'm starting to believe there's no way to recover from this, and I'll need to do a full wipe and restore. Because a dataset can't be destroyed while it still has snapshots, I can't even destroy the dataset containing the problematic snapshot. So even if I could send and restore from a snapshot into a new dataset without the problematic snapshot, I'd never be able to destroy the old dataset.

@rincebrain
Contributor

rincebrain commented Sep 3, 2023

My only wild thought would be if this were a strangely reproducible example of a failure case like #13605 or #13620 where the custom kernel is forcing auto-vectorization someplace inappropriate and it's breaking very strangely, so you could try pulling in #14649, but that's just wild speculation.

I'll be back at a real computer to test theories later this week, so hopefully I'll have other ideas then.

edit: The other thought would be: if we think the problem is somehow specific to the ZCP batch deletion, you could force it to take the codepath that doesn't use it. I think that would probably require code changes, though (I'm not at a machine with a git clone to check), but that's how it used to work before it was refactored into a ZCP.

@jpwarren

I've got a similar issue, which also seems to be related to an automated snapshot that now can't be deleted. The call trace is slightly different, so I'll put it here in case it helps figure out what's wrong.

This happens on the live system running kernel 6.2, and also when booted into my rescue ISO of Ubuntu 22.04.1 while copying data off the broken pool onto a fresh one. Either that, or my data has gotten corrupted in some strange way.

As per OP, I can copy data off and use other pools, but anything that touches the broken pool, like zfs list or zpool export, just hangs.

Linux ubuntu 5.19.0-32-generic #33~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Jan 30 17:03:34 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
[  715.627220] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  715.627224] #PF: supervisor read access in kernel mode
[  715.627226] #PF: error_code(0x0000) - not-present page
[  715.627227] PGD 0 P4D 0 
[  715.627230] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  715.627231] CPU: 8 PID: 7605 Comm: txg_sync Tainted: P           O      5.19.0-32-generic #33~22.04.1-Ubuntu
[  715.627233] Hardware name: Micro-Star International Co., Ltd. MS-7C56/B550-A PRO (MS-7C56), BIOS A.C0 03/13/2023
[  715.627234] RIP: 0010:memcpy_erms+0x6/0x10
[  715.627239] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 cc cc cc cc 66 90 48 89 f8 48 89 d1 <f3> a4 c3 cc cc cc cc 0f 1f 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
[  715.627240] RSP: 0018:ffffa849dbd8b628 EFLAGS: 00010282
[  715.627242] RAX: ffffa84b89252680 RBX: ffff8e57416c1780 RCX: 0000000000000080
[  715.627243] RDX: 0000000000000080 RSI: 0000000000000000 RDI: ffffa84b89252680
[  715.627243] RBP: ffffa849dbd8b678 R08: 0000000000000000 R09: 0000000000000000
[  715.627244] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000680
[  715.627245] R13: 0000000000000080 R14: 0000000000000000 R15: ffff8e5cd2659800
[  715.627246] FS:  0000000000000000(0000) GS:ffff8e6551c00000(0000) knlGS:0000000000000000
[  715.627247] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  715.627248] CR2: 0000000000000000 CR3: 00000004eae10000 CR4: 0000000000750ee0
[  715.627249] PKRU: 55555554
[  715.627249] Call Trace:
[  715.627251]  <TASK>
[  715.627254]  ? dmu_write_impl+0x5a/0xf0 [zfs]
[  715.627313]  dmu_write+0x9e/0xe0 [zfs]
[  715.627356]  bpobj_enqueue_subobj+0x2ae/0x670 [zfs]
[  715.627396]  ? __slab_free+0xc3/0x340
[  715.627400]  dsl_deadlist_move_bpobj+0xb2/0x250 [zfs]
[  715.627447]  ? dsl_dir_diduse_space+0x135/0x1a0 [zfs]
[  715.627490]  dsl_destroy_snapshot_sync_impl+0x2a4/0xab0 [zfs]
[  715.627531]  dsl_destroy_snapshot_sync+0x62/0xc0 [zfs]
[  715.627572]  ? dsl_destroy_snapshot_check_impl+0x150/0x150 [zfs]
[  715.627633]  zcp_sync_task+0x62/0xf0 [zfs]
[  715.627729]  ? dsl_destroy_head_sync+0xd0/0xd0 [zfs]
[  715.627779]  zcp_synctask_destroy+0x87/0x120 [zfs]
[  715.627827]  zcp_synctask_wrapper+0xa6/0x180 [zfs]
[  715.627873]  ? dsl_dir_phys+0x20/0x20 [zfs]
[  715.627917]  luaD_precall+0xe8/0x290 [zlua]
[  715.627922]  luaV_execute+0xce4/0x1380 [zlua]
[  715.627927]  luaD_call+0x12d/0x150 [zlua]
[  715.627931]  f_call+0x16/0x30 [zlua]
[  715.627934]  luaD_rawrunprotected+0x65/0xa0 [zlua]
[  715.627938]  ? lua_setmetatable+0x110/0x110 [zlua]
[  715.627942]  ? luaD_rawrunprotected+0x50/0xa0 [zlua]
[  715.627945]  luaD_pcall+0x3a/0xa0 [zlua]
[  715.627949]  lua_pcallk+0x8c/0x140 [zlua]
[  715.627953]  zcp_eval_impl+0xba/0x480 [zfs]
[  715.628019]  zcp_eval_sync+0x2c/0x70 [zfs]
[  715.628092]  dsl_sync_task_sync+0xc6/0x120 [zfs]
[  715.628159]  dsl_pool_sync+0x379/0x410 [zfs]
[  715.628202]  spa_sync_iterate_to_convergence+0xe0/0x200 [zfs]
[  715.628251]  spa_sync+0x2dc/0x5c0 [zfs]
[  715.628294]  txg_sync_thread+0x266/0x2f0 [zfs]
[  715.628344]  ? txg_dispatch_callbacks+0x110/0x110 [zfs]
[  715.628389]  ? __thread_exit+0x20/0x20 [spl]
[  715.628394]  thread_generic_wrapper+0x64/0x80 [spl]
[  715.628398]  kthread+0xee/0x120
[  715.628401]  ? kthread_complete_and_exit+0x20/0x20
[  715.628403]  ret_from_fork+0x22/0x30
[  715.628407]  </TASK>
[  715.628407] Modules linked in: nfnetlink binfmt_misc snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_usb_audio snd_hda_codec zfs(PO) snd_hda_core snd_usbmidi_lib snd_hwdep snd_pcm intel_rapl_msr zunicode(PO) intel_rapl_common zzstd(O) uvcvideo snd_seq_midi snd_seq_midi_event videobuf2_vmalloc snd_rawmidi videobuf2_memops zlua(O) videobuf2_v4l2 edac_mce_amd videobuf2_common zavl(PO) snd_seq kvm_amd videodev icp(PO) snd_seq_device snd_timer zcommon(PO) input_leds joydev mc kvm znvpair(PO) snd spl(O) soundcore k10temp ccp rapl mac_hid wmi_bmof sch_fq_codel msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 overlay isofs dm_mirror dm_region_hash dm_log hid_generic uas usbhid hid usb_storage amdgpu iommu_v2 gpu_sched i2c_algo_bit drm_ttm_helper ttm drm_display_helper cec rc_core drm_kms_helper syscopyarea sysfillrect sysimgblt crct10dif_pclmul
[  715.628445]  crc32_pclmul nvme fb_sys_fops ghash_clmulni_intel aesni_intel crypto_simd drm cryptd r8169 nvme_core ahci i2c_piix4 xhci_pci realtek libahci xhci_pci_renesas wmi video gpio_amdpt
[  715.628455] CR2: 0000000000000000
[  715.628456] ---[ end trace 0000000000000000 ]---
[  715.769260] RIP: 0010:memcpy_erms+0x6/0x10
[  715.769273] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 cc cc cc cc 66 90 48 89 f8 48 89 d1 <f3> a4 c3 cc cc cc cc 0f 1f 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
[  715.769275] RSP: 0018:ffffa849dbd8b628 EFLAGS: 00010282
[  715.769278] RAX: ffffa84b89252680 RBX: ffff8e57416c1780 RCX: 0000000000000080
[  715.769280] RDX: 0000000000000080 RSI: 0000000000000000 RDI: ffffa84b89252680
[  715.769281] RBP: ffffa849dbd8b678 R08: 0000000000000000 R09: 0000000000000000
[  715.769282] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000680
[  715.769283] R13: 0000000000000080 R14: 0000000000000000 R15: ffff8e5cd2659800
[  715.769284] FS:  0000000000000000(0000) GS:ffff8e6551c00000(0000) knlGS:0000000000000000
[  715.769286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  715.769287] CR2: 0000000000000000 CR3: 0000000490496000 CR4: 0000000000750ee0
[  715.769289] PKRU: 55555554

@edward-shen
Author

To give an update on my end: since my last post I've given up on trying to fix it. I made a snapshot of the latest system state and restored it onto a new ZFS setup. Transplanting a ZFS-on-root system was interesting, to say the least, but at least my data was still intact at the end.

I still have the original drive should someone else need to run some tests, but otherwise I'm no longer affected by this.

@edward-shen
Author

With the noise around the recent silent-corruption bug (#15526), I'm curious whether this could have been a result of it: some unlucky alignment of concurrent ZFS operations producing malformed ZFS metadata.

Though I suppose with such a nasty bug, you can attribute a lot of things to it as a potential cause...

@rincebrain
Contributor

Nope.

The Lua interpreter in the kernel module is just buggy.
