btrfs check --mode=lowmem Segmentation fault (version 5.14.2) #412

wangyugui · 2021-10-09T04:17:29Z

steps to reproduce:

$ make test-check-lowmem
then a core file is left under tests/fsck-tests/012-leaf-corruption/

$ file tests/fsck-tests/012-leaf-corruption/core.67317
tests/fsck-tests/012-leaf-corruption/core.67317: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from '/ssd/git/os/btrfs-progs/btrfs check --mode=lowmem ./good.img.restored', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/ssd/git/os/btrfs-progs/btrfs', platform: 'x86_64'

$ gdb /ssd/git/os/btrfs-progs/btrfs tests/fsck-tests/012-leaf-corruption/core.67317
Core was generated by `/ssd/git/os/btrfs-progs/btrfs check --mode=lowmem ./good.img.restored'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 btrfs_inode_size (s=0x642e6cd1, eb=0xc28fc0) at ./kernel-shared/ctree.h:1709
1709 BTRFS_SETGET_FUNCS(inode_size, struct btrfs_inode_item, size, 64);
(gdb) where
#0 btrfs_inode_size (s=0x642e6cd1, eb=0xc28fc0) at ./kernel-shared/ctree.h:1709
#1 check_inode_item (root=root@entry=0xa43e90, path=path@entry=0x7fff8e5f36a0) at check/mode-lowmem.c:2628
#2 0x0000000000458c61 in process_one_leaf (level=<optimized out>, nrefs=0x7fff8e5f35c0, path=0x7fff8e5f36a0,
root=0xa43e90) at check/mode-lowmem.c:2896
#3 walk_down_tree (check_all=0, nrefs=0x7fff8e5f35c0, level=<optimized out>, path=0x7fff8e5f36a0, root=0xa43e90)
at check/mode-lowmem.c:4953
#4 check_btrfs_root (root=root@entry=0xa43e90, check_all=check_all@entry=0) at check/mode-lowmem.c:5254
#5 0x000000000045b908 in check_fs_root (root=0xa43e90) at check/mode-lowmem.c:5288
#6 check_fs_roots_lowmem () at check/mode-lowmem.c:5449
#7 0x0000000000432301 in do_check_fs_roots (root_cache=root_cache@entry=0x7fff8e5f43d8) at check/main.c:3911
#8 0x000000000043ea7f in cmd_check (cmd=<optimized out>, argc=<optimized out>, argv=<optimized out>)
at check/main.c:10818
#9 0x000000000040e130 in cmd_execute (argv=0x7fff8e5f4550, argc=3, cmd=0x6dce60 <cmd_struct_check>)
at cmds/commands.h:125
#10 main (argc=3, argv=0x7fff8e5f4550) at btrfs.c:405
(gdb)

wangyugui · 2021-10-30T00:15:21Z

This problem still happens on pre-release 5.15-rc1 (branch v5.15.x).

adam900710 · 2021-10-30T02:34:02Z

Can not reproduce here.

I'm testing commit 330b86c

adam900710 · 2021-10-30T03:00:02Z

And the result shows it's indeed running in lowmem mode, and everything is fine:
fsck-tests-results.txt

wangyugui · 2021-10-31T10:20:54Z

Uploaded the core file and the ELF file:

upload.tar.gz

v5.15.x branch, b40d2c7

adam900710 · 2021-10-31T11:14:57Z

Would you mind using valgrind or a "make D=asan" build and providing the full output?

It looks like some kind of memory corruption, so whether it crashes depends on the memory layout and is somewhat random.

adam900710 · 2021-10-31T11:18:11Z

BTW, for both modes I'm seeing a WARN_ON() triggered inside __free_extent().

But I don't think that's the direct cause of the crash.

wangyugui · 2021-10-31T11:39:49Z

Built on CentOS 7 (make D=asan) and tested on CentOS 7 => the crash does NOT happen.
This is the fsck-tests-results.txt of 'make test-check-lowmem':
fsck-tests-results.zip

adam900710 · 2021-10-31T11:48:20Z

One trick, if you only need to run one test, it can be done like this:

$ sudo TEST=012\* make test-check-lowmem

And if D=asan is not detecting the problem, you may want to go with valgrind.

I guess the problem happens for the --repair part, thus what you need is:

$ cp tests/fsck-tests/012-leaf-corruption/good.img.xz /tmp
$ unxz /tmp/good.img.xz
$ ./btrfs-image -r /tmp/good.img /tmp/image.raw
$ xfs_io -f -c "pwrite 4206592 32" -c "pwrite 20905984 32" /tmp/image.raw
$ valgrind ./btrfs check --mode=lowmem --repair --force /tmp/image.raw

wangyugui · 2021-10-31T12:39:30Z

valgrind caught something:
fsck-tests-results.txt

wangyugui · 2021-11-01T00:21:36Z

valgrind catches almost the same thing even without '--mode lowmem':
fsck-tests-results.txt

so this problem may happen without '--mode lowmem' too.

adam900710 · 2021-11-01T02:45:38Z

Oh, I forgot to check the .lowmem_repairable beacon, and that test case doesn't support lowmem repair anyway.

So the repair is all done in original mode, you can verify that in the fsck-tests-results even for lowmem mode:

====== RUN CHECK valgrind /ssd/git/os/btrfs-progs/btrfs check --repair --force ./good.img.restored

No --mode=lowmem.

So it's a bug in the original mode repair code.

Then the pwrite part seems to be a known false alert:

==24563== Syscall param pwrite64(buf) points to uninitialised byte(s)

So no need to worry about that.

But the important part is the warning part:

==24563== Conditional jump or move depends on uninitialised value(s)
==24563==    at 0x4214F8: warning_trace (kerncompat.h:107)
==24563==    by 0x4214F8: __free_extent (extent-tree.c:2049)
==24563==    by 0x4251C6: run_delayed_tree_ref (extent-tree.c:3785)
==24563==    by 0x4251C6: run_one_delayed_ref (extent-tree.c:3805)

This means the WARN_ON() can be randomly triggered.

The possible uninitialized value seems to be owner_objectid, but I don't know why btrfs_add_delayed_tree_ref() is not warning.
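
To illustrate why an uninitialised field would make the WARN_ON() fire only sometimes, here is a minimal sketch. It is not btrfs-progs code: `struct toy_ref`, `toy_ref_init()`, `toy_should_warn()` and the `TOY_LIMIT` threshold are all made up for the example. The point is only that a check on a field nobody wrote depends on whatever happened to be in memory, and that zeroing the record up front makes the check deterministic.

```c
#include <assert.h>
#include <string.h>

/* Arbitrary threshold for the example, not a real btrfs constant. */
#define TOY_LIMIT 256ULL

/* Hypothetical stand-in for a ref record; not a btrfs-progs structure. */
struct toy_ref {
	unsigned long long bytenr;
	unsigned long long owner_objectid;
};

/*
 * Zero the whole record before filling it in, so that no field is ever
 * left indeterminate. Without the memset(), owner_objectid would hold
 * stale stack or heap contents, and valgrind would flag any branch on it
 * as "Conditional jump or move depends on uninitialised value(s)".
 */
static void toy_ref_init(struct toy_ref *ref, unsigned long long bytenr)
{
	assert(ref != NULL);
	memset(ref, 0, sizeof(*ref));
	ref->bytenr = bytenr;
	/* owner_objectid stays 0 until a caller sets it explicitly. */
}

/* Mimics a WARN_ON()-style check: returns 1 if the warning would fire. */
static int toy_should_warn(const struct toy_ref *ref)
{
	return ref->owner_objectid >= TOY_LIMIT;
}
```

With the record zeroed, `toy_should_warn()` gives the same answer on every run. Without the zeroing it would depend on the memory layout, which would match the randomness seen here.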

BTW, does the D=asan build output anything?

wangyugui · 2021-11-01T03:12:21Z

D=asan reports almost the same thing for lowmem and non-lowmem.

fsck-tests-results-lowmem.txt
fsck-tests-results-no-lowmem.txt

adam900710 · 2021-11-01T06:39:14Z

BTW, do you have the test results from the original segfault?

wangyugui · 2021-11-01T12:55:28Z

The patch from Qu works well.
https://patchwork.kernel.org/project/linux-btrfs/patch/20211101113017.52665-1-wqu@suse.com/

kdave · 2021-11-01T19:26:55Z

Thanks for the report and tracking it down. Fixed in devel and will be in 5.15.

@err

…properly handled

[BUG]
When a special image (diverted from fsck/012) has its unused slots (slot number >= nritems) with garbage, lowmem mode btrfs check can crash:

  (gdb) run check --mode=lowmem ~/downloads/good.img.restored
  Starting program: /home/adam/btrfs/btrfs-progs/btrfs check --mode=lowmem ~/downloads/good.img.restored
  ...
  ERROR: root 5 INODE[5044031582654955520] nlink(257228800) not equal to inode_refs(0)
  ERROR: root 5 INODE[5044031582654955520] nbytes 474624 not equal to extent_size 0

  Program received signal SIGSEGV, Segmentation fault.
  0x0000555555639b11 in btrfs_inode_size (eb=0x5555558a7540, s=0x642e6cd1) at ./kernel-shared/ctree.h:1703
  1703 BTRFS_SETGET_FUNCS(inode_size, struct btrfs_inode_item, size, 64);
  (gdb) bt
  #0 0x0000555555639b11 in btrfs_inode_size (eb=0x5555558a7540, s=0x642e6cd1) at ./kernel-shared/ctree.h:1703
  #1 0x0000555555641544 in check_inode_item (root=0x5555556c2290, path=0x7fffffffd960) at check/mode-lowmem.c:2628

[CAUSE]
At check_inode_item() we have path->slot[0] at 29, while the tree block only has 26 items.

This happens because two reasons:

- btrfs_next_item() never reverts its slots
  Even if we failed to read next leaf.

- check_inode_item() doesn't inform the caller that a fatal error happened
  In check_inode_item(), if btrfs_next_item() failed, it goes to out label, which doesn't really set @err properly.

This means, when check_inode_item() fails at btrfs_next_item(), it will increase path->slots[0], while it's already beyond current tree block nritems.

When the slot increases furthermore, and if the unused item slots have some garbage, we will get invalid btrfs_item_ptr() result, and causing above segfault.

[FIX]
Fix the problems by two ways:

- Make btrfs_next_item() to revert its path->slots[0] on failure

- Properly detect fatal error from check_inode_item()

By this, we will no longer crash on the crafted image.

Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Issue: #412
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
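
The two-part fix described in the commit message can be sketched with a toy model. This is not the actual patch: `struct toy_leaf`, `struct toy_path`, `toy_next_item()` and `toy_walk()` are hypothetical stand-ins. The sketch also assumes the slot is restored on every non-zero return, which may differ from what btrfs_next_item() really does. It only shows the idea: put the slot back when advancing fails, and pass the fatal error up to the caller so it stops iterating.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; NOT the real btrfs-progs structures. */
struct toy_leaf {
	int nritems;            /* number of valid items in this leaf */
	struct toy_leaf *next;  /* next leaf, or NULL if this is the last one */
	int next_unreadable;    /* nonzero simulates a failed read of the next leaf */
};

struct toy_path {
	struct toy_leaf *leaf;
	int slot;               /* must stay below leaf->nritems */
};

/*
 * Advance to the next item.
 * Returns 0 on success, 1 if there are no more items, <0 on error.
 * On any non-zero return the slot is restored, so it never points
 * past the valid items of the current leaf.
 */
static int toy_next_item(struct toy_path *path)
{
	int saved_slot;

	assert(path != NULL && path->leaf != NULL);
	saved_slot = path->slot;

	path->slot++;
	if (path->slot < path->leaf->nritems)
		return 0;

	/* We ran off the end of this leaf and need the next one. */
	if (path->leaf->next_unreadable) {
		path->slot = saved_slot;  /* revert on failure */
		return -5;                /* arbitrary negative error code for the toy */
	}
	if (path->leaf->next == NULL) {
		path->slot = saved_slot;  /* nothing left; stay on the last item */
		return 1;
	}
	path->leaf = path->leaf->next;
	path->slot = 0;
	return 0;
}

/*
 * Walk items the way a checker would, counting how many were visited.
 * A negative return from toy_next_item() is passed up as a fatal error
 * instead of being swallowed.
 */
static int toy_walk(struct toy_path *path, int *visited)
{
	int ret;

	*visited = 1;  /* the item the path currently points at */
	while ((ret = toy_next_item(path)) == 0)
		(*visited)++;
	return ret < 0 ? ret : 0;
}
```

In this model a leaf with 26 items can never be left with the slot at 29, so later reads never touch the garbage in unused slots.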

kdave added the bug label Oct 11, 2021

kdave added the check Changes in btrfs check label Nov 1, 2021

kdave added this to the v5.15 milestone Nov 1, 2021

kdave closed this as completed Nov 1, 2021
