btrfs check --mode=lowmem Segmentation fault (version 5.14.2) #412
This problem still happens on the pre-release 5.15-rc1 (branch v5.15.x).
Cannot reproduce here. I'm testing commit 330b86c.
And the result shows it's indeed running in lowmem mode, and everything is fine:
Uploaded the core file and the ELF file. v5.15.x branch, b40d2c7.
Would you mind using valgrind or a "make D=asan" build and providing the full output? It looks like some kind of memory corruption, so it has some randomness related to the memory layout.
BTW, for both modes I'm seeing a WARN_ON() triggered inside __free_extent(). But I don't think that's the direct cause of the crash.
Built on CentOS 7 (make D=asan) and tested on CentOS 7 => does NOT happen.
One trick, if you only need to run one test, it can be done like this:
And if D=asan is not detecting the problem, you may want to go with valgrind. I guess the problem happens for the --repair part, thus what you need is:
valgrind catches something.
valgrind catches almost the same thing even without '--mode lowmem', so this problem may happen without '--mode lowmem' too.
Oh, I forgot to check the So the repair is all done in original mode, you can verify that in the fsck-tests-results even for lowmem mode:
No --mode=lowmem. So it's a bug in the original mode repair code. Then the pwrite part seems to be a known false alert:
So no need to worry about that. But the important part is the warning part:
This means the WARN_ON() can be randomly triggered. The possible uninitialized value seems to be
BTW, does D=asan output anything?
D=asan reports almost the same thing for lowmem and non-lowmem. fsck-tests-results-lowmem.txt
BTW, do you have the original segfault test results?
The patch from Wu works well.
Thanks for the report and tracking it down. Fixed in devel and will be in 5.15.
…properly handled

[BUG]
When a special image (derived from fsck/012) has garbage in its unused slots (slot number >= nritems), lowmem mode btrfs check can crash:

  (gdb) run check --mode=lowmem ~/downloads/good.img.restored
  Starting program: /home/adam/btrfs/btrfs-progs/btrfs check --mode=lowmem ~/downloads/good.img.restored
  ...
  ERROR: root 5 INODE[5044031582654955520] nlink(257228800) not equal to inode_refs(0)
  ERROR: root 5 INODE[5044031582654955520] nbytes 474624 not equal to extent_size 0

  Program received signal SIGSEGV, Segmentation fault.
  0x0000555555639b11 in btrfs_inode_size (eb=0x5555558a7540, s=0x642e6cd1) at ./kernel-shared/ctree.h:1703
  1703 BTRFS_SETGET_FUNCS(inode_size, struct btrfs_inode_item, size, 64);
  (gdb) bt
  #0 0x0000555555639b11 in btrfs_inode_size (eb=0x5555558a7540, s=0x642e6cd1) at ./kernel-shared/ctree.h:1703
  #1 0x0000555555641544 in check_inode_item (root=0x5555556c2290, path=0x7fffffffd960) at check/mode-lowmem.c:2628

[CAUSE]
At check_inode_item() we have path->slots[0] at 29, while the tree block only has 26 items.

This happens for two reasons:

- btrfs_next_item() never reverts its slots
  Even if we failed to read the next leaf.

- check_inode_item() doesn't inform the caller that a fatal error happened
  In check_inode_item(), if btrfs_next_item() failed, it goes to the out label, which doesn't really set @err properly.

This means that when check_inode_item() fails at btrfs_next_item(), it will increase path->slots[0] while it's already beyond the current tree block's nritems.

When the slot increases further, and if the unused item slots have some garbage, we get an invalid btrfs_item_ptr() result, causing the above segfault.

[FIX]
Fix the problems in two ways:

- Make btrfs_next_item() revert its path->slots[0] on failure
- Properly detect fatal errors from check_inode_item()

With this, we no longer crash on the crafted image.

Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Issue: #412
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Steps to reproduce:
$ make test-check-lowmem
Then a core file is left under tests/fsck-tests/012-leaf-corruption/
$ file tests/fsck-tests/012-leaf-corruption/core.67317
tests/fsck-tests/012-leaf-corruption/core.67317: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from '/ssd/git/os/btrfs-progs/btrfs check --mode=lowmem ./good.img.restored', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/ssd/git/os/btrfs-progs/btrfs', platform: 'x86_64'
$ gdb /ssd/git/os/btrfs-progs/btrfs tests/fsck-tests/012-leaf-corruption/core.67317
Core was generated by `/ssd/git/os/btrfs-progs/btrfs check --mode=lowmem ./good.img.restored'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 btrfs_inode_size (s=0x642e6cd1, eb=0xc28fc0) at ./kernel-shared/ctree.h:1709
1709 BTRFS_SETGET_FUNCS(inode_size, struct btrfs_inode_item, size, 64);
(gdb) where
#0 btrfs_inode_size (s=0x642e6cd1, eb=0xc28fc0) at ./kernel-shared/ctree.h:1709
#1 check_inode_item (root=root@entry=0xa43e90, path=path@entry=0x7fff8e5f36a0) at check/mode-lowmem.c:2628
#2 0x0000000000458c61 in process_one_leaf (level=<optimized out>, nrefs=0x7fff8e5f35c0, path=0x7fff8e5f36a0,
root=0xa43e90) at check/mode-lowmem.c:2896
#3 walk_down_tree (check_all=0, nrefs=0x7fff8e5f35c0, level=<optimized out>, path=0x7fff8e5f36a0, root=0xa43e90)
at check/mode-lowmem.c:4953
#4 check_btrfs_root (root=root@entry=0xa43e90, check_all=check_all@entry=0) at check/mode-lowmem.c:5254
#5 0x000000000045b908 in check_fs_root (root=0xa43e90) at check/mode-lowmem.c:5288
#6 check_fs_roots_lowmem () at check/mode-lowmem.c:5449
#7 0x0000000000432301 in do_check_fs_roots (root_cache=root_cache@entry=0x7fff8e5f43d8) at check/main.c:3911
#8 0x000000000043ea7f in cmd_check (cmd=<optimized out>, argc=<optimized out>, argv=<optimized out>)
at check/main.c:10818
#9 0x000000000040e130 in cmd_execute (argv=0x7fff8e5f4550, argc=3, cmd=0x6dce60 <cmd_struct_check>)
at cmds/commands.h:125
#10 main (argc=3, argv=0x7fff8e5f4550) at btrfs.c:405
(gdb)