Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

btrfs-progs: cmds: Allow setting compression level per file #229

Closed
wants to merge 27 commits into from

Conversation

qwerty123443
Copy link

This fixes #184

kdave and others added 26 commits December 4, 2019 13:31
The variable XMLTO_EXTRA was reset to empty by mistake, so this does not
skip validation for docbook5 target and thus the build fails.

Issue: kdave#222
Signed-off-by: David Sterba <dsterba@suse.com>
The value of checksum was not set making mkfs fail.

Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It seems to be a typo, since $dev1 was not set, and we are dealing with
TEST_DEV and not with loop devices. As this is now equivalent to the
common mount/umount helpers, use them.

Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Documentation folder path is wrong on exported testsutie. Fix this by
replace TOP with INTERNAL_BIN.

Signed-off-by: An Long <lan@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
All utilities should be found in the $TOP directory, INTERNAL_BIN was
there by accident.

Signed-off-by: David Sterba <dsterba@suse.com>
The comment about data checksum on disk_bytes is completely wrong.  Sync
it with fixed kernel comment to avoid confusion.

Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since commit e388bf3 ("btrfs-progs: check: warn users about the
possible dangers of --repair") `btrfs check --repair` will wait 10
seconds before really repair the fs.

This hugely slow down the fsck tests. Add --force for check_image()

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
For certain fuzzed image, `btrfs check` will fail with the following
call trace:

  Checking filesystem on issue_213.raw
  UUID: 99e50868-0bda-4d89-b0e4-7e8560312ef9
  [1/7] checking root items
  [2/7] checking extents
  Program received signal SIGABRT, Aborted.
  0x00007ffff7c88f25 in raise () from /usr/lib/libc.so.6
  (gdb) bt
  #0  0x00007ffff7c88f25 in raise () from /usr/lib/libc.so.6
  kdave#1  0x00007ffff7c72897 in abort () from /usr/lib/libc.so.6
  kdave#2  0x00005555555abc3e in run_next_block (...) at check/main.c:6398
  kdave#3  0x00005555555b0f36 in deal_root_from_list (...) at check/main.c:8408
  kdave#4  0x00005555555b1a3d in check_chunks_and_extents (fs_info=0x5555556a1e30) at check/main.c:8690
  kdave#5  0x00005555555b1e3e in do_check_chunks_and_extents (fs_info=0x5555556a1e30) a
  kdave#6  0x00005555555b5710 in cmd_check (cmd=0x555555696920 <cmd_struct_check>, argc
  kdave#7  0x0000555555568dc7 in cmd_execute (cmd=0x555555696920 <cmd_struct_check>, ar
  kdave#8  0x0000555555569713 in main (argc=2, argv=0x7fffffffde70) at btrfs.c:386

[CAUSE]
This fuzzed images has a corrupted EXTENT_DATA item in data reloc tree:
        item 1 key (256 EXTENT_DATA 256) itemoff 16111 itemsize 12
                generation 0 type 2 (prealloc)
                prealloc data disk byte 16777216 nr 0
                prealloc data offset 0 nr 0

There are several problems with the item:
- Bad item size
  12 is too small.
- Bad key offset
  offset of EXTENT_DATA type key represents file offset, which should
  always be aligned to sector size (4K in this particular case).

[FIX]
Do extra item size and key offset check for original mode, and remove
the abort() call in run_next_block().

And to show off how robust lowmem mode is, lowmem can handle it without
any hiccup.

With this fix, original mode can detect the problem properly:
  Checking filesystem on issue_213.raw
  UUID: 99e50868-0bda-4d89-b0e4-7e8560312ef9
  [1/7] checking root items
  [2/7] checking extents
  ERROR: invalid file extent item size, have 12 expect (21, 16283]
  ERROR: errors found in extent allocation tree or chunk allocation
  [3/7] checking free space cache
  [4/7] checking fs roots
  root 18446744073709551607 root dir 256 error
  root 18446744073709551607 inode 256 errors 62, no orphan item, odd file extent, bad file extent
  ERROR: errors found in fs roots
  found 131072 bytes used, error(s) found
  total csum bytes: 0
  total tree bytes: 131072
  total fs tree bytes: 32768
  total extent tree bytes: 16384
  btree space waste bytes: 124774
  file data blocks allocated: 0
   referenced 0

Issue: kdave#213
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
…tree_block()

[BUG]
For a fuzzed image, `btrfs check` will segfault at open_ctree() stage:

  $ btrfs check --mode=lowmem issue_207.raw
  Opening filesystem to check...
  extent_io.c:665: free_extent_buffer_internal: BUG_ON `eb->refs < 0` triggered, value 1
  btrfs(+0x6bf67)[0x56431d278f67]
  btrfs(+0x6c16e)[0x56431d27916e]
  btrfs(alloc_extent_buffer+0x45)[0x56431d279db5]
  btrfs(read_tree_block+0x59)[0x56431d2848f9]
  btrfs(btrfs_setup_all_roots+0x29c)[0x56431d28535c]
  btrfs(+0x78903)[0x56431d285903]
  btrfs(open_ctree_fs_info+0x90)[0x56431d285b60]
  btrfs(+0x45a01)[0x56431d252a01]
  btrfs(main+0x94)[0x56431d2220c4]
  /usr/lib/libc.so.6(__libc_start_main+0xf3)[0x7f6e28519153]
  btrfs(_start+0x2e)[0x56431d22235e]

[CAUSE]
The fuzzed image has a strange log root bytenr:

  log_root                61440
  log_root_transid        0

In fact, the log_root seems to be fuzzed, as its transid is 0, which is
invalid.

Note that range [61440, 77824) covers the physical offset of the primary
super block.

The bug is caused by the following sequence:

1. cache for tree block [64K, 68K) is created by open_ctree()
   __open_ctree_fd()
   |- btrfs_setup_chunk_tree_and_device_map()
      |- btrfs_read_sys_array()
         |- sb = btrfs_find_create_tree_block()
         |- free_extent_buffer(sb)

   This created an extent buffer [64K, 68K) in fs_info->extent_cache, then
   reduce the refcount of that eb back to 0, but not freed yet.

2. Try to read that corrupted log root
   __open_ctree_fd()
   |- btrfs_setup_chunk_tree_and_device_map()
   |- btrfs_setup_all_roots()
      |- find_and_setup_log_root()
         |- read_tree_block()
            |- btrfs_find_create_tree_block()
               |- alloc_extent_buffer()

   The final alloc_extent_buffer() will try to free that cached eb
   [64K, 68K), since it doesn't match with current search.
   And since that cached eb is already released (refcount == 0), the
   extra free_extent_buffer() will cause above BUG_ON().

[FIX]
Here we fix it through a more comprehensive method, instead of simply
verifying log_root_transid, here we just don't pollute eb cache when
reading sys chunk array.

So that we won't have an eb cache [64K, 68K), and will error out at
logical mapping phase.

Issue: kdave#207
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
For one fuzzed image, `btrfs check` both modes will trigger a BUG_ON():
  Opening filesystem to check...
  Checking filesystem on issue_208.raw
  UUID: 99e50868-0bda-4d89-b0e4-7e8560312ef9
  [1/7] checking root items
  [2/7] checking extents
  ctree.h:320: btrfs_chunk_item_size: BUG_ON `num_stripes == 0` triggered, value 1
  btrfs(+0x2f712)[0x55829afa6712]
  btrfs(+0x322e5)[0x55829afa92e5]
  btrfs(+0x6892a)[0x55829afdf92a]
  btrfs(+0x69099)[0x55829afe0099]
  btrfs(+0x69c68)[0x55829afe0c68]
  btrfs(+0x6dc27)[0x55829afe4c27]
  btrfs(main+0x94)[0x55829af8b0c4]
  /usr/lib/libc.so.6(__libc_start_main+0xf3)[0x7f3edc715ee3]
  btrfs(_start+0x2e)[0x55829af8b35e]

[CAUSE]
The fuzzed image has an invalid chunk item in chunk tree:
  item 1 key (FIRST_CHUNK_TREE CHUNK_ITEM 13631488) itemoff 16105 itemsize 80
  invalid num_stripes: 0

Which triggers that BUG_ON().

[FIX]
Here we enhance the verification of btrfs_check_chunk_valid(), to check
the num_stripes and item size.

Issue: kdave#208
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
For a fuzzed image, `btrfs check` both modes trigger BUG_ON():

  Opening filesystem to check...
  volumes.c:1795: btrfs_chunk_readonly: BUG_ON `!ce` triggered, value 1
  btrfs(+0x2f712)[0x557beff3b712]
  btrfs(+0x32059)[0x557beff3e059]
  btrfs(btrfs_read_block_groups+0x282)[0x557beff30972]
  btrfs(btrfs_setup_all_roots+0x3f3)[0x557beff2ab23]
  btrfs(+0x1ef53)[0x557beff2af53]
  btrfs(open_ctree_fs_info+0x90)[0x557beff2b1a0]
  btrfs(+0x6d3f9)[0x557beff793f9]
  btrfs(main+0x94)[0x557beff200c4]
  /usr/lib/libc.so.6(__libc_start_main+0xf3)[0x7f623ac97ee3]
  btrfs(_start+0x2e)[0x557beff2035e]

[CAUSE]
The fuzzed image has a bad extent tree:

        item 0 key (288230376165343232 BLOCK_GROUP_ITEM 8388608) itemoff 16259 itemsize 24
                block group used 0 chunk_objectid 256 flags DATA

There is no corresponding chunk for the block group.

In then we hit the BUG_ON(), which expects chunk mapping for
btrfs_chunk_readonly().

[FIX]
Remove that BUG_ON() with proper error handling, and make
btrfs_read_block_groups() handle the -ENOENT error from
read_one_block_group() to continue.

So one corrupted block group item won't screw up the remaining block
group items.

Issue: kdave#209
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
For certain btrfs images, a BUG_ON() can be triggered at open_ctree()
time:

  Opening filesystem to check...
  extent_io.c:158: insert_state: BUG_ON `end < start` triggered, value 1
  btrfs(+0x2de57)[0x560c4d7cfe57]
  btrfs(+0x2e210)[0x560c4d7d0210]
  btrfs(set_extent_bits+0x254)[0x560c4d7d0854]
  btrfs(exclude_super_stripes+0xbf)[0x560c4d7c65ff]
  btrfs(btrfs_read_block_groups+0x29d)[0x560c4d7c698d]
  btrfs(btrfs_setup_all_roots+0x3f3)[0x560c4d7c0b23]
  btrfs(+0x1ef53)[0x560c4d7c0f53]
  btrfs(open_ctree_fs_info+0x90)[0x560c4d7c11a0]
  btrfs(+0x6d3f9)[0x560c4d80f3f9]
  btrfs(main+0x94)[0x560c4d7b60c4]
  /usr/lib/libc.so.6(__libc_start_main+0xf3)[0x7fd189773ee3]
  btrfs(_start+0x2e)[0x560c4d7b635e]

[CAUSE]
This is caused by passing @len == 0 to add_excluded_extent(), which
means one reverse mapped range is just out of the block group range,
normally means a by-one error.

[FIX]
Fix the boundary check on the reserve mapped range against block group
range.  If a reverse mapped super block starts at the end of the block
group, it doesn't cover so we don't need to bother the case.

Issue: kdave#210
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There are two ASSERT() with completely the same check introduced in
commit f7717d8 ("btrfs-progs: Remove fsid/metdata_uuid fields
from fs_info").

Just remove it.

Fixes: f7717d8 ("btrfs-progs: Remove fsid/metdata_uuid fields from fs_info")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
…ctory

Ran misc-tests/038 in /root/btrfs-progs:

  make test-misc TEST=038\*
      [TEST]   misc-tests.sh
      [TEST/misc]   038-backup-root-corruption
  ./test.sh: line 33: [: bytenr=65536,: integer expression expected
  Backup slot 2 is not in use
  test failed for case 038-backup-root-corruption
  make: *** [Makefile:401: test-misc] Error 1

It's caused by the wrong line filtered by
$(dump_super | grep root | head -n1 | awk '{print $2}').

The command $(dump-super | grep root) outputs:

  superblock: bytenr=65536, device=/root/btrfs-progs/tests/test.img
  root                    30605312
  chunk_root_generation   5
  root_level              0
  chunk_root              22036480
  chunk_root_level        0
  log_root                0
  log_root_transid        0
  log_root_level          0
  root_dir                6
  backup_roots[4]:

Here
"superblock: bytenr=65536, device=/root/btrfs-progs/tests/test.img" is
selected but not the line "root                    30605312".

Restricting the awk rule can fix it.

Fixes: 78a3831 ("btrfs-progs: tests: Test backup root retention logic")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When kernel returns ENOTCONN after the user tries to create a qgroup on
a subvolume without quota enabled, present a meaningful message to the
user. Kernels before 5.5 return EINVAL for that.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When using `btrfs check --init-extent-tree`, there is a pretty high
chance that the result fs can't pass tree-checker:

  BTRFS critical (device dm-3): corrupt leaf: block=5390336 slot=149 extent bytenr=20115456 len=4096 invalid generation, have 16384 expect (0, 360]
  BTRFS error (device dm-3): block=5390336 read time tree block corruption detected
  BTRFS error (device dm-3): failed to read block groups: -5
  BTRFS error (device dm-3): open_ctree failed

[CAUSE]
The result fs has a pretty screwed up EXTENT_ITEMs for data extents:

        item 148 key (20111360 EXTENT_ITEM 4096) itemoff 8777 itemsize 53
                refs 1 gen 0 flags DATA
                extent data backref root FS_TREE objectid 841 offset 0 count 1
        item 149 key (20115456 EXTENT_ITEM 4096) itemoff 8724 itemsize 53
                refs 1 gen 16384 flags DATA
                extent data backref root FS_TREE objectid 906 offset 0 count 1

Kernel tree-checker will accept 0 generation, but that 16384 generation
is definitely going to trigger the alarm.

Looking into the code, it's add_extent_rec_nolookup() allocating a new
extent_rec, but not copying all members from parameter @tmpl, resulting
generation not properly initialized.

[FIX]
Just copy tmpl->generation in add_extent_rec_nolookup(). And since all
call sites have set all members of @tmpl to 0 before
add_extent_rec_nolookup(), we shouldn't get garbage values.

For the 0 generation problem, it will be solved in another patch.

Issue: kdave#225 (Not the initial report, but extent tree rebuild result)
Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
…ents

[BUG]
When using `btrfs check --init-extent-tree`, we will create incorrect
generation number for data extents in extent tree:

        item 10 key (13631488 EXTENT_ITEM 1048576) itemoff 15828 itemsize 53
                refs 1 gen 0 flags DATA
                extent data backref root FS_TREE objectid 257 offset 0 count 1

[CAUSE]
Since data extent generation is not as obvious as tree blocks, which has
header containing its generations, so for data extents, its
extent_record::generation is not really updated, resulting such 0
generation.

[FIX]
To get generation of a data extent, there are two sources we can rely:
- EXTENT_ITEM
  There is always a btrfs_extent_item::generation can be utilized.
  Although this is not possible for --init-extent-tree use case.

- EXTENT_DATA
  We have btrfs_file_extent_item::generation for regular and
  preallocated data extents.
  Since --init-extent-tree will go through subvolume trees, this would
  be the main source for extent data generation.

Then we only need to make add_data_backref() to accept @gen parameter,
and pass it down to extent_record structure.

And for the final extent item generation update, here we add extra
fallback values, if we can't find FILE_EXTENT items.
In that case, we just fall back to current transid.

With this modification, recreated data EXTENT_ITEM now has correct
generation number:

        item 10 key (13631488 EXTENT_ITEM 1048576) itemoff 15828 itemsize 53
                refs 1 gen 6 flags DATA
                extent data backref root FS_TREE objectid 257 offset 0 count 1

Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
… generation

Since older `btrfs check --init-extent-tree` could cause invalid
EXTENT_ITEM generation for data extents, add such check to lowmem mode
check.

Also add such generation check to file extents too.

For the repair part, I don't have any good idea yet. So affected user
may depend on --init-extent-tree again.

Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Much like what we have done in lowmem mode, also detect and report
invalid extent generation in original mode.

Unlike lowmem mode, we have extent_record::generation, which is the
higher number of generations in EXTENT_ITEM, EXTENT_DATA or tree block
header, so there is no need to check generations in different locations.

For repair, we still need to depend on --init-extent-tree, as directly
modifying extent items can easily cause conflicts with delayed refs,
thus it should be avoided.

Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The new image has a bad data extent item generation:

        item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16230 itemsize 53
                refs 1 gen 16384 flags DATA
                extent data backref root FS_TREE objectid 257 offset 0 count 1

Just like what older `btrfs check --init-extent-tree` could cause.

Reviewed-by: Su Yue <Damenly_Su@gmx.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Same as LOGICAL_INO, except a different magic number.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: David Sterba <dsterba@suse.com>
Update the args structure, add the flags constant and the ioctl magic
number.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: David Sterba <dsterba@suse.com>
Update the args structure, add the flags constant and the ioctl magic
number.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: David Sterba <dsterba@suse.com>
Increase the maximum buffer size to SZ_16M.

Add an option (-o) to set the ..._IGNORE_OFFSET flag.

If the buffer size is greater than 64K or the IGNORE_OFFSET option
is used, call ioctl V2; otherwise, use ioctl V1 to be compatible with
older kernels.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: David Sterba <dsterba@suse.com>
… 64K

Filesystems with nontrivial snapshots or dedupe will easily overflow
a 4K buffer.  Bump the size up to the largest size supported by the
V1 ioctl.

Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
…nd kernel requirements

Document the new options requiring the V2 ioctl and the increased
default buffer size.

Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
@qwerty123443
Copy link
Author

I should note, this is in conjunction with kdave/btrfs-devel#6

@kdave kdave force-pushed the devel branch 7 times, most recently from 7bfedf8 to cda1f14 Compare February 22, 2023 02:06
@kdave kdave force-pushed the devel branch 4 times, most recently from 63a7417 to 4fc291a Compare March 1, 2023 14:20
@kdave kdave force-pushed the devel branch 4 times, most recently from e53f194 to 405f4c7 Compare April 26, 2023 22:51
@kdave kdave force-pushed the devel branch 3 times, most recently from af681f4 to 1628b2f Compare May 10, 2023 19:47
@kdave kdave force-pushed the devel branch 2 times, most recently from a8c6f08 to 34c5de3 Compare May 22, 2023 12:32
@kdave kdave force-pushed the devel branch 5 times, most recently from 5e02de0 to adfc8a9 Compare May 26, 2023 17:53
@kdave
Copy link
Owner

kdave commented Jun 9, 2023

This namely depends on the kernel part, the remaining work is in #184.

@kdave kdave closed this Jun 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants