
Allow setting zstd compression level with btrfs defrag #184

Open
ghost opened this issue Jun 20, 2019 · 28 comments
Labels: enhancement, kernel (something in kernel has to be done too)

Comments

@ghost

ghost commented Jun 20, 2019

Currently btrfs defrag -c zstd uses a fixed compression level. It would be useful to be able to set it manually, the same way we can in mount options: zstd:1-15.

Thanks.

@denharad

This would be great for forcing greater compression on large static files without archiving them.

@ghost
Author

ghost commented Jun 29, 2019

This is my thinking. Stale files could be compressed at a higher level without much loss of performance in case they ever need to be accessed. There are other use-cases too, I'm sure.

@kdave
Owner

kdave commented Jul 2, 2019

This needs an enhancement of the ioctl interface: there's spare space in the data structure to specify the level, but it needs to be properly handled in the kernel.
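
For context, the uapi structure in question looked roughly like this at the time, with the spare space at the end (abridged excerpt; see the full patch later in this thread for the authoritative layout):

/* include/uapi/linux/btrfs.h (abridged) */
struct btrfs_ioctl_defrag_range_args {
	__u64 start;		/* start of the defrag operation */
	__u64 len;		/* number of bytes to defrag, (u64)-1 for everything */
	__u64 flags;		/* e.g. BTRFS_DEFRAG_RANGE_COMPRESS */
	__u32 extent_thresh;	/* extents bigger than this count as already defragged */
	__u32 compress_type;	/* which compression method to use, zlib if unspecified */
	__u32 unused[4];	/* spare for later -- room for a level field */
};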

@qwerty123443

Is this being worked on? Otherwise I'd like to attempt to implement this

@qwerty123443

As visible above, the code is now written. I have no clue if I'm putting the pull request in the right place, so please let me know if I should rebase or merge into another repo.

@inodentry

Any progress on this? AFAIK kernel (and hence, btrfs) development does not typically happen through GitHub. What would it take for the patches from the above pull requests to be merged into upstream?

@marcosps
Contributor

@qwerty123443 I believe it would be better if you could send the patches through the mailing list :)

@qwerty123443

Ah yes, I've kind of forgotten about this. I'll take a look at the mailing list soon (tm). There is still some refinement to be done, and at the moment I don't have the time to fix it up properly. For anyone reading this: feel free to use the code I've written, change it, and submit it to the linux/btrfs mailing list.

@kdave added the kernel (something in kernel has to be done too) label on Apr 8, 2020
@RubenKelevra

I was wondering why the compression level is capped at 15 for zstd. They actually go up to 22 (when ultra is on).

Would be nice to be able to compress files with more effort, if you don't bother about the speed.

@qwerty123443

qwerty123443 commented Apr 12, 2020

I've submitted a patch to the mailing list, and I hope it'll be merged soon.
@RubenKelevra It's been a while since I've looked into it, but if I recall correctly, the kernel does not support levels higher than 15 because of the way the workspaces are (were?) set up. Definitely not sure about this one though.

The patch can be found here

@automorphism88
Contributor

I was wondering why the compression level is capped at 15 for zstd. They actually go up to 22 (when ultra is on).

Would be nice to be able to compress files with more effort, if you don't bother about the speed.

Just a guess, but maybe it has to do with the fact that btrfs compresses files in 128 KiB "ranges", where each 128 KiB extent is effectively treated as a separate file for compression purposes. This allows for fast random access to compressed files, since only the 128 KiB extent you want needs to be decompressed, but probably negates most or all of the benefits of higher zstd levels (e.g. larger window). You could test that by compressing some 128 KiB sample files with the command-line zstd and seeing if there's any difference between the ratio you get at 15 vs 22.

The 128 KiB ranges' compressed size is also effectively rounded up to a multiple of 4 KiB, since each is stored as a separate extent on disk. A small difference in compression ratio won't even matter unless it's enough to make that extent one 4 KiB block smaller.

Bottom line, filesystem-level compression inevitably involves some tradeoffs of this nature, and if you don't need the benefits that it offers, like fast random access, then just use file-level compression with whatever tool you like. Zstd also supports negative compression levels now with --fast, for even faster compression along the lines of lz4, but we can't specify those with btrfs either. That might also be useful, but in my experience zstd:1 is fast enough even with a PCIe 4 SSD.
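
As a quick check of that guess, here is a minimal sketch using libzstd directly instead of the zstd CLI (assumes libzstd and its headers are installed; compressing a single 128 KiB chunk only approximates what btrfs does, and the build line is just an example):

/* level_test.c -- compare zstd ratios on the first 128 KiB of a file.
 * Build: cc level_test.c -lzstd -o level_test
 */
#include <stdio.h>
#include <stdlib.h>
#include <zstd.h>

int main(int argc, char **argv)
{
	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	FILE *f = fopen(argv[1], "rb");
	if (!f) { perror("fopen"); return 1; }

	static char src[128 * 1024];	/* one btrfs-sized "range" */
	size_t n = fread(src, 1, sizeof(src), f);
	fclose(f);

	size_t cap = ZSTD_compressBound(n);
	char *dst = malloc(cap);
	if (!dst) return 1;

	int levels[] = { 1, 15, 22 };
	for (int i = 0; i < 3; i++) {
		size_t out = ZSTD_compress(dst, cap, src, n, levels[i]);
		if (ZSTD_isError(out)) {
			fprintf(stderr, "level %d: %s\n",
				levels[i], ZSTD_getErrorName(out));
			continue;
		}
		printf("level %2d: %zu -> %zu bytes (%.1f%%)\n",
		       levels[i], n, out, 100.0 * out / n);
	}
	free(dst);
	return 0;
}

If 15 and 22 land in the same 4 KiB-rounded size for typical data, the higher cap is moot for btrfs anyway.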

@jfcg

jfcg commented Mar 16, 2022

Just tried btrfs filesystem defragment -r -czstd:4 and it did not work :/
Would love to have this.

@CrawX

CrawX commented Mar 28, 2022

@qwerty123443 Your patch hasn't been merged into the kernel, right? I couldn't find the mailing list thread via Google, any way we can track the status? Thanks for putting the work in anyway!

@qwerty123443

@qwerty123443 Your patch hasn't been merged into the kernel, right?

Correct, life happened and I got distracted by other things unfortunately. I've written a new patch that should be a lot closer to being accepted on the mailing list, though I don't have the time to polish it into a proper patch and submit it.

The attached code compiles but isn't tested; it modifies the btrfs_ioctl_defrag_range_args struct so that you can pass a compress_level along with the compress_type, and I think it should work out of the box (given you build a slightly modified version of btrfs-progs that makes use of compress_level).

If anyone reads this and does have the time and knowledge to finalize this, feel welcome to do it :) (most of the underlying work is already done, so it shouldn't be too much work)

I couldn't find the mailing list thread via Google, any way we can track the status? Thanks for putting the work in anyway!

That's a good question; I'm not sure, to be honest.

Patch can be found here: download

From 55e07d6dd155269d4b8b8bdcb03716ca5df815cb Mon Sep 17 00:00:00 2001
Date: Tue, 29 Mar 2022 11:50:39 +0200
Subject: [PATCH] fs/btrfs: Allow for a per-extent compression level

---
 fs/btrfs/inode.c           | 34 ++++++++++++++++++++++++----------
 fs/btrfs/ioctl.c           |  6 +++++-
 include/uapi/linux/btrfs.h |  9 ++++++++-
 3 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index aa0a60ee26cb..9f751076a00e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -597,6 +597,7 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
 	int i;
 	int will_compress;
 	int compress_type = fs_info->compress_type;
+	int compress_level = 0;
 	int compressed_extents = 0;
 	int redirty = 0;

@@ -675,8 +676,10 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
 			goto cont;
 		}

-		if (BTRFS_I(inode)->defrag_compress)
-			compress_type = BTRFS_I(inode)->defrag_compress;
+		if (BTRFS_I(inode)->defrag_compress){
+			compress_type = BTRFS_I(inode)->defrag_compress & 0xF;
+			compress_level = BTRFS_I(inode)->defrag_compress >> 4;
+		}
 		else if (BTRFS_I(inode)->prop_compress)
 			compress_type = BTRFS_I(inode)->prop_compress;

@@ -697,14 +700,25 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
 			redirty = 1;
 		}

-		/* Compression level is applied here and only here */
-		ret = btrfs_compress_pages(
-			compress_type | (fs_info->compress_level << 4),
-					   inode->i_mapping, start,
-					   pages,
-					   &nr_pages,
-					   &total_in,
-					   &total_compressed);
+		/* Compression level is applied here and only here
+		 * Takes the compression level from the inode, if set
+		 */
+		if (compress_level)
+			ret = btrfs_compress_pages(
+				compress_type | compress_level << 4,
+				inode->i_mapping, start,
+				pages,
+				&nr_pages,
+				&total_in,
+				&total_compressed);
+		else
+			ret = btrfs_compress_pages(
+				compress_type | fs_info->compress_level << 4,
+				inode->i_mapping, start,
+				pages,
+				&nr_pages,
+				&total_in,
+				&total_compressed);

 		if (!ret) {
 			unsigned long offset = offset_in_page(total_compressed);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 238cee5b5254..7534e13c487d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1777,6 +1777,7 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 	bool do_compress = range->flags & BTRFS_DEFRAG_RANGE_COMPRESS;
 	bool ra_allocated = false;
 	int compress_type = BTRFS_COMPRESS_ZLIB;
+	int compress_level = 0;
 	int ret = 0;
 	u32 extent_thresh = range->extent_thresh;
 	pgoff_t start_index;
@@ -1792,6 +1793,8 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 			return -EINVAL;
 		if (range->compress_type)
 			compress_type = range->compress_type;
+		if (range->compress_level)
+			compress_level = range->compress_level;
 	}

 	if (extent_thresh == 0)
@@ -1855,7 +1858,8 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 			break;
 		}
 		if (do_compress)
-			BTRFS_I(inode)->defrag_compress = compress_type;
+			BTRFS_I(inode)->defrag_compress = compress_type |
+							  compress_level << 4;
 		ret = defrag_one_cluster(BTRFS_I(inode), ra, cur,
 				cluster_end + 1 - cur, extent_thresh,
 				newer_than, do_compress, &sectors_defragged,
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index d956b2993970..b330d0896a5a 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -605,8 +605,15 @@ struct btrfs_ioctl_defrag_range_args {
 	 */
 	__u32 compress_type;

+	/*
+	 * which compression strength to use if turning on compression
+	 * for this defrag operation.  If unspecified, the inode's value
+	 * will be used
+	 */
+	__u32 compress_level;
+
 	/* spare for later */
-	__u32 unused[4];
+	__u32 unused[3];
 };


--
2.30.2
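
For anyone who wants to experiment with the patch, a hypothetical userspace caller might look like the following (it assumes a kernel built with the patch above and a uapi header carrying the new compress_level field; BTRFS_COMPRESS_ZSTD is defined locally because userspace normally gets the compression type values from btrfs-progs rather than linux/btrfs.h):

/* defrag_zstd.c -- defrag one file, recompressing with an explicit zstd level.
 * Works only against a kernel carrying the patch above.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/btrfs.h>

#define BTRFS_COMPRESS_ZSTD 3	/* matches the kernel's compression type enum */

int main(int argc, char **argv)
{
	if (argc != 3) {
		fprintf(stderr, "usage: %s <file> <level>\n", argv[0]);
		return 1;
	}
	int fd = open(argv[1], O_RDWR);
	if (fd < 0) { perror("open"); return 1; }

	struct btrfs_ioctl_defrag_range_args args;
	memset(&args, 0, sizeof(args));
	args.len = (__u64)-1;				/* whole file */
	args.flags = BTRFS_DEFRAG_RANGE_COMPRESS;
	args.compress_type = BTRFS_COMPRESS_ZSTD;
	args.compress_level = atoi(argv[2]);		/* field added by the patch */

	if (ioctl(fd, BTRFS_IOC_DEFRAG_RANGE, &args) < 0)
		perror("BTRFS_IOC_DEFRAG_RANGE");
	close(fd);
	return 0;
}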

@qwerty123443

Would be nice to be able to compress files with more effort, if you don't bother about the speed.

As I expected, the compression levels are indeed limited to the following:
LZO: 1
Zlib: 9
ZSTD: 15

@Dalcinrafa

I wonder when it will hit the repos? This feature would be a game changer for folks like me who run zstd on an NVMe drive.

@John-Gee

I wonder when it will hit the repos? This feature would be a game changer for folks like me who run zstd on an NVMe drive.

Why would it be a game changer? You can already remount with a different compression and run defrag on the folder. The feature here would just make it easier to do.

@joshuachris2001

joshuachris2001 commented Oct 18, 2022

Wouldn't that be inefficient? It would be redundant to make a mount profile for each file to get custom/forced compression, when btrfs defrag could change the attribute and compress at a certain level automatically. Also, you can't unmount a drive that is actively having data added and read, just for one stale file or one folder of files.

[bump]

@kdave
Owner

kdave commented Oct 21, 2022

The patch makes it work in the kernel but lacks a few things, namely backward compatibility checks; the rest are basically coding style issues.

  • extending the structure must be done by adding a new flag, and only then is the level parsed
  • the level must be verified against the max for each compression type
  • grabbing another u32 is wasteful when we already have a u32 for compression and the values only go from 0 to 4, so we can use the upper byte for extensions: the level, or telling defrag to 'nocompress'
  • type + level need to be wrapped in helpers; the format with shift by 4 is arbitrary (a sketch follows below)
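
For illustration only, such helpers might look like the following sketch, with the level in the upper byte of the existing u32 (the names, mask, and shift are hypothetical; kdave's comment doesn't prescribe an exact layout):

#include <linux/types.h>

/* Hypothetical packing helpers: compression type stays in the low byte,
 * level goes in the high byte, so old callers (level == 0) keep working.
 */
#define BTRFS_DEFRAG_COMPRESS_TYPE_MASK		0x000000ffU
#define BTRFS_DEFRAG_COMPRESS_LEVEL_SHIFT	24

static inline __u32 btrfs_defrag_compress_pack(__u32 type, __u32 level)
{
	return (type & BTRFS_DEFRAG_COMPRESS_TYPE_MASK) |
	       (level << BTRFS_DEFRAG_COMPRESS_LEVEL_SHIFT);
}

static inline __u32 btrfs_defrag_compress_type(__u32 packed)
{
	return packed & BTRFS_DEFRAG_COMPRESS_TYPE_MASK;
}

static inline __u32 btrfs_defrag_compress_level(__u32 packed)
{
	return packed >> BTRFS_DEFRAG_COMPRESS_LEVEL_SHIFT;
}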

@Artoria2e5

Artoria2e5 commented Jun 14, 2023

Stale files could be compressed higher without much loss of performance in case they need to be accessed.
Would be nice to be able to compress files with more effort, if you don't bother about the speed.

There are plenty of compression algorithms that don't noticeably affect decomp speed on higher levels, and zstd is definitely one of them (see https://openzfs.org/w/images/b/b3/03-OpenZFS_2017_-_ZStandard_in_ZFS.pdf slide 9). Falloff beyond 19 is somewhat expected, but it might not actually translate to small block-size use (because different parameters are used).

Just a guess, but maybe it has to do with the fact that btrfs compresses files in 128 KiB "ranges",

https://github.com/facebook/zstd/blob/dev/lib/compress/clevels.h shows that unique levels are defined for 128 KiB blocks up to 22 (despite whatever snake-oil factor that can be there). Plus the fast ones.

Zstd also supports negative compression levels now with --fast, for even faster compression along the lines of lz4, but we can't specify those with btrfs either.

Agree that the even faster stuff can be useful to match SSD speed: some compression is less writing than none. Yes, even at the cost of doing no Huffman on the negative levels at all: LZ4 does that and ended up fine.

@qwerty123443 patch u32

Uhhh, make it signed maybe?

@kdave squeeze into compress_type

We need about 6 bits if we want to go from negative to 22, or 5 if you are happy to squeeze and map it to {-10..-1} {1..22}. Going too far into negative territory is no good anyway, as one would expect. (Still, why bother with the bit-clutching? Just throw a whole byte in. Or reuse the << 4 format.)

The good news is that kernel zstd is new enough to have negative levels: ZSTD_minCLevel is there!
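
For the squeezed mapping above, a hypothetical offset encoding for signed levels could look like this (purely illustrative; reserving code 0 for "unspecified" makes it 33 codes, i.e. the 6-bit case):

/* Map a signed zstd level in {-10..-1, 1..22} to a small code and back.
 * Code 0 is reserved to mean "unspecified / use the default".
 */
static inline unsigned int zstd_level_encode(int level)
{
	if (level >= 1 && level <= 22)
		return (unsigned int)level;		/* 1..22 -> 1..22 */
	if (level <= -1 && level >= -10)
		return (unsigned int)(22 - level);	/* -1..-10 -> 23..32 */
	return 0;					/* unspecified */
}

static inline int zstd_level_decode(unsigned int code)
{
	if (code >= 1 && code <= 22)
		return (int)code;
	if (code >= 23 && code <= 32)
		return 22 - (int)code;			/* 23..32 -> -1..-10 */
	return 0;
}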

@SteavenGamerYT

please put that

1 similar comment
@xe5700

xe5700 commented May 18, 2024

please put that

@kdave
Owner

kdave commented May 18, 2024

I'm preparing an update to defrag, there are some enhancements that have accumulated over time (https://github.com/kdave/drafts/blob/master/btrfs/defrag-v2.txt).

@Atemu

Atemu commented May 18, 2024

From the UX perspective, I currently see two issues in addition to this issue's request:

  • Too many operations that achieve very different user goals are loaded into a single command
  • It's currently not obvious how to do a subvol metadata defrag, and you always get a warning despite it being a perfectly valid thing to do

I've put a little thought into how I'd like to use the defrag operation and have come up with the following design which turns each use-case for defrag into a separate explicit sub-command:

  • defragment file [--recursive] <files|directories>: Simply defrag one or multiple files or contents of directories (alternate name ideas: data content)
  • defragment metadata [--subvolume|--directory] <directories>: Defrag of metadata for the directory or the entire subvol containing the dir. This would obviously not print a warning as the user explicitly asks for metadata to be defragmented here rather than content.
  • defragment compress [--compressor|--level|--recursive|--sticky] <files|directories>: Force (re-)compression of file or directory contents. Perhaps the "force" part could even be optional, only re-compressing parts that were already compressed?

Technically file and compress would both achieve defrag of the files and the man page should mention that but I think it'd be clearer and easier to use if they were separate commands.

Perhaps all of these sub-commands (or just compress?) could even be stand-alone filesystem sub-commands. This could particularly aid the discoverability of the (re-)compression operation which I've always perceived as quite a bit more common of a need than file content defrag and their connection is not directly obvious to someone unfamiliar with the implementation.

@Atemu

Atemu commented May 18, 2024

Hm, on a second thought, the --recursive parameter is probably redundant in this design as specifying a directory can only ever mean defragmenting its content recursively because metadata defrag would be a separate command.

@KweezyCode

Is it implemented? I need this too.

@EkriirkE

I would be happy if it just acknowledged the compress property. E.g. "property get" shows "compression=zstd:15", but after just doing "defrag -c" (no algorithm) compsize reveals it uses zlib. If I use "defrag -czstd" it uses zstd, but at a worse compression level. For now my workaround is to temporarily "mount -oremount,compress-force=zstd:15" then run a "defrag -czstd" and it seems to use zstd with the higher compression level.

@mkeedlinger

For now my workaround is to temporarily "mount -oremount,compress-force=zstd:15" then run a "defrag -czstd" and it seems to use zstd with the higher compression level.

Interesting work-around! Probably a decent solution as long as a performance degradation for other workloads is acceptable (since I think this would affect other writes unrelated to your intended defrag).

I think I may need to read up on the exact implications a remount has; I'm not sure if there are any downsides to doing one.
