Reduce disk space with RocksDB compaction and compression #1326
Labels: A-rust (Area: Updates to Rust code), A-state (Area: State / database changes), C-enhancement (Category: This is an improvement)
Is your feature request related to a problem? Please describe.
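RocksDB records every file operation (such as flushes and compactions) in a MANIFEST file, then replays that history when the database is opened. Similarly, the write-ahead log, which records every write, can grow very large.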
We should limit the size of the manifest file and write-ahead log file.
https://github.com/facebook/rocksdb/wiki/Speed-Up-DB-Open#manifest-replay
RocksDB does single-threaded compaction by default, but also supports multi-threaded compaction.
https://github.com/facebook/rocksdb/wiki/Compaction
RocksDB can also open files with multiple threads:
https://github.com/facebook/rocksdb/wiki/Speed-Up-DB-Open#reading-footer-and-meta-blocks-of-all-the-sst-files
RocksDB supports block-level (database page) compression:
https://github.com/facebook/rocksdb/wiki/Compression
https://github.com/facebook/rocksdb/wiki/Rocksdb-BlockBasedTable-Format
Some of our data is incompressible (hashes, keys, proofs), but other data is compressible (heights, empty fields, duplicated data).
If the data compresses well, we'll save disk space. And if the machine has spare CPU, we'll see a speedup loading data from disk, too.
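To make the compressible versus incompressible distinction concrete, here is a minimal sketch using only the Rust standard library. It is not LZ4 or ZSTD: a toy run-length encoder stands in for a real block compressor. The 16-byte record layout (a 4-byte big-endian height followed by 12 zero bytes for an empty field) and the xorshift generator used to imitate hash bytes are both hypothetical, chosen only for illustration.

```rust
/// Toy run-length encoder: each run of equal bytes becomes a (count, byte) pair.
/// This is a stand-in for a real block compressor, not what RocksDB uses.
fn rle_encode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = data.iter().copied().peekable();
    while let Some(byte) = iter.next() {
        let mut run: u8 = 1;
        // Extend the run while the next byte matches, up to 255 bytes per run.
        while run < u8::MAX && iter.peek() == Some(&byte) {
            iter.next();
            run += 1;
        }
        out.push(run);
        out.push(byte);
    }
    out
}

/// Hypothetical record layout: 4-byte big-endian height + 12-byte empty field.
fn height_records(count: u32) -> Vec<u8> {
    let mut out = Vec::new();
    for height in 0..count {
        out.extend_from_slice(&height.to_be_bytes());
        out.extend_from_slice(&[0u8; 12]);
    }
    out
}

/// Pseudo-random bytes from a xorshift64 generator, imitating hash output.
fn hash_like_bytes(len: usize) -> Vec<u8> {
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    let mut out = Vec::with_capacity(len);
    while out.len() < len {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        out.extend_from_slice(&state.to_le_bytes());
    }
    out.truncate(len);
    out
}

fn main() {
    let heights = height_records(1000);
    let hashes = hash_like_bytes(16_000);
    println!(
        "height records: {} bytes raw, {} bytes encoded",
        heights.len(),
        rle_encode(&heights).len()
    );
    println!(
        "hash-like data: {} bytes raw, {} bytes encoded",
        hashes.len(),
        rle_encode(&hashes).len()
    );
}
```

The height records shrink because of their long zero runs, while the hash-like bytes grow under this encoder. A real compressor would leave incompressible blocks roughly the same size rather than expanding them, but the same split between the two kinds of data applies.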
Describe the solution you'd like
Limit the size of the manifest file and write-ahead log file:
options.max_manifest_file_size
options.max_total_wal_size
Multithreaded compaction:
Set options.max_background_compactions to 4.
https://github.com/facebook/rocksdb/wiki/Setup-Options-and-Basic-Tuning#other-general-options
Set options.max_file_opening_threads to 2.
https://github.com/facebook/rocksdb/wiki/Speed-Up-DB-Open#reading-footer-and-meta-blocks-of-all-the-sst-files
Lightweight compression:
Set RocksDB's options.compression to LZ4, and compare the database size from a new load.
Heavyweight compression:
Also try setting options.bottommost_compression to ZSTD, and do a comparison with lightweight and no compression.
Per-Column Family compression:
Enable compression only for column families containing Height or Block (and other data which contains runs of zeroes, or repeated patterns).
https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ#basic-readwrite
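The settings proposed above could be wired up roughly as follows. This is a sketch, assuming the `rocksdb` crate (rust-rocksdb) with its LZ4 and ZSTD features enabled, not a tested configuration. The size limits (64 MiB manifest, 1 GiB WAL) and the column family names are hypothetical placeholders. It also uses `set_max_background_jobs` instead of a direct `max_background_compactions` setter, on the assumption that recent RocksDB releases derive the compaction thread count from `max_background_jobs`.

```rust
use rocksdb::{ColumnFamilyDescriptor, DBCompressionType, Options, DB};

/// Database-wide options: file size limits and thread counts.
fn db_options() -> Options {
    let mut opts = Options::default();
    opts.create_if_missing(true);
    opts.create_missing_column_families(true);

    // Placeholder limits (assumptions): 64 MiB manifest, 1 GiB total WAL.
    opts.set_max_manifest_file_size(64 * 1024 * 1024);
    opts.set_max_total_wal_size(1024 * 1024 * 1024);

    // Assumed substitute for options.max_background_compactions = 4.
    opts.set_max_background_jobs(4);
    opts.set_max_file_opening_threads(2);
    opts
}

/// Column families with compressible data: LZ4 in general,
/// ZSTD for the bottommost level.
fn compressed_cf_options() -> Options {
    let mut opts = Options::default();
    opts.set_compression_type(DBCompressionType::Lz4);
    opts.set_bottommost_compression_type(DBCompressionType::Zstd);
    opts
}

/// Column families holding hashes, keys, or proofs: no compression.
fn uncompressed_cf_options() -> Options {
    let mut opts = Options::default();
    opts.set_compression_type(DBCompressionType::None);
    opts
}

/// Opens the database with per-column-family compression.
/// The column family names are hypothetical.
fn open_db(path: &str) -> Result<DB, rocksdb::Error> {
    let column_families = vec![
        ColumnFamilyDescriptor::new("block_by_height", compressed_cf_options()),
        ColumnFamilyDescriptor::new("height_by_hash", uncompressed_cf_options()),
    ];
    DB::open_cf_descriptors(&db_options(), path, column_families)
}
```

Loading the same data once with `compressed_cf_options` and once with `uncompressed_cf_options` for every column family would give the size comparison described above.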
Describe alternatives you've considered
Do nothing.
Compression isn't really a high priority.
Alternate compaction:
Set options.compaction_style to kCompactionStyleUniversal, and compare the database size from a new load.
Universal compaction has size limits and a double-size issue, so we don't want to use it:
https://github.com/facebook/rocksdb/wiki/Universal-Compaction#double-size-issue