Recent test on the compressor comparison #92

Closed
xuchuanyin opened this issue Sep 14, 2018 · 5 comments

xuchuanyin commented Sep 14, 2018

Hi all, yesterday I ran a compressor comparison using the benchmark tool provided in our test code. The table below shows part of the results; a more detailed log can be found at the end of this comment.

| comparison | compress (speed ratio) | decompress (speed ratio) |
| --- | --- | --- |
| airlift-snappy / xerial-snappy | 2.294 | 0.706 |
| airlift-lz4 / jpountz-lz4 | 0.86 | 1.07 |
| airlift-lzo / hadoop-lzo | 1.92 | 2.6 |
| airlift-zstd / luben-zstd | 0.998 | 1.08 |

(A value above 1 means the airlift implementation was faster than its JNI counterpart.)

Any comments are welcome.

The full table is here: compress_log.xlsx

The full runlog of the benchmark is here: compressor.log
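
For reference, the airlift side of each pair is exercised through aircompressor's block codec API, something like this (a minimal sketch using the `io.airlift.compress` Compressor/Decompressor interfaces as shown in the project README; the sample input and buffer handling here are illustrative, not the actual benchmark harness):

```java
import io.airlift.compress.Compressor;
import io.airlift.compress.Decompressor;
import io.airlift.compress.snappy.SnappyCompressor;
import io.airlift.compress.snappy.SnappyDecompressor;

public class SnappyRoundTrip {
    public static void main(String[] args) {
        byte[] input = "some repetitive sample data, sample data, sample data".getBytes();

        // Compress: size the output buffer with maxCompressedLength, then compress.
        Compressor compressor = new SnappyCompressor();
        byte[] compressed = new byte[compressor.maxCompressedLength(input.length)];
        int compressedSize = compressor.compress(input, 0, input.length, compressed, 0, compressed.length);

        // Decompress back into a buffer of the original (known) size.
        Decompressor decompressor = new SnappyDecompressor();
        byte[] restored = new byte[input.length];
        int restoredSize = decompressor.decompress(compressed, 0, compressedSize, restored, 0, restored.length);

        System.out.printf("in=%d compressed=%d out=%d%n", input.length, compressedSize, restoredSize);
    }
}
```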

dain (Member) commented Sep 14, 2018

Is there something specific you are looking for comments on?

xuchuanyin (Author) commented:

I'm wondering whether the test result is expected, since this project claims that

> it is faster than other implementations by 10%~40%

But for snappy decompression, it is about 30% slower.

dain (Member) commented Sep 15, 2018

After a quick look, I'd guess that either there is a regression/bug in aircompressor, the native snappy decompressor got a lot better, or the native snappy decompressor is running without bounds checks (very unsafe). BTW, we switched all of our uses from Snappy to LZ4, because LZ4 was better in all of our use cases.

xuchuanyin (Author) commented Sep 15, 2018

Hmm... the test also showed that the airlift version of LZ4 is not obviously better than the JNI version.

dain (Member) commented Sep 18, 2018

Ah, I think I understand now. I believe you are asking the question "why would I use this project when I can use the JNI wrappers?" We created this project to avoid JNI because of usability and portability issues and the effect it has on the GC.

In HotSpot-based JVMs, you have two choices for JNI operating on heap data: you can copy it to native memory, or you can lock the heap and operate on it without copying. The first mode has a computational cost, but in my mind the bigger problem is the complexity cost in the buffer/resource management. The second mode has a nasty problem: the locks prevent the GC from running regularly, which can result in early OOMs if you are running highly concurrent software like Presto with a highly concurrent GC like G1.

Another benefit of Java for compression algorithms is that the JVM can inline the compression code directly into its uses, which can result in pretty big speedups as the compression code is adapted to the actual inlined use case. You can't see this in thin, isolated benchmarks; instead you have to test your actual uses. Finally, there are other benefits like debugging and profiling just working like normal Java code.
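
To make the two JNI modes concrete, here is a minimal Java-side sketch of the first (copy) mode. `nativeCompress` is a hypothetical stand-in for a JNI binding that only accepts direct buffers; the second (locking) mode lives on the native side as `GetPrimitiveArrayCritical`, so there is nothing to show for it in Java:

```java
import java.nio.ByteBuffer;

// Sketch of the JNI "copy" mode: heap data is copied into direct (off-heap)
// buffers before the native call, so the GC can keep moving heap objects.
public final class JniCopyModeSketch {
    // Hypothetical stand-in for a JNI binding that only accepts direct buffers.
    // A real wrapper would declare this `native`; the alternative "locking"
    // mode would instead use GetPrimitiveArrayCritical on the native side,
    // pinning the heap for the duration of the call.
    static int nativeCompress(ByteBuffer src, ByteBuffer dst) {
        dst.put(src); // placeholder: a real binding would run the codec here
        return dst.position();
    }

    static int compressViaCopy(byte[] heapData, ByteBuffer nativeSrc, ByteBuffer nativeDst) {
        nativeSrc.clear();
        nativeSrc.put(heapData); // the extra copy: Java heap -> native memory
        nativeSrc.flip();
        nativeDst.clear();
        // The caller also owns sizing and reuse of both direct buffers --
        // the "complexity cost in the buffer/resource management".
        return nativeCompress(nativeSrc, nativeDst);
    }

    public static void main(String[] args) {
        ByteBuffer src = ByteBuffer.allocateDirect(1 << 16);
        ByteBuffer dst = ByteBuffer.allocateDirect(1 << 16);
        System.out.println(compressViaCopy("example payload".getBytes(), src, dst));
    }
}
```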

The original goal of this project was to create compatible compression algorithms that were on par with the performance of the JNI wrappers. What we found in the initial versions was that they were actually more performant, typically in the range listed in the readme. The performance changes over time as the JNI implementations change and as the JVM changes. For example, you ran the benchmark on Java 8, but we run on Java 10. We also likely run on different CPUs, and most importantly, the performance of compression algorithms is totally dependent on the data you feed it (take a look at the high variance across the different benchmark corpora).

If none of this appeals to you, or you don't have the same concurrency issues or portability/ergonomics concerns, use the JNI implementations. They are generally excellent.
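
For comparison, the one-shot array API of the xerial wrapper looks roughly like this (a minimal sketch; `Snappy.compress`/`Snappy.uncompress` are from the `org.xerial.snappy` snappy-java library, and the input is made up; the copy into native memory happens inside the wrapper):

```java
import java.io.IOException;
import org.xerial.snappy.Snappy;

public class XerialSnappyRoundTrip {
    public static void main(String[] args) throws IOException {
        byte[] input = "some repetitive sample data, sample data".getBytes();

        // One-shot calls operating on whole arrays.
        byte[] compressed = Snappy.compress(input);
        byte[] restored = Snappy.uncompress(compressed);

        System.out.printf("in=%d compressed=%d out=%d%n", input.length, compressed.length, restored.length);
    }
}
```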
