memset() rather than reduceIndex() #1658
Conversation
CONTROL() : like assert() but cannot be disabled. Proper separation of user contract errors (CONTROL()) and invariant verification (assert()).
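For illustration, a minimal sketch of what such an always-on check could look like (the actual CONTROL() definition in the codebase may differ): unlike assert(), it does not compile out when NDEBUG is defined.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical CONTROL() : behaves like assert(), but stays active even
 * when NDEBUG is defined, so user-contract violations are always caught. */
#define CONTROL(c) do {                                         \
        if (!(c)) {                                             \
            fprintf(stderr, "error: %s, at %s:%d\n",            \
                    #c, __FILE__, __LINE__);                    \
            abort();                                            \
        }                                                       \
    } while (0)
```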
…imit by disabling continue mode when index is close to limit.
lib/compress/zstd_compress.c (Outdated)
 * memset() will be triggered before reduceIndex().
 */
#define ZSTD_INDEXOVERFLOW_MARGIN (16 MB)
static int ZSTD_index_valid_for_continue(ZSTD_window_t w)
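For reference, here is a simplified, standalone sketch of the check excerpted above. The real ZSTD_window_t and ZSTD_CURRENT_MAX live in lib/compress/zstd_compress_internal.h; the cut-down definitions below are assumptions made only so the snippet compiles on its own.

```c
#include <stddef.h>

#define MB *(1u << 20)
#define ZSTD_INDEXOVERFLOW_MARGIN (16 MB)
#define ZSTD_CURRENT_MAX (3u << 29)        /* illustrative stand-in value */

typedef struct {
    const unsigned char* base;     /* all regular indexes are relative to this position */
    const unsigned char* nextSrc;  /* next position to be indexed */
} ZSTD_window_t;                   /* reduced to the two relevant fields */

/* continue mode stays allowed only while the current index is at least one
 * margin below the overflow limit, so a fresh compression (and its memset())
 * always happens before reduceIndex() would become necessary. */
static int ZSTD_index_valid_for_continue(ZSTD_window_t w)
{
    size_t const currentIndex = (size_t)(w.nextSrc - w.base);
    return currentIndex < (ZSTD_CURRENT_MAX - ZSTD_INDEXOVERFLOW_MARGIN);
}
```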
nit: ZSTD_indexValidForContinue() is more in line with the naming convention.
Also : minor speed optimization : shortcut to ZSTD_reset_matchState() rather than the full reset process. It still needs to be completed with ZSTD_continueCCtx() for proper initialization. Also : changed the position of the LDM hash tables in the context, so that the "regular" hash tables sit at a predictable position, hence allowing the shortcut to ZSTD_reset_matchState() without complex conditions.
This last update implements an optimization. It seems to work fine so far, but I'm not against a second look. In particular, I would like to be completely sure that it works well in combination with LDM. @terrelln likely has a better understanding of this part.
It should be fine. LDM manages its own window and indices, so this change shouldn't impact it. I'll take a closer look in a bit, just to be sure.
Sorry, this fell off my radar. LGTM!
I recently received a user performance report stating that reduceIndex() can require a non-negligible delay to perform its job, especially when applied to a large context.

reduceIndex() is required in 2 scenarios : when a single job grows large enough to reach the index limit by itself, and in continue mode as a consequence of constantly keeping the same parameters.

For the first case, there's nothing new.
For the second case, the issue is that reduceIndex() is a measurable load compared to a single compression job. The user reported delays in the ~80ms range. I could find similar delays on my laptop with llvm, in the 20-30ms range, when using large tables (~30 MB). The user's results are exacerbated by the use of Visual Studio, which is worse at auto-vectorization (reduceIndex() performance is tied to auto-vectorization).

The thing is, for such a case, we do not actually need a reduceIndex() operation : a simpler memset() would work just as well. And of course, a big difference is that memset() is actually fast (I could measure up to 10x faster), and performance differences between compilers are unlikely to be "large".
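To get a rough sense of that gap, here is a standalone micro-benchmark sketch (not zstd code): it times a reduceIndex()-style correction pass over ~30 MB of 32-bit indices against a plain memset() of the same table. The table size and reducer value are arbitrary choices for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <time.h>

#define TABLE_BYTES (30u << 20)                       /* ~30 MB of tables */
#define NB_ENTRIES  (TABLE_BYTES / sizeof(uint32_t))

int main(void)
{
    uint32_t* const table = malloc(TABLE_BYTES);
    uint32_t const reducer = 1u << 20;                /* arbitrary correction value */
    size_t i;
    clock_t start, duration;
    if (table == NULL) return 1;
    for (i = 0; i < NB_ENTRIES; i++) table[i] = (uint32_t)i;

    start = clock();
    for (i = 0; i < NB_ENTRIES; i++)                  /* reduceIndex()-style pass */
        table[i] = (table[i] < reducer) ? 0 : table[i] - reducer;
    duration = clock() - start;
    printf("reduction pass : %6.1f ms (sample %u)\n",
           1000.0 * duration / CLOCKS_PER_SEC, table[NB_ENTRIES - 1]);

    start = clock();
    memset(table, 0, TABLE_BYTES);                    /* what a fresh compression does */
    duration = clock() - start;
    printf("memset         : %6.1f ms\n", 1000.0 * duration / CLOCKS_PER_SEC);

    free(table);
    return 0;
}
```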
This patch tries to favor memset() over reduceIndex() by testing currentIndex when starting a new compression (detected by triggering ZSTD_resetCCtx_internal()). It will disallow continue mode when currentIndex is considered "too close" to the limit.

The limit was already defined in zstd_compress_internal.h as ZSTD_CURRENT_MAX, and "too close" has been arbitrarily defined as < ZSTD_INDEXOVERFLOW_MARGIN (== 16 MB).
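For context, the workload shape that benefits looks roughly like this with the public API: one context, many jobs, parameters never changing, which is exactly what keeps the compressor on its continue path and lets indices grow across jobs. The buffer sizes, payload, compression level, and iteration count below are arbitrary illustrations.

```c
#include <string.h>
#include <zstd.h>

int main(void)
{
    static char src[64 * 1024];           /* pretend payload, reused for every job */
    static char dst[128 * 1024];          /* generously sized for the compressed output */
    ZSTD_CCtx* const cctx = ZSTD_createCCtx();
    int job;
    if (cctx == NULL) return 1;
    memset(src, 'z', sizeof(src));

    /* Same context, same level, job after job : indices keep growing until
     * they approach ZSTD_CURRENT_MAX; with this patch, the next job then
     * starts from freshly memset() tables instead of paying for reduceIndex(). */
    for (job = 0; job < 100000; job++) {
        size_t const csize = ZSTD_compressCCtx(cctx, dst, sizeof(dst),
                                               src, sizeof(src), 19);
        if (ZSTD_isError(csize)) { ZSTD_freeCCtx(cctx); return 1; }
    }
    ZSTD_freeCCtx(cctx);
    return 0;
}
```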