Build clib options #135
Merged
t20100 added a commit to t20100/hdf5plugin that referenced this pull request on May 31, 2023:
dc05e0264 Tag open source release 1.1.10.
7b82423c5 The output buffer in DecompressBranchless is never read from and the source buffers are never written. This allows us to defer any writes to the output buffer for an arbitrary amount of time as long as the writes all occur in the proper order. When a MemCopy64 would have normally occurred we save away the source address and length. Once we reach the location of the next write to the output buffer first perform the deferred copy. This gives time for the source address calculation and length to finish before the deferred copy.
30326e5b8 Merge pull request silx-kit#150 from davemgreen:betterunalignedloads
74960e8bd Allow some buffer overwrite on literal emitting
37f375dde Add prefetch to zippy decompess,
15e2a0e13 Add "cc" clobbers to inline asm that modifies flags.
8881ba172 Improve the speed of hashing in zippy compression.
a2d219a8a Modify MemCopy64 to use AVX 32 byte copies instead of SSE2 16 byte copies on capable x86 platforms. This gives an average speedup of 6.87% on Milan and 1.90% on Skylake.
984b191f0 Fix the remaining occurrence of non-const `std::string::data()`.
974fcc49e Fix compilation errors under C++11.
d644ca877 Fix warnings due to use of `__attribute__(always_inline)` without `inline`.
9758c9dfd Add `snappy::CompressFromIOVec`.
af720f9a3 Merge pull request silx-kit#148 from pitrou:ubsan-ptr-add-overflow
44caf7908 Move the comment about non-overlap requirement from the implementation to the contract of `MemCopy64()`, and clarify that it applies to `size`, not to 64.
d261d2766 Optimize zippy MemCpy / MemMove during decompression
6a2b78a37 Optimize Zippy compression for ARM by 5-10% by choosing csel instructions
8dd58a519 Fix compilation for older GCC and Clang versions.
6c6e890ef Change LittleEndian loads/stores to use memcpy
8b07ff196 Update contributing guidelines.
64df9f28c Fix UBSan error (ptr + offset overflow)
65dc7b383 Pass by reference the first argument of ExtractLowBytes to avoid UB of passing uninitialized argument by value.
fe18b4632 Switch CI to GitHub Actions.
a7ddc144d Merge pull request silx-kit#140 from JunHe77:adv
aeb5de55a decompress: refine data depdency
7062d7f1d Merge pull request silx-kit#133 from JunHe77:simd
cbb83a1d6 Migrate feature detection macro checks from #ifdef to #if.
a8400f1fa Add baseline CPU level to Travis CI.
b9c9a989b Merge pull request silx-kit#135 from JunHe77:remove_extra
5c87bc61b Merge pull request silx-kit#136 from JunHe77:ext_arm
734b32bfe Add config and header file for NEON support
ab9a57280 Fix SSE3 and BMI2 compile error
d643b9a98 decompress: add hint to remove extra AND
f52721b2b decompression: optimize ExtractOffset for Arm
f2db8f77c Move the extract masks variable out in zippy. I see a consistent 1.5-2% improvement for ARM. Probably because ARM has more relaxed address computation than x86 https://www.godbolt.org/z/bfM1ezx41. I don't think this is a compiler bug or it can do something about it
c8f764164 Remove inline assembly as the bug in clang was fixed
9cc3689b2 Optimize memset to pure SIMD because compilers generate consistently bad code. clang for ARM and gcc for x86 https://gcc.godbolt.org/z/oxeGG7aEx
b4888f761 Optimize tag extraction for ARM with conditional increment instruction generation (csinc). For codegen see https://gcc.godbolt.org/z/a8z9j95Pv
b3fb0b5b4 Enable vector byte shuffle optimizations on ARM NEON
b638ebe5d Update Travis CI config.
d8f5dd8ec Clarify, in a comment, that offset/256 fits in 3 bits. It has to in this context, because the other 5 bits in the byte are used for len-4 and the tag.

git-subtree-dir: src/snappy
git-subtree-split: dc05e026488865bc69313a68bcc03ef2e4ea8e83
Merge PR #130 first! Then it boils down to commit e9da8aa
This PR applies the compilation options from the build configuration when building the C libraries used by the compression filters (namely snappy, charls, and zfp).
Without this, the configured compilation options are used to build the compression filters themselves but not the libraries they rely on.
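
To illustrate the idea, here is a minimal sketch, not the code from this PR. It assumes the libraries are described in the `(name, build_info)` form consumed by setuptools' `build_clib` command, and that extra compiler flags are passed through a `cflags` entry in `build_info`. `BuildConfig` and `apply_build_config` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Macro = Tuple[str, Optional[str]]
Library = Tuple[str, Dict[str, list]]


@dataclass
class BuildConfig:
    """Hypothetical stand-in for the project's build configuration."""
    compile_args: List[str] = field(default_factory=list)
    macros: List[Macro] = field(default_factory=list)


def apply_build_config(libraries: List[Library], config: BuildConfig) -> List[Library]:
    """Return copies of the library definitions with the configured
    compile options appended, so the C libraries are built with the
    same options as the filters that link against them."""
    configured = []
    for name, build_info in libraries:
        info = dict(build_info)  # shallow copy: the input is left untouched
        # Library-specific flags come first, configured flags are appended.
        info["cflags"] = list(build_info.get("cflags", [])) + list(config.compile_args)
        info["macros"] = list(build_info.get("macros", [])) + list(config.macros)
        configured.append((name, info))
    return configured


if __name__ == "__main__":
    # Illustrative values only; real sources and flags differ.
    libs = [("snappy", {"sources": ["snappy.cc"], "cflags": ["-std=c++11"]})]
    cfg = BuildConfig(compile_args=["-msse2"], macros=[("NDEBUG", None)])
    print(apply_build_config(libs, cfg))
```

The returned list could then be passed as the `libraries` argument of `setup()`. Appending rather than replacing keeps any per-library flags in place.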