Build clib options #135

Merged
merged 19 commits into from
Jul 1, 2021

Conversation

t20100
Member

@t20100 t20100 commented Jun 23, 2021

Merge PR #130 first! Then it boils down to commit e9da8aa

This PR applies the compilation options from the build configuration when building the C libraries used by the compression filters (namely snappy, charls, and zfp).
Without this, the configured compilation options are applied to the compression filters themselves but not to the libraries they rely on.
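
As a rough illustration of the idea (not the code from this PR), the sketch below merges configured options into each library entry before it is handed to a `build_clib` step. It assumes libraries are described as `(name, build_info)` tuples and that the build command honors a `cflags` key in `build_info`, as I understand setuptools' `build_clib` to do. The helper name `apply_build_options`, the source paths, and the flag values are all hypothetical.

```python
import copy


def apply_build_options(libraries, extra_cflags=(), extra_macros=()):
    """Return a copy of `libraries` with the configured options merged in.

    `libraries` is a list of (name, build_info) tuples. The input list and
    its dictionaries are left unmodified.
    """
    updated = []
    for name, build_info in libraries:
        info = copy.deepcopy(build_info)
        # Append the configured flags after any library-specific ones.
        info["cflags"] = list(info.get("cflags", [])) + list(extra_cflags)
        info["macros"] = list(info.get("macros", [])) + list(extra_macros)
        updated.append((name, info))
    return updated


# Hypothetical library definitions; paths and flags are placeholders.
libraries = [
    ("snappy", {"sources": ["src/snappy/snappy.cc"], "cflags": ["-std=c++11"]}),
    ("zfp", {"sources": ["src/zfp/src/zfp.c"]}),
]

configured = apply_build_options(
    libraries,
    extra_cflags=["-O3", "-msse2"],
    extra_macros=[("NDEBUG", None)],
)
```

Returning a modified copy rather than mutating the input keeps the original library definitions reusable, for example when the same definitions are built with different configurations.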

@t20100 t20100 added this to the Next release milestone Jun 23, 2021
@kif kif merged commit bc3634d into silx-kit:main Jul 1, 2021
@t20100 t20100 deleted the build_clib-options branch October 13, 2021 11:40
t20100 added a commit to t20100/hdf5plugin that referenced this pull request May 31, 2023
dc05e0264 Tag open source release 1.1.10.
7b82423c5 The output buffer in DecompressBranchless is never read from and the source buffers are never written.  This allows us to defer any writes to the output buffer for an arbitrary amount of time as long as the writes all occur in the proper order.  When a MemCopy64 would have normally occurred we save away the source address and length.  Once we reach the location of the next write to the output buffer first perform the deferred copy.  This gives time for the source address calculation and length to finish before the deferred copy.
30326e5b8 Merge pull request silx-kit#150 from davemgreen:betterunalignedloads
74960e8bd Allow some buffer overwrite on literal emitting
37f375dde Add prefetch to zippy decompress
15e2a0e13 Add "cc" clobbers to inline asm that modifies flags.
8881ba172 Improve the speed of hashing in zippy compression.
a2d219a8a Modify MemCopy64 to use AVX 32 byte copies instead of SSE2 16 byte copies on capable x86 platforms.  This gives an average speedup of 6.87% on Milan and 1.90% on Skylake.
984b191f0 Fix the remaining occurrence of non-const `std::string::data()`.
974fcc49e Fix compilation errors under C++11.
d644ca877 Fix warnings due to use of `__attribute__(always_inline)` without `inline`.
9758c9dfd Add `snappy::CompressFromIOVec`.
af720f9a3 Merge pull request silx-kit#148 from pitrou:ubsan-ptr-add-overflow
44caf7908 Move the comment about non-overlap requirement from the implementation to the contract of `MemCopy64()`, and clarify that it applies to `size`, not to 64.
d261d2766 Optimize zippy MemCpy / MemMove during decompression
6a2b78a37 Optimize Zippy compression for ARM by 5-10% by choosing csel instructions
8dd58a519 Fix compilation for older GCC and Clang versions.
6c6e890ef Change LittleEndian loads/stores to use memcpy
8b07ff196 Update contributing guidelines.
64df9f28c Fix UBSan error (ptr + offset overflow)
65dc7b383 Pass by reference the first argument of ExtractLowBytes to avoid UB of passing uninitialized argument by value.
fe18b4632 Switch CI to GitHub Actions.
a7ddc144d Merge pull request silx-kit#140 from JunHe77:adv
aeb5de55a decompress: refine data dependency
7062d7f1d Merge pull request silx-kit#133 from JunHe77:simd
cbb83a1d6 Migrate feature detection macro checks from #ifdef to #if.
a8400f1fa Add baseline CPU level to Travis CI.
b9c9a989b Merge pull request silx-kit#135 from JunHe77:remove_extra
5c87bc61b Merge pull request silx-kit#136 from JunHe77:ext_arm
734b32bfe Add config and header file for NEON support
ab9a57280 Fix SSE3 and BMI2 compile error
d643b9a98 decompress: add hint to remove extra AND
f52721b2b decompression: optimize ExtractOffset for Arm
f2db8f77c Move the extract masks variable out in zippy. I see a consistent 1.5-2% improvement for ARM. Probably because ARM has more relaxed address computation than x86 https://www.godbolt.org/z/bfM1ezx41. I don't think this is a compiler bug or it can do something about it
c8f764164 Remove inline assembly as the bug in clang was fixed
9cc3689b2 Optimize memset to pure SIMD because compilers generate consistently bad code. clang for ARM and gcc for x86 https://gcc.godbolt.org/z/oxeGG7aEx
b4888f761 Optimize tag extraction for ARM with conditional increment instruction generation (csinc). For codegen see https://gcc.godbolt.org/z/a8z9j95Pv
b3fb0b5b4 Enable vector byte shuffle optimizations on ARM NEON
b638ebe5d Update Travis CI config.
d8f5dd8ec Clarify, in a comment, that offset/256 fits in 3 bits.  It has to in this context, because the other 5 bits in the byte are used for len-4 and the tag.

git-subtree-dir: src/snappy
git-subtree-split: dc05e026488865bc69313a68bcc03ef2e4ea8e83