Build clib options #135

t20100 · 2021-06-23T08:14:16Z

Merge PR #130 first. Once that is merged, this PR boils down to commit e9da8aa.

This PR applies the compilation options from the build configuration when building the C libraries used by the compression filters (namely snappy, charls, and zfp).
Without it, the configured compilation options are used to build the compression filters but not the libraries they rely on.
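As a rough illustration of the idea (not the code from this PR), the sketch below appends configured compile arguments to each C library definition. It assumes the libraries are described as `(name, build_info)` tuples and that the build command reads extra compiler arguments from a `cflags` entry. The function name `add_compile_args`, the source paths, and the flags are hypothetical.

```python
import copy


def add_compile_args(libraries, extra_compile_args):
    """Return a copy of `libraries` with extra args appended to each 'cflags'.

    `libraries` is a list of (name, build_info) tuples. The 'cflags' key is
    assumed to be the one the build command passes on to the compiler.
    """
    updated = []
    for name, build_info in libraries:
        info = copy.deepcopy(build_info)  # leave the caller's dicts untouched
        info["cflags"] = list(info.get("cflags", [])) + list(extra_compile_args)
        updated.append((name, info))
    return updated


# Hypothetical library definitions and flags, for illustration only
libraries = [
    ("snappy", {"sources": ["src/snappy/snappy.cc"], "cflags": ["-std=c++11"]}),
    ("zfp", {"sources": ["src/zfp/src/zfp.c"]}),
]
configured = add_compile_args(libraries, ["-msse2", "-mavx2"])
```

Returning a modified copy keeps the original definitions reusable, for example when the same list is built again with a different configuration.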


dc05e0264 Tag open source release 1.1.10.
7b82423c5 The output buffer in DecompressBranchless is never read from and the source buffers are never written. This allows us to defer any writes to the output buffer for an arbitrary amount of time as long as the writes all occur in the proper order. When a MemCopy64 would have normally occurred we save away the source address and length. Once we reach the location of the next write to the output buffer first perform the deferred copy. This gives time for the source address calculation and length to finish before the deferred copy.
30326e5b8 Merge pull request silx-kit#150 from davemgreen:betterunalignedloads
74960e8bd Allow some buffer overwrite on literal emitting
37f375dde Add prefetch to zippy decompess,
15e2a0e13 Add "cc" clobbers to inline asm that modifies flags.
8881ba172 Improve the speed of hashing in zippy compression.
a2d219a8a Modify MemCopy64 to use AVX 32 byte copies instead of SSE2 16 byte copies on capable x86 platforms. This gives an average speedup of 6.87% on Milan and 1.90% on Skylake.
984b191f0 Fix the remaining occurrence of non-const `std::string::data()`.
974fcc49e Fix compilation errors under C++11.
d644ca877 Fix warnings due to use of `__attribute__(always_inline)` without `inline`.
9758c9dfd Add `snappy::CompressFromIOVec`.
af720f9a3 Merge pull request silx-kit#148 from pitrou:ubsan-ptr-add-overflow
44caf7908 Move the comment about non-overlap requirement from the implementation to the contract of `MemCopy64()`, and clarify that it applies to `size`, not to 64.
d261d2766 Optimize zippy MemCpy / MemMove during decompression
6a2b78a37 Optimize Zippy compression for ARM by 5-10% by choosing csel instructions
8dd58a519 Fix compilation for older GCC and Clang versions.
6c6e890ef Change LittleEndian loads/stores to use memcpy
8b07ff196 Update contributing guidelines.
64df9f28c Fix UBSan error (ptr + offset overflow)
65dc7b383 Pass by reference the first argument of ExtractLowBytes to avoid UB of passing uninitialized argument by value.
fe18b4632 Switch CI to GitHub Actions.
a7ddc144d Merge pull request silx-kit#140 from JunHe77:adv
aeb5de55a decompress: refine data depdency
7062d7f1d Merge pull request silx-kit#133 from JunHe77:simd
cbb83a1d6 Migrate feature detection macro checks from #ifdef to #if.
a8400f1fa Add baseline CPU level to Travis CI.
b9c9a989b Merge pull request silx-kit#135 from JunHe77:remove_extra
5c87bc61b Merge pull request silx-kit#136 from JunHe77:ext_arm
734b32bfe Add config and header file for NEON support
ab9a57280 Fix SSE3 and BMI2 compile error
d643b9a98 decompress: add hint to remove extra AND
f52721b2b decompression: optimize ExtractOffset for Arm
f2db8f77c Move the extract masks variable out in zippy. I see a consistent 1.5-2% improvement for ARM. Probably because ARM has more relaxed address computation than x86 https://www.godbolt.org/z/bfM1ezx41. I don't think this is a compiler bug or it can do something about it
c8f764164 Remove inline assembly as the bug in clang was fixed
9cc3689b2 Optimize memset to pure SIMD because compilers generate consistently bad code. clang for ARM and gcc for x86 https://gcc.godbolt.org/z/oxeGG7aEx
b4888f761 Optimize tag extraction for ARM with conditional increment instruction generation (csinc). For codegen see https://gcc.godbolt.org/z/a8z9j95Pv
b3fb0b5b4 Enable vector byte shuffle optimizations on ARM NEON
b638ebe5d Update Travis CI config.
d8f5dd8ec Clarify, in a comment, that offset/256 fits in 3 bits. It has to in this context, because the other 5 bits in the byte are used for len-4 and the tag.

git-subtree-dir: src/snappy
git-subtree-split: dc05e026488865bc69313a68bcc03ef2e4ea8e83

t20100 added 18 commits June 16, 2021 16:34

Rework build options: use probed value as default, override with envvar or option

31eddd4
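The "probed default, overridden by envvar or option" scheme from this commit title could look roughly like the sketch below. This is an assumption-laden illustration, not the project's code: the helper name `resolve_flag`, the variable name `EXAMPLE_AVX2`, the accepted truthy strings, and the precedence of an explicit option over the environment variable are all hypothetical.

```python
import os


def resolve_flag(name, probed, option=None, environ=None):
    """Pick a boolean build flag: explicit option, then env var, then probed value.

    The precedence order is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    if option is not None:
        return bool(option)
    raw = environ.get(name)
    if raw is not None:
        # Treat a few common spellings as "enabled"; anything else disables
        return raw.strip().lower() in ("1", "true", "yes", "on")
    return bool(probed)


# Probed value is used when nothing overrides it
default_avx2 = resolve_flag("EXAMPLE_AVX2", probed=True, environ={})
# The environment variable overrides the probed value
env_avx2 = resolve_flag("EXAMPLE_AVX2", probed=True, environ={"EXAMPLE_AVX2": "False"})
# An explicit option overrides both
opt_avx2 = resolve_flag("EXAMPLE_AVX2", probed=False, option=True,
                        environ={"EXAMPLE_AVX2": "False"})
```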

Update cpuinfo to v8.0.0 from https://github.com/workhorsy/py-cpuinfo/

76e8693

fixed typo

3757d1d

Check if avx2=True and sse2=False, this leads to runtime issues with blosc.

c7e60f8
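A minimal sketch of such a consistency check follows. The commit title only says the combination is checked, so raising a `ValueError` and the function name `check_simd_flags` are assumptions made for illustration.

```python
def check_simd_flags(sse2, avx2):
    """Reject the avx2-without-sse2 combination before building.

    Raising an exception is an assumption; a real build could instead warn
    or silently adjust the flags.
    """
    if avx2 and not sse2:
        raise ValueError("avx2=True requires sse2=True")
    return sse2, avx2


# Consistent combinations pass through unchanged
flags = check_simd_flags(sse2=True, avx2=True)

# The inconsistent combination is rejected
try:
    check_simd_flags(sse2=False, avx2=True)
    rejected = False
except ValueError:
    rejected = True
```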

Fixed platform.machine() upper case issue: e.g., AMD64 on Windows

c1f1315

use same check for x86 arch as cpuinfo

7e611be
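The two commits above concern how the value of `platform.machine()` is matched. The sketch below lower-cases the value before matching, so that `AMD64` (as reported on Windows) is treated like `amd64`. The regular expression is a simplified, hypothetical pattern and not the exact one used by py-cpuinfo. The name `is_x86` is likewise illustrative.

```python
import platform
import re

# Simplified pattern for common x86 names (an assumption, not cpuinfo's exact regex)
_X86_PATTERN = re.compile(r"^(i\d86|x86|x86_32|x86_64|x64|amd64)$")


def is_x86(machine=None):
    """Return True if the machine string names an x86 architecture."""
    machine = platform.machine() if machine is None else machine
    # Normalize case first: Windows reports "AMD64", Linux reports "x86_64"
    return _X86_PATTERN.match(machine.lower()) is not None


windows_like = is_x86("AMD64")
linux_like = is_x86("x86_64")
arm_like = is_x86("aarch64")
```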

store sse2 and avx2 compile flags in DefaultBuildConfig

28b8a31

stop using typing.Tuple, deprecated in python3.9...

002a9db

rename function and add link to -march/-mcpu doc

01df744

Fixed namedtuple and native_compile_arg typos

32a003c

Request a clean all when changing build config

e9924c4

Major rework of host and build config handling

176d1f7

store filter file ext in config and use it to load filters

c3c3d41

Use more generic arch to support more platform.machine() values

dbbbdf9

Remove typing for python3.4 support

fa3d6cf

move cpuinfo import at the top

a3458a4

move arch into class

807f203

Use configured compile args also to build clib

e9da8aa

t20100 added this to the Next release milestone Jun 23, 2021

only filter out extra args

e9cc723

kif merged commit bc3634d into silx-kit:main Jul 1, 2021

t20100 deleted the build_clib-options branch October 13, 2021 11:40
