Commit
Summary:
Optimizes 3-bit packing as outlined here: T199311618

Before change:
----------------------------------------------------------------------------------
benchmark_pack_uint_values<3>/128/8        47.0 ns   46.4 ns    15106555
benchmark_pack_uint_values<3>/128/64       6.94 ns   6.90 ns   101226284
benchmark_pack_uint_values<3>/128/128      3.27 ns   3.24 ns   215022716
benchmark_unpack_uint_values<3>/128/8      22.0 ns   21.9 ns    32585572
benchmark_unpack_uint_values<3>/128/64     6.02 ns   5.98 ns   116910230
benchmark_unpack_uint_values<3>/128/128    2.74 ns   2.73 ns   257088291

After change:
----------------------------------------------------------------------------------
benchmark_pack_uint_values<3>/128/8        19.5 ns   19.5 ns    36050883
benchmark_pack_uint_values<3>/128/64       3.90 ns   3.87 ns   181151919
benchmark_pack_uint_values<3>/128/128      1.57 ns   1.57 ns   447247194
benchmark_unpack_uint_values<3>/128/8      20.5 ns   20.4 ns    34490914
benchmark_unpack_uint_values<3>/128/64     3.19 ns   3.11 ns   228019714
benchmark_unpack_uint_values<3>/128/128    1.71 ns   1.70 ns   408587338

Unpacking perf for 128 values is 1.60x faster (2.74/1.71).

Reviewed By: digantdesai

Differential Revision: D64010666
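
For readers unfamiliar with the term, 3-bit packing stores values in the range 0..7 without padding, so 8 values fit in 3 bytes. The sketch below is a scalar illustration only. It is not the code from this commit, and the function names `pack_3bit` and `unpack_3bit` and the little-endian bit layout are assumptions chosen for the example; the optimized kernels may arrange bits differently.

```python
def pack_3bit(values):
    """Pack unsigned 3-bit values (0..7) into bytes, 8 values per 3 bytes.

    Hypothetical layout: value j of each group occupies bits 3*j .. 3*j+2
    of a 24-bit little-endian word.
    """
    if len(values) % 8 != 0:
        raise ValueError("expected a multiple of 8 values")
    out = bytearray()
    for i in range(0, len(values), 8):
        word = 0
        for j, v in enumerate(values[i:i + 8]):
            if not 0 <= v <= 7:
                raise ValueError("value does not fit in 3 bits")
            word |= v << (3 * j)
        out += word.to_bytes(3, "little")
    return bytes(out)


def unpack_3bit(packed):
    """Inverse of pack_3bit: expand every 3 bytes back into 8 values."""
    if len(packed) % 3 != 0:
        raise ValueError("expected a multiple of 3 bytes")
    values = []
    for i in range(0, len(packed), 3):
        word = int.from_bytes(packed[i:i + 3], "little")
        # Mask out each 3-bit field in turn.
        values.extend((word >> (3 * j)) & 0b111 for j in range(8))
    return values
```

With this layout, 128 values occupy 48 bytes, and `unpack_3bit(pack_3bit(v))` returns `v` for any valid input.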