Encoding and decoding int16/int32 arrays carries a large overhead (~75% of total time) because the zigzag encoding is done in Python. There is no low-hanging optimization left on the Python side, since the implementation already uses NumPy ufuncs.
Total time: 0.052461 s
Function: encode at line 80

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    80                                           @wraps(c_func)
    81                                           def encode(data, prev=0):
    82
    83         1         40.0     40.0      0.1      if np.issubdtype(data.dtype, np.signedinteger):
    84         1      10796.0  10796.0     20.6          diffs = np.ediff1d(data, to_begin=data[0])
    85         1          9.0      9.0      0.0          shift = data.dtype.itemsize * 8 - 1
    86         1      26187.0  26187.0     49.9          data = to_zig_zag(diffs, np.int32(shift))
    87
    88         1         39.0     39.0      0.1      if np.issubdtype(data.dtype, np.uint16):
    89                                                   data = data.astype(np.uint32)
    90
    91         1         42.0     42.0      0.1      output = np.zeros(max_compressed_bytes(len(data)), dtype=np.uint8)
    92         1          1.0      1.0      0.0      encoded_size = c_func(
    93         1        230.0    230.0      0.4          data.ctypes.data_as(ctypes.POINTER(ctypes.c_uint32)),
    94         1          1.0      1.0      0.0          len(data),
    95         1         41.0     41.0      0.1          output.ctypes.data_as(ctypes.POINTER(ctypes.c_uint8)),
    96         1      15066.0  15066.0     28.7          prev
    97                                               )
    98         1          9.0      9.0      0.0      return output[:encoded_size]
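For readers unfamiliar with the two hot lines in `encode` (the delta step and `to_zig_zag`), here is a minimal NumPy sketch of what such a transform typically looks like. The body of `to_zig_zag` is not shown in this issue, so the function name `delta_zigzag_encode` and its details are assumptions for illustration only, not the project's actual code. It assumes a non-empty, one-dimensional signed integer array.

```python
import numpy as np


def delta_zigzag_encode(data: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: delta-encode, then zigzag-map to unsigned."""
    itemsize = data.dtype.itemsize
    # Delta step: keep the first value, then successive differences.
    # Differences wrap around in the input dtype, which a wrapping
    # cumulative sum in the same dtype undoes on decode.
    deltas = np.empty_like(data)
    deltas[0] = data[0]
    np.subtract(data[1:], data[:-1], out=deltas[1:])
    # Zigzag step: (v << 1) ^ (v >> (bits - 1)). The arithmetic right
    # shift yields 0 for non-negative values and -1 for negative ones.
    shift = itemsize * 8 - 1
    zz = (deltas << 1) ^ (deltas >> shift)
    # Reinterpret the bits as unsigned, then widen to uint32.
    return zz.view(np.dtype(f"u{itemsize}")).astype(np.uint32)


sample = np.array([3, 1, -2, -2, 5], dtype=np.int32)
print(delta_zigzag_encode(sample).tolist())  # [6, 3, 5, 0, 14]
```

Each step allocates at least one temporary array the size of the input, which is consistent with these lines dominating the profile even though every operation is a ufunc.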
Total time: 0.083777 s
Function: decode at line 111

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   111                                           @wraps(c_func)
   112                                           def decode(data, n, prev=0, dtype=None):
   113
   114         1         31.0     31.0      0.0      output = np.zeros(n, dtype=np.uint32)
   115         1          1.0      1.0      0.0      c_func(
   116         1        105.0    105.0      0.1          data.ctypes.data_as(ctypes.POINTER(ctypes.c_uint8)),
   117         1         31.0     31.0      0.0          output.ctypes.data_as(ctypes.POINTER(ctypes.c_uint32)),
   118         1          0.0      0.0      0.0          n,
   119         1      20428.0  20428.0     24.4          prev,
   120                                               )
   121
   122         1         51.0     51.0      0.1      if dtype and np.issubdtype(dtype, np.signedinteger):
   123         1      20575.0  20575.0     24.6          zigzag = from_zig_zag(output)
   124         1      42553.0  42553.0     50.8          output = np.cumsum(zigzag, dtype=dtype)
   125                                               elif dtype and output.dtype != dtype:
   126                                                   return output.astype(dtype)
   127         1          2.0      2.0      0.0      return output
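The decode side mirrors this: undo the zigzag mapping, then take a cumulative sum to undo the deltas. The sketch below is a rough illustration under the same caveat, since the body of `from_zig_zag` is not shown in this issue. The name `zigzag_delta_decode` is hypothetical, and it assumes the decoded values arrive as `uint32`, as in the profiled `decode`.

```python
import numpy as np


def zigzag_delta_decode(zz, dtype=np.int32) -> np.ndarray:
    """Hypothetical sketch: undo the zigzag mapping, then the deltas."""
    zz = np.asarray(zz, dtype=np.uint32)
    # Inverse zigzag: (z >> 1) ^ -(z & 1). The low bit carries the sign.
    half = (zz >> 1).view(np.int32)
    sign = -((zz & 1).view(np.int32))  # 0 for even codes, -1 for odd
    deltas = (half ^ sign).astype(dtype, copy=False)
    # The cumulative sum restores the original values; any wraparound
    # in the target dtype matches the wraparound of the delta step.
    return np.cumsum(deltas, dtype=dtype)


print(zigzag_delta_decode([6, 3, 5, 0, 14]).tolist())  # [3, 1, -2, -2, 5]
```

A native implementation could fuse the inverse zigzag and the prefix sum into the decode loop itself, avoiding the two extra passes over the array that account for roughly 75% of the time above.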
Maybe @lemire already has an efficient int16/int32 -> uint32 zigzag implementation, and/or is interested in supporting signed types natively in streamvbyte?