More efficient serialization for bitmap segments #3492
Bitmap chunks are 1024 bits long.
In theory yes, but such segments are currently not allowed. Per the RFC, the minimum height of a bitmap segment is 9.
We can always revisit this later - I would not optimize like this yet.
This looks good.
Minor comment about the positive/negative/raw_bytes magic numbers.
This PR introduces a new type, `BitmapSegment`, that can be converted to and from a `Segment<BitmapChunk>`. It provides more efficient (de)serialization than `Segment` by choosing a strategy based on the occupancy of the bitmap in the range. To do this, the segment is split into blocks of 2^16 bits (`BitmapBlock`s). If a block has fewer than 4096 positive indices, we serialize the list of positive indices as unsigned 16-bit integers. If it instead has fewer than 4096 negative indices, we serialize the list of negative indices as unsigned 16-bit integers. Otherwise, we serialize the full bitmap bytes directly.

TODO:
- We can speed up the conversion between the two types by manipulating the storage of the internal bitvecs directly, using an `unsafe` block. Is this something we want to do? (Can be done in a future PR.)
- Check whether the current tests are sufficient
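For readers skimming the thread, here is a rough sketch of the per-block strategy selection described above. It is not the code from this PR: `EncodedBlock`, `encode_block`, `decode_block`, and `indices_where` are hypothetical names, the block is modelled as a plain byte slice rather than the internal bitvec, and the LSB-first bit order is an assumption. The 4096 threshold is the break-even point: 4096 indices of 2 bytes each take 8192 bytes, the same as the raw 2^16-bit block.

```rust
// Illustrative only: names and bit order are assumptions, not the PR's API.
const BLOCK_BITS: usize = 1 << 16; // bits per block
const BLOCK_BYTES: usize = BLOCK_BITS / 8; // 8192 raw bytes per block
const SPARSE_LIMIT: usize = 4096; // 4096 * 2 bytes == BLOCK_BYTES

#[derive(Debug, PartialEq)]
enum EncodedBlock {
    /// Indices of set bits (block is mostly zeros).
    Positive(Vec<u16>),
    /// Indices of unset bits (block is mostly ones).
    Negative(Vec<u16>),
    /// Full bitmap bytes (neither list would be smaller).
    Raw(Vec<u8>),
}

/// Collect the indices of all bits equal to `value`, assuming LSB-first order.
fn indices_where(bytes: &[u8], value: bool) -> Vec<u16> {
    (0..BLOCK_BITS)
        .filter(|&i| (((bytes[i / 8] >> (i % 8)) & 1) == 1) == value)
        .map(|i| i as u16)
        .collect()
}

/// Pick the encoding based on how many bits are set in the block.
fn encode_block(bytes: &[u8]) -> EncodedBlock {
    assert_eq!(bytes.len(), BLOCK_BYTES);
    let ones: usize = bytes.iter().map(|b| b.count_ones() as usize).sum();
    if ones < SPARSE_LIMIT {
        EncodedBlock::Positive(indices_where(bytes, true))
    } else if BLOCK_BITS - ones < SPARSE_LIMIT {
        EncodedBlock::Negative(indices_where(bytes, false))
    } else {
        EncodedBlock::Raw(bytes.to_vec())
    }
}

/// Rebuild the raw block bytes from any of the three encodings.
fn decode_block(encoded: &EncodedBlock) -> Vec<u8> {
    match encoded {
        EncodedBlock::Positive(indices) => {
            let mut bytes = vec![0u8; BLOCK_BYTES];
            for &i in indices {
                bytes[i as usize / 8] |= 1u8 << (i % 8);
            }
            bytes
        }
        EncodedBlock::Negative(indices) => {
            let mut bytes = vec![0xFFu8; BLOCK_BYTES];
            for &i in indices {
                bytes[i as usize / 8] &= !(1u8 << (i % 8));
            }
            bytes
        }
        EncodedBlock::Raw(bytes) => bytes.clone(),
    }
}
```

In this sketch a block with only bits 0 and 2 set encodes to a two-element positive list, while a half-full block falls through to the raw bytes. Using an enum for the three cases is also one way to avoid bare magic numbers for the positive/negative/raw tag.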