Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More efficient serialization for bitmap segments #3492

Merged
merged 8 commits into from
Nov 24, 2020

Conversation

jaspervdm
Copy link
Contributor

@jaspervdm jaspervdm commented Nov 20, 2020

This PR introduces a new type BitmapSegment that can be converted back and forth into a Segment<BitmapChunk>. It provides a more efficient (de)serialization than the Segment by choosing different strategies depending on the occupancy of the bitmap in the range. This is achieved by splitting the segment up in blocks of 2^16 bits (BitmapBlocks). If a block has less than 4096 positive indices, we serialize a list of positive indices as unsigned 16bit integers. If a block instead has less than 4096 negative indices, we serialize a list of negative indices as unsigned 16bit integers. In other cases, we serialize the full bitmap bytes directly.

TODO:

  • We can speed up the conversion between the two types by manipulating the storage of the internal bitvecs directly, using an unsafe block. Is this something we want to do? Can be done in a future PR
  • Check whether the current tests are sufficient

@jaspervdm jaspervdm requested a review from antiochp November 20, 2020 14:52
@antiochp
Copy link
Member

👀

@antiochp
Copy link
Member

Bitmap chunks are 1024 bits long.
Segments are variable length (based on height).
We flip serialization strategies at 4096 bits.
So does this mean that for some segment sizes/heights we only ever use the positive/negative indices and never fallback to the bitmap representation?

@jaspervdm
Copy link
Contributor Author

In theory yes, but such segments are currently not allowed. Per the RFC, the minimum height of a bitmap segment is 9.

@jaspervdm jaspervdm added this to the 5.0.0 milestone Nov 23, 2020
@antiochp
Copy link
Member

  • We can speed up the conversion between the two types by manipulating the storage of the internal bitvecs directly, using an unsafe block. Is this something we want to do?

We can always revisit this later - I would not optimize like this yet.

Copy link
Member

@antiochp antiochp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good.
Minor comment about the positive/negative/raw_bytes magic numbers.

chain/src/txhashset/bitmap_accumulator.rs Outdated Show resolved Hide resolved
chain/src/txhashset/bitmap_accumulator.rs Outdated Show resolved Hide resolved
@jaspervdm jaspervdm merged commit 055b684 into mimblewimble:master Nov 24, 2020
@jaspervdm jaspervdm deleted the bitmap_segment branch November 26, 2020 17:58
@antiochp antiochp mentioned this pull request Nov 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants