You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Too bad we don't have a writer that uses tiny (like 8 bytes) block at first, but doubles size for each new block (16 bytes, 32 bytes next, etc.). Then we would naturally use log(size) number of blocks without over-allocating.
But then reading bytes is a bit tricky because we'd need to take discrete log (base 2) of the address. Maybe it wouldn't be so bad -- we could do this with Long.numberOfLeadingZeros maybe? But that's a bigger change ... we can do this separately/later.
Note that oal.store.ByteBuffersDataOutput takes a different and neat approach to gracefully growing: it picks an initial block size (1024 bytes by default), and appends new blocks as you write bytes, but then if it reaches 100 blocks, it "resizes" itself by doubling the block size and copying over, so that now you have 50 blocks.
So it's still O(N) amortized cost of that doubling/copying with time, and at any given moment you will not be wasting too many %tg of the bytes you've written, except at the start 1 KB block size.
In #12624, I moved the main FST body out of BytesStore into ByteBuffersDataOutput, and BytesStore becomes only a single byte[] for the currently written node so maybe we don't need to do this?
Description
From #12604 (comment)
The text was updated successfully, but these errors were encountered: