improved performance of ++ for Bit/ByteVector and made other related performance improvements #19

pchiusano · 2014-08-21T17:22:43Z

There's now a new constructor, Chunks, in both BitVector and ByteVector. It uses the same idea as an (amortized) O(1) functional counter: when a new chunk is added to the end, if it is more than half the size of the chunk immediately to its left, the two chunks are combined. This maintains a sequence of balanced trees of exponentially decreasing sizes, so appends take amortized O(1) time rather than O(log n). I also changed ByteVector to use a small (64-byte) buffer by default in ++, which gave some drastic performance improvements for the common case (without penalizing the other cases).
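To make the merging rule concrete, here is a minimal sketch of the idea, not the code in this PR. The names `ChunkSketch`, `Tree`, `Leaf`, `Node`, `ChunkSeq`, `append` and `sizes` are all hypothetical, and the model ignores details the real Chunks constructor has to deal with (bit alignment, small-chunk copying, and so on). It only shows the "combine when more than half the size of the left neighbour" rule and how it cascades like a binary counter.

```scala
object ChunkSketch {
  // Hypothetical stand-in for a vector's internal tree; not scodec-bits types.
  sealed trait Tree { def size: Long }
  final case class Leaf(bytes: Vector[Byte]) extends Tree {
    val size: Long = bytes.size.toLong
  }
  final case class Node(left: Tree, right: Tree) extends Tree {
    val size: Long = left.size + right.size
  }

  // Chunks are stored rightmost-first, so the head is the most recently
  // appended chunk and its "left neighbour" is the next list element.
  final case class ChunkSeq(chunks: List[Tree] = Nil) {
    def append(t: Tree): ChunkSeq = {
      @annotation.tailrec
      def go(rest: List[Tree], last: Tree): List[Tree] = rest match {
        // Combine while the new chunk is more than half the size of the
        // chunk immediately to its left, then re-check against the next one.
        case left :: more if last.size * 2 > left.size =>
          go(more, Node(left, last))
        case _ =>
          last :: rest
      }
      ChunkSeq(go(chunks, t))
    }

    // Chunk sizes in left-to-right order, for inspection.
    def sizes: List[Long] = chunks.reverse.map(_.size)
  }
}
```

In this model, appending seven one-byte leaves to an empty `ChunkSeq` leaves chunks of sizes 4, 2, 1, and the eighth append cascades them into a single chunk of size 8, which is the binary-counter behaviour the amortized O(1) argument relies on.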

Update: Also, I forgot to mention that Buffer no longer has the weird discontinuity in its performance if you look past the first chunk. Basically, Buffer only handles providing the mutable tail, and Chunks takes care of appending in O(1). So a buffered ByteVector still has logarithmic take/drop/get. This is why I felt pretty good about adding a small default buffer size.

Here are some numbers from current master:

building vectors of 1000000 bytes
byte vector :+ 2.0971s per trial; 10 trials
byte vector ++ 1.0981999999999998s per trial; 10 trials
byte vector balanced ++ 0.15109999999999998s per trial; 10 trials
buffered byte vector :+ 0.0305s per trial; 40 trials
buffered byte vector ++ 0.0726s per trial; 20 trials
bit vector ++ 2.0225s per trial; 10 trials
bit vector balanced ++ 0.3116s per trial; 10 trials
bit vector balanced Append 0.1393s per trial; 10 trials

And here are some numbers from this branch:

byte vector :+ 0.028050000000000002s per trial; 40 trials
byte vector ++ 0.0526s per trial; 20 trials
byte vector balanced ++ 0.1886s per trial; 10 trials
buffered byte vector :+ 0.022037499999999998s per trial; 80 trials
buffered byte vector ++ 0.038875s per trial; 40 trials
bit vector ++ 0.3809s per trial; 10 trials
bit vector balanced ++ 0.5376000000000001s per trial; 10 trials
bit vector balanced Append 0.2778s per trial; 10 trials
list :: 0.12689999999999999s per trial; 10 trials
vector :+ 0.057199999999999994s per trial; 20 trials

It looks like ++ for BitVector is close to the speed of :: on List. That's to build up a million-byte BitVector, one byte at a time, using a left fold. For ByteVector, it's much faster, as it gets to directly use the mutable (only 64-byte) array at the tail.

Here's the code that produced that:

  val N = 1000 * 1000
  val M = 10

  val bytes = (0 until N).map(_.toByte)
  val byteChunks = bytes.map(ByteVector(_))
  val bitChunks = bytes.map(BitVector(_))

  time("byte vector :+", M) { bytes.foldLeft(ByteVector.empty)(_ :+ _) }
  time("byte vector ++", M) { byteChunks.foldLeft(ByteVector.empty)(_ ++ _) }
  time("byte vector balanced ++", M) { BitVector.reduceBalanced(byteChunks)(_.size)(_ ++ _) }
  time("buffered byte vector :+", M) { bytes.foldLeft(ByteVector.empty.buffer)(_ :+ _) }
  time("buffered byte vector ++", M) { byteChunks.foldLeft(ByteVector.empty.buffer)(_ ++ _) }
  time("bit vector ++", M) { bitChunks.foldLeft(BitVector.empty)(_ ++ _) }
  time("bit vector balanced ++", M) { BitVector.reduceBalanced(bitChunks)(_.size)(_ ++ _) }
  time("bit vector balanced Append", M) { BitVector.reduceBalanced(bitChunks)(_.size)(BitVector.Append(_,_)) }
  time("list ::", M) { (0 until N).foldLeft(List.empty[Int])(_.::(_)) }
  time("vector :+", M) { (0 until N).foldLeft(Vector.empty[Int])(_ :+ _) }
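The `time` helper isn't shown in the snippet. A minimal stand-in, assuming it simply runs the block a fixed number of times and prints the mean wall-clock seconds per trial, might look like the sketch below. The name `TimingSketch`, the signature and the return value are assumptions. The helper actually used here evidently does more: the output reports 20, 40 or 80 trials for some cases even though M is 10, so it presumably adjusts the trial count, which this sketch does not attempt.

```scala
object TimingSketch {
  // Hypothetical stand-in for the `time` helper used in the benchmark code.
  // Runs `action` exactly `trials` times and returns mean seconds per trial.
  def time[A](label: String, trials: Int)(action: => A): Double = {
    require(trials > 0, "trials must be positive")
    var last: Any = null // keep the result so the by-name block is evaluated
    var i = 0
    val start = System.nanoTime()
    while (i < trials) {
      last = action
      i += 1
    }
    val perTrial = (System.nanoTime() - start) / 1e9 / trials
    println(s"$label ${perTrial}s per trial; $trials trials")
    perTrial
  }
}
```

There is no JVM warm-up or outlier handling here, so numbers from this sketch are only good for rough comparisons.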

pchiusano added 3 commits August 21, 2014 12:41

improved performance of ++ for Bit/ByteVector and made other related …

69797b1

…performance improvements

made depth function private[bits], and take/drop are final on BitVector

c49c8da

Addressed various issues brought up by @mpilquist in PR

6e0d02a

mpilquist added a commit that referenced this pull request Aug 22, 2014

Merge pull request #19 from pchiusano/topic/chunked-and-buffered

b5b60e2

mpilquist merged commit b5b60e2 into scodec:master Aug 22, 2014
