The current decoder pipe emits each element back into the stream as a singleton chunk. I'd like to propose adding a second method that decodes while preserving the underlying chunks of the stream. I have this work done in a fork (chunkDecoder implementation), and I'd be happy to open a PR to contribute. First, though, I wanted to check whether others would find this useful.
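For illustration, here's a minimal sketch of the shape such a pipe could take (this is just an approximation of the idea, not necessarily what the fork's implementation looks like):

```scala
import cats.syntax.traverse._
import fs2.{Pipe, RaiseThrowable, Stream}
import io.circe.{Decoder, Json}

// Decode chunk by chunk, re-emitting each decoded chunk whole instead of
// emitting every element as its own singleton chunk.
def chunkDecoder[F[_]: RaiseThrowable, A: Decoder]: Pipe[F, Json, A] =
  _.chunks.flatMap { jsons =>
    jsons.traverse(_.as[A]) match {
      case Right(as) => Stream.chunk(as)          // preserve the upstream chunk boundary
      case Left(err) => Stream.raiseError[F](err) // surface the first decoding failure
    }
  }
```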
Motivation
I'm proposing this because of a weird performance degradation I hit in a work project after upgrading fs2 v2.5.9 -> v3.0.6. Our use case looks like this:
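(Roughly; Foo, transform, and the choice of parser below are placeholders for the real code, which I can't share.)

```scala
import cats.effect.IO
import fs2.{Chunk, Stream}
import io.circe.Decoder
import io.circe.fs2.{byteStreamParser, decoder}

case class Foo(id: Long, value: String)                 // placeholder for our real case class

implicit val fooDecoder: Decoder[Foo] =
  Decoder.forProduct2("id", "value")(Foo.apply)         // stand-in circe decoder

def transform(foo: Foo): Foo = foo                      // placeholder for the real transformation

def bulkBatches(bytes: Stream[IO, Byte]): Stream[IO, Chunk[Foo]] =
  bytes
    .through(byteStreamParser[IO]) // bytes -> Json values
    .through(decoder[IO, Foo])     // Json -> Foo; today this emits one singleton chunk per element
    .map(transform)
    .chunkN(25000)                 // rechunk into batches of 25000 Foos for the bulk request
```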
So we're decoding some JSON into a case class, applying a transform to the data, and then rechunking the stream so we can produce bulk requests of 25000 Foos at a time. From fs2 v2.5.9 to v3.0.6 the chunk logic was simplified. In particular, the Queue behaviour when rechunking was changed such that there is now a performance tradeoff when rechunking very many small chunks.
Key tradeoff is that if you have lots of tiny chunks backing a chunk, index based access is O(n), but this is a rare pattern in practice, and folks can always call .compact to flatten to a single array backed chunk.
-- typelevel/fs2#2181 (comment)
The .compact method does solve the problem, but the "rare pattern" described there is made much less rare by the current decoder, since it turns every element into its own singleton chunk. Using a chunkDecoder would keep our original stream chunking and let us avoid the call to .compact and the copying that comes with it.
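Concretely, the workaround with the current decoder looks something like the following sketch, where decoded stands for the Stream[IO, Foo] coming out of the decoder pipe above:

```scala
// Because decoder emits singleton chunks, chunkN assembles each 25000-element batch
// from 25000 tiny chunks, so index-based access into the batch is O(n).
decoded
  .chunkN(25000)
  .map(_.compact) // workaround: copy into a single array-backed chunk to restore O(1) access
```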
Lazy Evaluation
The evaluation differences would need to be made clear in the docstrings. Decoding is lazy at the chunk level, but every element within a chunk is decoded eagerly. For example:
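(A sketch of the kind of pipeline meant here; jsonStream and the chunk size of 199 are assumptions made so the arithmetic below works out.)

```scala
jsonStream                        // Stream[IO, Json], chunked in groups of 199 (assumed size)
  .through(chunkDecoder[IO, Foo]) // proposed pipe: each incoming chunk is decoded as a whole
  .take(400)                      // 400 elements span 3 chunks, so 3 * 199 = 597 decodes happen
```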
This would decode 597 elements in total (3 chunks) in order to take the 400 objects, whereas the standard decoder would decode only the 400 objects being taken. I don't think this is a problem, but it's important to call out. I'd add a docstring to the existing decoder method to make its behaviour obvious compared to the proposed chunkDecoder.