As part of `unpack_frames`, we slice out each frame we'd like to extract (see code snippet below).

distributed/distributed/protocol/utils.py
Line 135 in 8a0e4b6

However, this causes a copy, which increases memory usage and creates a notable bottleneck when unpacking frames. Closer inspection of `unpack_frames` shows this dominates the time of that function and takes up roughly half of the time in `deserialize_bytes`. Also, as `deserialize_bytes` typically works with a `bytes` object, these frames end up being `bytes` objects, which we wind up needing to copy later to produce mutable frames (see PR #3967 and related context). In other words, performing a copy in `unpack_frames` is wasted effort.

To fix this issue, we coerce the input of `unpack_frames` to a `memoryview`. This means slicing later merely produces views onto the data, which is essentially free. This avoids the copy and alleviates this bottleneck. This also just works with most Python calls (like `struct.unpack_from`), as they accept `bytes`-like objects and so work on `memoryview`s. The details can be seen in the benchmark below using `deserialize_bytes`, part of the unspilling code path, which calls into `unpack_frames`. This speeds up the unspilling code path by ~50%.

Before:
After:
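
Separately from the benchmark, the idea can be illustrated with a small standalone sketch. It is not the code in `distributed/protocol/utils.py`: the helper names `pack_frames_sketch` and `unpack_frames_sketch` and the header layout (a frame count, then one length per frame, then the payloads) are assumptions made only for illustration. What it shows is that coercing the input to a `memoryview` makes each slice a view onto the original buffer, and that `struct.unpack_from` accepts the `memoryview` directly.

```python
import struct


def pack_frames_sketch(frames):
    """Hypothetical layout: frame count, each frame's length, then the payloads."""
    lengths = [len(f) for f in frames]
    header = struct.pack("<Q", len(frames)) + struct.pack(f"<{len(frames)}Q", *lengths)
    return header + b"".join(frames)


def unpack_frames_sketch(b):
    """Slice the frames out of ``b`` as views rather than copies."""
    # Coerce to a memoryview so the slices below share memory with ``b``.
    b = memoryview(b)

    size = struct.calcsize("<Q")
    # struct.unpack_from works on any bytes-like object, including memoryview.
    (n_frames,) = struct.unpack_from("<Q", b)
    lengths = struct.unpack_from(f"<{n_frames}Q", b, size)

    frames = []
    start = size * (1 + n_frames)  # payloads begin right after the header
    for length in lengths:
        end = start + length
        frames.append(b[start:end])  # a view; slicing ``bytes`` here would copy
        start = end
    return frames


payload = pack_frames_sketch([b"abc", b"defgh"])
frames = unpack_frames_sketch(payload)

# Each frame is a memoryview backed by the original ``payload`` object.
print([bytes(f) for f in frames])
print(all(f.obj is payload for f in frames))
```

Because the views keep the original buffer alive, a caller that needs a mutable frame can still make exactly one copy later (for example with `bytearray(frame)`) instead of copying twice.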