Try to use pickle5 when available #3425

Closed
wants to merge 4 commits into from

Conversation

jakirkham
Member

Pickle protocol 5 brings some improvements to buffer serialization, and older Python versions can opt in to them through a backport package. Try to use the backport package when it is available and fall back to the builtin pickle otherwise.

cc @quasiben @pentschev
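
A minimal sketch of the fallback described above, assuming the backport is importable as `pickle5` on Python 3.6 and 3.7. The `roundtrip` helper is hypothetical and only illustrates that either module is used through the same interface; it is not the code from this PR.

```python
import sys

# On Python 3.8+ the builtin pickle already supports protocol 5.
# On older versions, prefer the pickle5 backport if it is installed.
if sys.version_info < (3, 8):
    try:
        import pickle5 as pickle
    except ImportError:
        import pickle
else:
    import pickle


def roundtrip(obj):
    """Serialize and deserialize obj with the highest available protocol."""
    data = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return pickle.loads(data)
```

Callers only ever see the name `pickle`, so the rest of the code does not need to know which module was selected.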

pickle5 is a backport of Python's pickle protocol 5 for Python 3.6 and 3.7,
so install it for those versions. Python 3.8+ includes protocol 5 natively,
so skip installing it for more recent Python versions.
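
The version rule in this commit message can be sketched as a small helper. The function name `extra_requirements` is hypothetical, and how the PR actually installs the package (for example in CI configuration) is not shown in this thread.

```python
import sys


def extra_requirements(version_info=sys.version_info):
    """Return extra packages to install for the given Python version."""
    # pickle5 backports protocol 5 to Python 3.6 and 3.7 only;
    # Python 3.8+ ships protocol 5 in the standard library.
    if (3, 6) <= tuple(version_info[:2]) < (3, 8):
        return ["pickle5"]
    return []
```

In packaging metadata, the same condition is commonly written as an environment marker such as `pickle5; python_version < "3.8"`.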
@mrocklin
Member

When using protocol 5 I would expect us to select out the frames from the pickling process and send them along separately. Does using pickle5 provide a speed boost if we just use dumps/loads naively?

cc @pitrou @ogrisel
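
To illustrate what "selecting out the frames" could look like, here is a minimal sketch using the standard library's protocol 5 support on Python 3.8+. The `obj` layout and the `frames` list are assumptions for illustration and do not reflect how distributed structures its messages.

```python
import pickle

payload = bytearray(b"x" * 1024)
# Wrapping the data in PickleBuffer makes it eligible for out-of-band transfer.
obj = {"meta": "example", "data": pickle.PickleBuffer(payload)}

# buffer_callback receives each out-of-band buffer instead of copying it
# into the pickle stream; list.append returns None, which keeps the
# buffer out-of-band.
buffers = []
header = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)

# The small header and the raw buffers can now travel as separate frames.
# bytearray(...) copies here only to simulate data arriving on the receiver.
frames = [header] + [bytearray(b) for b in buffers]

# The receiver passes the buffers back in the same order.
restored = pickle.loads(frames[0], buffers=frames[1:])
```

With this approach the pickle stream itself stays small, because the large buffer is referenced rather than embedded.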

@mrocklin
Member

Closing this soonish if there are no further comments.

@pitrou
Member

pitrou commented Feb 16, 2020

> When using protocol 5 I would expect us to select out the frames from the pickling process and send them along separately.

That would be the best reason to use it indeed.

> Does using pickle5 provide a speed boost if we just use dumps/loads naively?

It may, depending on exactly how pickle is invoked (some copies may be avoided, though not all). It would be best if you could do some measurements and post them here :-)
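
One way such a measurement could be set up is sketched below, assuming Python 3.8+ so that the builtin pickle supports protocol 5. The `time_roundtrip` helper and the payload size are arbitrary choices for illustration, and no results are implied.

```python
import pickle
import timeit


def time_roundtrip(payload, protocol, repeats=5):
    """Return the best wall-clock time in seconds for one dumps/loads round trip."""
    def run():
        pickle.loads(pickle.dumps(payload, protocol=protocol))

    # Take the minimum over several runs to reduce noise.
    return min(timeit.repeat(run, number=1, repeat=repeats))


payload = bytearray(1024 * 1024)  # 1 MiB of zeros, an arbitrary test size
timings = {protocol: time_roundtrip(payload, protocol) for protocol in (4, 5)}
for protocol, seconds in timings.items():
    print(f"protocol {protocol}: {seconds * 1e3:.3f} ms")
```

This only times naive `dumps`/`loads` calls; measuring the out-of-band path would require passing `buffer_callback` and `buffers` as well.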

@pitrou
Member

pitrou commented Feb 16, 2020

PS: are you still Python 2-compatible?

@mrocklin
Member

No, we dropped Python 2 a little while ago.

@jakirkham
Member Author

> Does using pickle5 provide a speed boost if we just use dumps/loads naively?

This is what I was finding.

> When using protocol 5 I would expect us to select out the frames from the pickling process and send them along separately.

That would certainly be better. Though it would be nice if we didn't have to maintain non-protocol 5 support as well.

@jakirkham
Member Author

Replacing with PR ( #3784 ), which actually handles out-of-band buffers. Admittedly it doesn't use pickle5 on older Python versions. However, we would need cloudpickle to support pickle5 for that to work correctly ( cloudpipe/cloudpickle#179 ) anyway.

@jakirkham jakirkham closed this May 7, 2020
@jakirkham jakirkham deleted the use_pickle5 branch May 7, 2020 07:01
@jakirkham
Member Author

Also, PR ( #3849 ) has extended this to Python versions before 3.8 by using pickle5.
