Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pickled Functions seem to be incompatible between python3.5 and python3.6+ even when using pickle.DEFAULT_PROTOCOL #293

Closed
Ark-kun opened this issue Jun 14, 2019 · 7 comments

Comments

@Ark-kun
Copy link

Ark-kun commented Jun 14, 2019

Even an empty function has different 3-byte code representation in python3.5 and python3.6+.
This causes either "SystemError: unknown opcode" when going up or "Segmentation fault" when going down.

I understand that the support for this is not implied, but I think the incompatibility should be stated explicitly. "cloudpickle uses pickle.HIGHEST_PROTOCOL by default: it is meant to send objects between processes running the same version of Python." is not explicit here since it only talks about "cloudpickle useing pickle.HIGHEST_PROTOCOL by default". Which implies that using the more compatible protocol allows passing objects between different python versions.

Summary of my results:

testing python3.5 -> python3.6, 3.7
XXX lineno: 10, opcode: 0
Traceback (most recent call last):
  File "<string>", line 9, in <module>
  File "<string>", line 10, in func
SystemError: unknown opcode
testing python3.6,3.7 -> python3.5
Segmentation fault

PoC demonstrating code difference:

def func():
    pass
print(func.__code__.co_code)
'
On python3.5:
b'd\x00\x00S'

On python3.6:
b'd\x00S\x00'

Pickle structure diff: https://www.diffchecker.com/aC5PuVHy

Testing code:

function test_compatibility() {
v1=$1
v2=$2

echo testing "$v1 -> $v2"

"$v1" -c '
import pickle
import cloudpickle
import sys

def func():# -> str:
    pass

pickled_func = cloudpickle.dumps(func, pickle.DEFAULT_PROTOCOL)
sys.stdout.buffer.write(pickled_func)
' | "$v2" -c '
import pickle
import sys

pickled_func = sys.stdin.buffer.read()

func = pickle.loads(pickled_func)

func()
'

}


test_compatibility python3.5 python3.5
test_compatibility python3.5 python3.6
test_compatibility python3.5 python3.7

test_compatibility python3.6 python3.5
test_compatibility python3.6 python3.6
test_compatibility python3.6 python3.7

test_compatibility python3.7 python3.5
test_compatibility python3.7 python3.6
test_compatibility python3.7 python3.7
Ark-kun added a commit to Ark-kun/pipelines that referenced this issue Jun 14, 2019
Ark-kun added a commit to Ark-kun/pipelines that referenced this issue Jun 14, 2019
I've introduced code pickling to capture dependencies in kubeflow#1372
Later I've discovered that there is a serious opcode incompatibility between python versions 3.5 and 3.6+. See my analysis of the issue: cloudpipe/cloudpickle#293

Dues to this issue I decided to switch back to using source code copying by default and to continue improving it.

Until we stop supporting python 3.5 (kubeflow#668) it's too dangerous to use code pickling by default.

Code pickling can be enabled by specifying `pickle_code=True` when calling `func_to_container_op`
@hongye-sun
Copy link

Have you tried changing the protocol to lower version like 0?

@llllllllll
Copy link
Contributor

This isn't going to be solved by changing the protocol version. To serialize arbitrary lambdas we need to be able to send the bytecode from one machine to another. Python bytecode can change pretty radically between minor versions, and instructions could be added or removed with no clear translation making it impossible to convert between versions.

I think the best we can do is strengthen the claim in the readme/docs to say, "Cloudpickle can only be used to send objects between the exact same version of Python."

ogrisel added a commit that referenced this issue Jun 14, 2019
…hon versions

As suggested in #293 (comment)

And also add a security notice. This is not specific to cloudpickle as `load` / `loads` come from the Python standard library but it's better to be explicit that loading pickle payloads from untrusted sources is a security vulnerability.
@ogrisel
Copy link
Contributor

ogrisel commented Jun 14, 2019

Done in #294. I also added a security notice. Let me know what you think.

@Ark-kun
Copy link
Author

Ark-kun commented Jun 15, 2019

Let me know what you think.

Sounds good.

@Ark-kun
Copy link
Author

Ark-kun commented Jun 15, 2019

Python bytecode can change pretty radically between minor versions, and instructions could be added or removed with no clear translation making it impossible to convert between versions.

Is there any place I can read more about those breaking changes?

Do you know some other python code format (e.g. pre-compiled code) that's more stable? Something similar to .Net's CIL which has pretty good backwards compatibility.

@ogrisel
Copy link
Contributor

ogrisel commented Jun 17, 2019

Is there any place I can read more about those breaking changes?

The generated byte code is an implementation detail of CPython. It's not part of the public API in anyway as far as I know. It's possible to grep "bytecode" in the changelog though: https://docs.python.org/3/whatsnew/changelog.html

Do you know some other python code format (e.g. pre-compiled code) that's more stable? Something similar to .Net's CIL which has pretty good backwards compatibility.

Python is not compiled so there is nothing. Saving the source code would be an optional but that would render cloudpickle much more brittle and less efficient IMO.

Ark-kun added a commit to Ark-kun/pipelines that referenced this issue Jun 18, 2019
I've introduced code pickling to capture dependencies in kubeflow#1372
Later I've discovered that there is a serious opcode incompatibility between python versions 3.5 and 3.6+. See my analysis of the issue: cloudpipe/cloudpickle#293

Dues to this issue I decided to switch back to using source code copying by default and to continue improving it.

Until we stop supporting python 3.5 (kubeflow#668) it's too dangerous to use code pickling by default.

Code pickling can be enabled by specifying `pickle_code=True` when calling `func_to_container_op`
Ark-kun added a commit to Ark-kun/pipelines that referenced this issue Jun 18, 2019
I've introduced code pickling to capture dependencies in kubeflow#1372
Later I've discovered that there is a serious opcode incompatibility between python versions 3.5 and 3.6+. See my analysis of the issue: cloudpipe/cloudpickle#293

Dues to this issue I decided to switch back to using source code copying by default and to continue improving it.

Until we stop supporting python 3.5 (kubeflow#668) it's too dangerous to use code pickling by default.

Code pickling can be enabled by specifying `pickle_code=True` when calling `func_to_container_op`
Ark-kun added a commit to Ark-kun/pipelines that referenced this issue Jun 18, 2019
I've introduced code pickling to capture dependencies in kubeflow#1372
Later I've discovered that there is a serious opcode incompatibility between python versions 3.5 and 3.6+. See my analysis of the issue: cloudpipe/cloudpickle#293

Dues to this issue I decided to switch back to using source code copying by default and to continue improving it.

Until we stop supporting python 3.5 (kubeflow#668) it's too dangerous to use code pickling by default.

Code pickling can be enabled by specifying `pickle_code=True` when calling `func_to_container_op`
Ark-kun added a commit to Ark-kun/pipelines that referenced this issue Jun 18, 2019
Ark-kun added a commit to Ark-kun/pipelines that referenced this issue Jun 18, 2019
@ogrisel ogrisel closed this as completed Jun 18, 2019
k8s-ci-robot pushed a commit to kubeflow/pipelines that referenced this issue Jun 19, 2019
I've introduced code pickling to capture dependencies in #1372
Later I've discovered that there is a serious opcode incompatibility between python versions 3.5 and 3.6+. See my analysis of the issue: cloudpipe/cloudpickle#293

Dues to this issue I decided to switch back to using source code copying by default and to continue improving it.

Until we stop supporting python 3.5 (#668) it's too dangerous to use code pickling by default.

Code pickling can be enabled by specifying `pickle_code=True` when calling `func_to_container_op`
k8s-ci-robot pushed a commit to kubeflow/pipelines that referenced this issue Jun 23, 2019
* SDK - Refactored the code in kfp.components._python_op._capture_function_code_using_cloudpickle

* SDK/Lightweight - Added python version compatibility checks

See my compatibility analysis: cloudpipe/cloudpickle#293
@AlJohri
Copy link

AlJohri commented Sep 5, 2019

is it possible to store the python version with the pickled file and check it at load time? I was hoping to implement this application side but it would necessitate a metadata file which I was hoping to avoid

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants