-
Notifications
You must be signed in to change notification settings - Fork 6.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Cloudpickle #5643
Update Cloudpickle #5643
Conversation
One of the main reasons for doing this is that the current version of cloudpickle that we vendor does not properly serialize keyword only arguments in functions. |
Test FAILed. |
jenkins, retest this please |
Yeah it is from master
…On Thu, Sep 5, 2019 at 5:14 PM Si-Yuan ***@***.***> wrote:
jenkins, retest this please
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#5643?email_source=notifications&email_token=ABCRZZL4UBS7FOJO77CNM63QIGOHXA5CNFSM4IUBE2OKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6BKIBI#issuecomment-528655365>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABCRZZIOBNJM5W7R7R3LV3LQIGOHXANCNFSM4IUBE2OA>
.
|
Jenkins retest this please |
Test FAILed. |
Test FAILed. |
Here's one easy patch to the original cloudpickle: def save_codeobject(self, obj):
"""
Save a code object
"""
if PY3: # pragma: no branch
if hasattr(obj, "co_posonlyargcount"): # pragma: no branch
args = (
obj.co_argcount, obj.co_posonlyargcount,
obj.co_kwonlyargcount, obj.co_nlocals, obj.co_stacksize,
obj.co_flags, obj.co_code, obj.co_consts, obj.co_names,
obj.co_varnames, obj.co_filename, obj.co_name,
obj.co_firstlineno, obj.co_lnotab, obj.co_freevars,
obj.co_cellvars
)
else:
args = (
obj.co_argcount, obj.co_kwonlyargcount, obj.co_nlocals,
obj.co_stacksize, obj.co_flags, obj.co_code, obj.co_consts,
obj.co_names, obj.co_varnames, obj.co_filename,
obj.co_name, obj.co_firstlineno, obj.co_lnotab,
obj.co_freevars, obj.co_cellvars
)
else:
args = (
obj.co_argcount, obj.co_nlocals, obj.co_stacksize, obj.co_flags, obj.co_code,
obj.co_consts, obj.co_names, obj.co_varnames, obj.co_filename, obj.co_name,
obj.co_firstlineno, obj.co_lnotab, obj.co_freevars, obj.co_cellvars
)
self.save_reduce(types.CodeType, args, obj=obj) |
@suquark we should update the cloudpickle entirely rather than monkey patch it. |
Test FAILed. |
Test PASSed. |
Thanks @richardliaw, I think we should merge it and try it out. This behavior of reusing existing classes instead of creating a new one seems dangerous to me (e.g., this means that mutations to the class will affect the deserialized object, right?). @suquark doesn't cloudpickle have an option to turn off this behavior? |
Test FAILed. |
Test FAILed. |
jenkins retest this please |
@richardliaw just to clarify, you are saying that cloudpickle's behavior has changed between the previous version and this one, right? |
Yeah |
Test PASSed. |
@robertnishihara I don't think cloudpickle has such options. This is essential for cloudpickle to work correctly with dynamic class between remote processes. For example, def main():
def DynamicClass:
pass
@ray.remote
def remote_function():
return DynamicClass()
# register a serializer (assume broadcast to all workers)
cloudpickle.dispatch[DynamicClass] = ....
ray.get(remote_function.remote()) In this case, if cloudpickle doesn't keep the existing dynamic class, then when it serializes the Pyarrow also did in the same ways but with |
Why are these changes needed?
cc @suquark
Related issue number
Checks
scripts/format.sh
to lint the changes in this PR.