Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pickling errors thrown when saving some Stdlib modules #529

Merged
merged 2 commits into from
Aug 1, 2022

Conversation

leogama
Copy link
Contributor

@leogama leogama commented Jul 22, 2022

I'll add some small pickling errors' fixes here while I test the "save unpickleable objects by reference" option.

@leogama
Copy link
Contributor Author

leogama commented Jul 22, 2022

@mmckerns: I've been working with module-saving code for months and didn't realize until now that the function _load_module() doesn't do exactly what its name suggests. For example, I was thinking of pickling the settings submodule to preserve dump_module() options and filters set between a series of python invocations using an automating tool. However:

>>> import dill                                                                                                                                                                                
>>> dill.dump_module(module="dill.settings")                                                                                                                                                   
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-84d05d12f876> in <module>
----> 1 dill.dump_module(module='dill.settings')

~/Projects/forks/dill/dill/session.py in dump_module(filename, module, refimported, refonfail, **kwds)
    297         main = _import_module(main)
    298     if not isinstance(main, ModuleType):
--> 299         raise TypeError("%r is not a module" % main)
    300     original_main = main
    301     if refimported:

TypeError: {'protocol': 4, 'byref': False, 'fmode': 0, 'recurse': False, 'ignore': False, 'dump_module': {'refimported': False, 'refonfail': False}} is not a module

In this case, the dill.settings object is actually the sys.modules['dill.settings'].settings dictionary imported in the module dill. That's why I passed "dill.settings" as the module argument instead of dill.settings. But, inside dump_module(), _import_module("dill.settings") returns the dictionary, not the module. I don't know how much it is by design though.

It seems that _import_module() works much more like Unpickler.find_class() than like importlib.import_module(), while I expected pretty much the opposite (i.e. that in my example it would try to return the submodule dill.settings and, if that failed, it would fallback to return the attribute settings of the module dill).

I'm really confused...


By the way, I'm testing the "refonfail" option of dump_module() to save objects that fail to pickle by reference and I'm happy to announce that dill is currently able to pickle 99% of the Standard Library's modules and public submodules (the ones listed in docs.python.org). Most of the fails are due to infinite recursion, which I shall investigate soon.

@leogama leogama changed the title Fix KeyError when pickling type with '__dict__' or '__weakref__' in '__slots__' Fix pickling errors thrown when saving some Stdlib modules Jul 22, 2022
@mmckerns
Copy link
Member

dill is currently able to pickle 99% of the Standard Library's modules and public submodules

That's great news.

It seems that _import_module() works much more like Unpickler.find_class() than like importlib.import_module()

Unfortunately there are a lot of undocumented non-public functions in dill. However, you can see from the code that it does import the module (or submodule), so it's an appropriate name... it's just not the dill equivalent of importlib.import_module().

@leogama
Copy link
Contributor Author

leogama commented Jul 22, 2022

It seems that _import_module() works much more like Unpickler.find_class() than like importlib.import_module()

Unfortunately there are a lot of undocumented non-public functions in dill. However, you can see from the code that it does import the module (or submodule), so it's an appropriate name... it's just not the dill equivalent of importlib.import_module().

Yes, it seems that there are places where using importlib.import_module() is more appropriate than _import_module().

But take a look at this listing:

>>> import dill
>>> dill.__version__
'0.3.5.1'
>>> import importlib
>>> dill_settings = importlib.import_module('dill.settings')
>>> dill_settings
<module 'dill.settings' from '/home/leogama/.local/lib/python3.8/site-packages/dill/settings.py'>
>>> dill.copy(dill_settings)
{'protocol': 4, 'byref': False, 'fmode': 0, 'recurse': False, 'ignore': False}
>>> import pickletools
>>> pickletools.dis(pickletools.optimize(dill.dumps(dill_settings, protocol=0)))
    0: c    GLOBAL     'dill._dill _import_module'
   27: (    MARK
   28: V        UNICODE    'dill.settings'
   43: t        TUPLE      (MARK at 27)
   44: R    REDUCE
   45: .    STOP
highest protocol among opcodes = 0

By the way, could you help me with another issue real quick? I'd like to push a commit to master just moving the session-related stuff to a new session.py submodule. I did that in both the #475 and #527 branches, but this code movement is preventing me from reviewing changes in that segment as git treats this as brand new file...

Edit: here is the PR: #530

@leogama leogama marked this pull request as ready for review July 26, 2022 04:35
@mmckerns mmckerns added this to the dill-0.3.6 milestone Jul 26, 2022
@mmckerns mmckerns self-requested a review July 26, 2022 13:48
Copy link
Member

@mmckerns mmckerns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

dill/_dill.py Show resolved Hide resolved
dill/_dill.py Show resolved Hide resolved
dill/_dill.py Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants