make iterating over sys.modules threadsafe #322

Merged: 1 commit, Jan 21, 2020

Conversation

mmohrhard
Contributor

Despite creating a copy through list(sys.modules.items()), there
is still a possible race condition if another thread adds to sys.modules
while the list is being built:

  File "x/lib/python3.7/pickle.py", line 774, in save_tuple
    save(element)
  File "x/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "x/lib/python3.7/pickle.py", line 637, in save_reduce
    save(func)
  File "x/lib/python3.7/pickle.py", line 518, in save
    self.save_global(obj)
  File "x/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 876, in save_global
    elif not _is_global(obj, name=name):
  File "x/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 174, in _is_global
    module_name = _whichmodule(obj, name)
  File "x/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 156, in _whichmodule
    for module_name, module in list(sys.modules.items()):
RuntimeError: dictionary changed size during iteration

@mmohrhard
Contributor Author

I think the CI failures are unrelated.

@ogrisel
Contributor

ogrisel commented Jan 14, 2020

Thank you very much @mmohrhard! Do you think you could write a non-regression test for this?

Also please add an entry in the CHANGES.md file.

@codecov

codecov bot commented Jan 15, 2020

Codecov Report

Merging #322 into master will decrease coverage by 34.63%.
The diff coverage is 100%.

@@             Coverage Diff             @@
##           master     #322       +/-   ##
===========================================
- Coverage   92.96%   58.33%   -34.64%     
===========================================
  Files           2        2               
  Lines         853      852        -1     
  Branches      178      175        -3     
===========================================
- Hits          793      497      -296     
- Misses         37      321      +284     
- Partials       23       34       +11
| Impacted Files | Coverage Δ |
|---|---|
| cloudpickle/cloudpickle.py | 79.52% <100%> (-12.5%) ⬇️ |
| cloudpickle/cloudpickle_fast.py | 0% <0%> (-95.6%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8cf9ec4...fe35eb3.

@mmohrhard
Contributor Author

@ogrisel I have added the CHANGES.md entry but am unable to write a test case that fails before the fix. So far we have hit the bug only 3 times in a hundred production runs, and only after updating one of our dependencies. Based on my attempts to create an MCVE, it seems quite hard to write a test that hits this condition reliably.

@ogrisel
Contributor

ogrisel commented Jan 20, 2020

@mmohrhard thanks for your feedback.

However, there is something that I do not understand: both list(l) and l.copy are builtins, and neither of them explicitly documents that the copy is thread-safe if l is being concurrently mutated in another thread.

So I am not sure that using .copy is actually the correct fix. @pitrou do you have a suggestion?

@pitrou
Member

pitrou commented Jan 20, 2020

list.copy is pretty much thread-safe AFAICT. As for dict.copy though, it's a different can of worms...

@mmohrhard
Contributor Author

mmohrhard commented Jan 20, 2020

> list.copy is pretty much thread-safe AFAICT.

The patch replaces creating a list from a dict items view (which ends up using PyDictIterItem_Type) with copying the dict and then creating the items view. From reading the source, PyDict_Copy looks completely thread-safe, whereas apparently list_extend in Objects/listobject.c is not.
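
For readers following along, the patched pattern can be sketched roughly as below. This is a simplified illustration, not cloudpickle's actual `_whichmodule`: the function name `which_module` and the plain `getattr` lookup are assumptions made for the example.

```python
import sys


def which_module(obj, name):
    """Return the name of the first loaded module exposing `obj` as `name`.

    Hypothetical, simplified stand-in for a module lookup helper.
    """
    # Snapshot sys.modules first, then iterate over the private copy
    # instead of an items view on the live dict.
    for module_name, module in sys.modules.copy().items():
        if module_name == "__main__" or module is None:
            continue
        try:
            if getattr(module, name) is obj:
                return module_name
        except Exception:
            # Some modules raise on attribute lookup; skip them.
            continue
    return None
```

Because the loop only ever sees the copy, modules imported by other threads after the snapshot is taken are simply not visited.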

@mmohrhard
Contributor Author

I should also note that I have not yet completely figured out how list_extend is able to drop the GIL in this case. We have now hit the same bug at least once in CPython's pickle.py, which contains the same piece of code. I'm still trying to debug the code to understand why the list(sys.modules.items()) code is not thread-safe.

@pitrou
Member

pitrou commented Jan 20, 2020

The GIL can be dropped at various places where pure Python code is executed, for example when running the GC (which will trigger execution of destructors etc.). Basically, C code that is guaranteed not to release the GIL is the exception rather than the rule.
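
The error from the traceback can be reproduced deterministically in a single thread by mutating a dict between two steps of an items-view iterator; another thread that gets to run mid-iteration can have the same effect. A minimal sketch, where the dict `registry` is only a stand-in for `sys.modules`:

```python
registry = {"a": 1, "b": 2, "c": 3}

view_iter = iter(registry.items())
next(view_iter)        # iteration has started on the live dict

registry["d"] = 4      # simulates another thread importing a module

try:
    next(view_iter)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

# Iterating over a copy is unaffected by later mutations of the original.
snapshot = registry.copy()
registry["e"] = 5
snapshot_keys = list(snapshot)
```

This only illustrates the failure mode of the iterator; it says nothing about how likely a thread switch is at that point in practice.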

@ogrisel
Contributor

ogrisel commented Jan 20, 2020

> list.copy is pretty much thread-safe AFAICT. As for dict.copy though, it's a different can of worms...

Sorry, I did indeed mean dict.copy.

So what is the solution to iterate over sys.modules in a thread-safe way?

for _ in range(100):
    try:
        sys_modules_copy = sys.modules.copy()
        break
    except RuntimeError:
        pass  # try again
else:
    raise RuntimeError("Failed to iterate over sys.modules")

for module_name, module in sys_modules_copy.items():
    ...

That looks really ugly to me...
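
If the retry approach were adopted anyway, it could be confined to a single helper so call sites stay readable. This is only a sketch of that idea; the name `snapshot_sys_modules` and the `max_attempts` parameter are hypothetical and not part of cloudpickle.

```python
import sys


def snapshot_sys_modules(max_attempts=100):
    """Return a shallow copy of sys.modules, retrying on RuntimeError."""
    for _ in range(max_attempts):
        try:
            return sys.modules.copy()
        except RuntimeError:
            continue  # the dict was mutated during the copy; try again
    raise RuntimeError(
        "Failed to copy sys.modules after %d attempts" % max_attempts
    )


modules_snapshot = snapshot_sys_modules()
for module_name, module in modules_snapshot.items():
    pass  # per-module work goes here
```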

@ogrisel
Contributor

ogrisel commented Jan 21, 2020

Let's merge this because empirically it seems to reduce the likelihood of hitting the race condition. Still, if someone has a better suggestion, feel free to comment or open a new PR.

@ogrisel ogrisel merged commit c3982ea into cloudpipe:master Jan 21, 2020
sthagen added a commit to sthagen/cloudpipe-cloudpickle that referenced this pull request Jan 22, 2020
make iterating over sys.modules (more) threadsafe (cloudpipe#322)
@mmohrhard
Contributor Author

Just for documentation: Serhiy Storchaka explains the underlying problem in https://bugs.python.org/issue40327 and mentions why the sys.modules.copy().items() pattern is slightly better. CPython is now also replacing the list(sys.modules.items()) pattern.

wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request Apr 4, 2022
2.0.0
=====

- Python 3.5 is no longer supported.

- Support for registering modules to be serialised by value. This allows code
  defined in local modules to be serialised and executed remotely without those
  local modules installed on the remote machine.
  ([PR #417](cloudpipe/cloudpickle#417))

- Fix a side effect altering dynamic modules at pickling time.
  ([PR #426](cloudpipe/cloudpickle#426))

- Support for pickling type annotations on Python 3.10 as per [PEP 563](
  https://www.python.org/dev/peps/pep-0563/)
  ([PR #400](cloudpipe/cloudpickle#400))

- Stricter parametrized type detection heuristics in
  _is_parametrized_type_hint to limit false positives.
  ([PR #409](cloudpipe/cloudpickle#409))

- Support pickling / depickling of OrderedDict KeysView, ValuesView, and
  ItemsView, following similar strategy for vanilla Python dictionaries.
  ([PR #423](cloudpipe/cloudpickle#423))

- Suppressed a source of non-determinism when pickling dynamically defined
  functions and handled the deprecation of co_lnotab in Python 3.10+.
  ([PR #428](cloudpipe/cloudpickle#428))

1.6.0
=====

- `cloudpickle`'s pickle.Pickler subclass (currently defined as
  `cloudpickle.cloudpickle_fast.CloudPickler`) can and should now be accessed
  as `cloudpickle.Pickler`. This is the only officially supported way of
  accessing it.
  ([issue #366](cloudpipe/cloudpickle#366))

- `cloudpickle` now supports pickling `dict_keys`, `dict_items` and
  `dict_values`.
  ([PR #384](cloudpipe/cloudpickle#384))

1.5.0
=====

- Fix a bug causing cloudpickle to crash when pickling dynamically created,
  importable modules.
  ([issue #360](cloudpipe/cloudpickle#354))

- Add optional dependency on `pickle5` to get improved performance on
  Python 3.6 and 3.7.
  ([PR #370](cloudpipe/cloudpickle#370))

- Internal refactoring to ease the use of `pickle5` in cloudpickle
  for Python 3.6 and 3.7.
  ([PR #368](cloudpipe/cloudpickle#368))

1.4.1
=====

- Fix incompatibilities between cloudpickle 1.4.0 and Python 3.5.0/1/2
  introduced by the new support of cloudpickle for pickling typing constructs.
  ([issue #360](cloudpipe/cloudpickle#360))

- Restore compat with loading dynamic classes pickled with cloudpickle
  version 1.2.1 that would reference the `types.ClassType` attribute.
  ([PR #359](cloudpipe/cloudpickle#359))

1.4.0
=====

**This version requires Python 3.5 or later**

- cloudpickle can now pickle all constructs from the ``typing`` module
  and the ``typing_extensions`` library in Python 3.5+
  ([PR #318](cloudpipe/cloudpickle#318))

- Stop pickling the annotations of a dynamic class for Python < 3.6
  (follow up on #276)
  ([issue #347](cloudpipe/cloudpickle#347))

- Fix a bug affecting the pickling of dynamic `TypeVar` instances on Python 3.7+,
  and expand the support for pickling `TypeVar` instances (dynamic or non-dynamic)
  to Python 3.5-3.6 ([PR #350](cloudpipe/cloudpickle#350))

- Add support for pickling dynamic classes subclassing `typing.Generic`
  instances on Python 3.7+
  ([PR #351](cloudpipe/cloudpickle#351))

1.3.0
=====

- Fix a bug affecting dynamic modules occurring with modified builtins
  ([issue #316](cloudpipe/cloudpickle#316))

- Fix a bug affecting cloudpickle when non-module objects are added into
  sys.modules
  ([PR #326](cloudpipe/cloudpickle#326)).

- Fix a regression in cloudpickle and python3.8 causing an error when trying to
  pickle property objects.
  ([PR #329](cloudpipe/cloudpickle#329)).

- Fix a bug when a thread imports a module while cloudpickle iterates
  over the module list
  ([PR #322](cloudpipe/cloudpickle#322)).

- Add support for out-of-band pickling (Python 3.8 and later).
  https://docs.python.org/3/library/pickle.html#example
  ([issue #308](cloudpipe/cloudpickle#308))

- Fix a side effect that would redefine `types.ClassTypes` as `type`
  when importing cloudpickle.
  ([issue #337](cloudpipe/cloudpickle#337))

- Fix a bug affecting subclasses of slotted classes.
  ([issue #311](cloudpipe/cloudpickle#311))

- Don't pickle the abc cache of dynamically defined classes for Python 3.6-
  (This was already the case for python3.7+)
  ([issue #302](cloudpipe/cloudpickle#302))

1.2.2
=====

- Revert the change introduced in
  ([issue #276](cloudpipe/cloudpickle#276))
  attempting to pickle functions annotations for Python 3.4 to 3.6. It is not
  possible to pickle complex typing constructs for those versions (see
  [issue #193]( cloudpipe/cloudpickle#193))

- Fix a bug affecting bound classmethod saving on Python 2.
  ([issue #288](cloudpipe/cloudpickle#288))

- Add support for pickling "getset" descriptors
  ([issue #290](cloudpipe/cloudpickle#290))

1.2.1
=====

- Restore (partial) support for Python 3.4 for downstream projects that have
  LTS versions that would benefit from cloudpickle bug fixes.

1.2.0
=====

- Leverage the C-accelerated Pickler new subclassing API (available in Python
  3.8) in cloudpickle. This allows cloudpickle to pickle Python objects up to
  30 times faster.
  ([issue #253](cloudpipe/cloudpickle#253))

- Support pickling of classmethod and staticmethod objects in python2.
  ([issue #262](cloudpipe/cloudpickle#262))

- Add support to pickle type annotations for Python 3.5 and 3.6 (pickling type
  annotations was already supported for Python 3.7, Python 3.4 might also work
  but is no longer officially supported by cloudpickle)
  ([issue #276](cloudpipe/cloudpickle#276))

- Internal refactoring to proactively detect dynamic functions and classes when
  pickling them.  This refactoring also yields small performance improvements
  when pickling dynamic classes (~10%)
  ([issue #273](cloudpipe/cloudpickle#273))

1.1.1
=====

- Minor release to fix a packaging issue (Markdown formatting of the long
  description rendered on pypi.org). The code itself is the same as 1.1.0.

1.1.0
=====

- Support the pickling of interactively-defined functions with positional-only
  arguments. ([issue #266](cloudpipe/cloudpickle#266))

- Track the provenance of dynamic classes and enums so as to preserve the
  usual `isinstance` relationship between pickled objects and their
  original class definitions.
  ([issue #246](cloudpipe/cloudpickle#246))

1.0.0
=====

- Fix a bug making functions with keyword-only arguments forget the default
  values of these arguments after being pickled.
  ([issue #264](cloudpipe/cloudpickle#264))

0.8.1
=====

- Fix a bug (already present before 0.5.3 and re-introduced in 0.8.0)
  affecting relative import instructions inside depickled functions
  ([issue #254](cloudpipe/cloudpickle#254))

0.8.0
=====

- Add support for pickling interactively defined dataclasses.
  ([issue #245](cloudpipe/cloudpickle#245))

- Global variables referenced by functions pickled by cloudpickle are now
  unpickled in a new and isolated namespace scoped by the CloudPickler
  instance. This restores the (previously untested) behavior of cloudpickle
  prior to changes done in 0.5.4 for functions defined in the `__main__`
  module, and 0.6.0/1 for other dynamic functions.

0.7.0
=====

- Correctly serialize dynamically defined classes that have a `__slots__`
  attribute.
  ([issue #225](cloudpipe/cloudpickle#225))

0.6.1
=====

- Fix regression in 0.6.0 which breaks the pickling of local function defined
  in a module, making it impossible to access builtins.
  ([issue #211](cloudpipe/cloudpickle#211))

0.6.0
=====

- Ensure that unpickling a function defined in a dynamic module several times
  sequentially does not reset the values of global variables.
  ([issue #187](cloudpipe/cloudpickle#205))

- Restrict the ability to pickle annotations to python3.7+ ([issue #193](
  cloudpipe/cloudpickle#193) and [issue #196](
  cloudpipe/cloudpickle#196))

- Stop using the deprecated `imp` module under Python 3.
  ([issue #207](cloudpipe/cloudpickle#207))

- Fixed pickling issue with singleton types `NoneType`, `type(...)` and
  `type(NotImplemented)` ([issue #209](cloudpipe/cloudpickle#209))

0.5.6
=====

- Ensure that unpickling a locally defined function that accesses the global
  variables of a module does not reset the values of the global variables if
  they are already initialized.
  ([issue #187](cloudpipe/cloudpickle#187))

0.5.5
=====

- Fixed inconsistent version in `cloudpickle.__version__`.

0.5.4
=====

- Fixed a pickling issue for ABC in python3.7+ ([issue #180](
  cloudpipe/cloudpickle#180)).

- Fixed a bug when pickling functions in `__main__` that access global
  variables ([issue #187](
  cloudpipe/cloudpickle#187)).

0.5.3
=====
- Fixed a crash in Python 2 when serializing non-hashable instancemethods of built-in
  types ([issue #144](cloudpipe/cloudpickle#144)).

- itertools objects can also be pickled
  ([PR #156](cloudpipe/cloudpickle#156)).

- `logging.RootLogger` can also be pickled
  ([PR #160](cloudpipe/cloudpickle#160)).

0.5.2
=====

- Fixed a regression: `AttributeError` when loading pickles that hold a
  reference to a dynamically defined class from the `__main__` module.
  ([issue #131]( cloudpipe/cloudpickle#131)).

- Make it possible to pickle classes and functions defined in faulty
  modules that raise an exception when trying to look-up their attributes
  by name.

0.5.1
=====

- Fixed `cloudpickle.__version__`.

0.5.0
=====

- Use `pickle.HIGHEST_PROTOCOL` by default.

0.4.4
=====

- `logging.RootLogger` can also be pickled
  ([PR #160](cloudpipe/cloudpickle#160)).

0.4.3
=====

- Fixed a regression: `AttributeError` when loading pickles that hold a
  reference to a dynamically defined class from the `__main__` module.
  ([issue #131]( cloudpipe/cloudpickle#131)).

- Fixed a crash in Python 2 when serializing non-hashable instancemethods of built-in
  types. ([issue #144](cloudpipe/cloudpickle#144))

0.4.2
=====

- Restored compatibility with pickles from 0.4.0.
- Handle the `func.__qualname__` attribute.

0.4.1
=====

- Fixed a crash when pickling dynamic classes whose `__dict__` attribute was
  defined as a [`property`](https://docs.python.org/3/library/functions.html#property).
  Most notably, this affected dynamic [namedtuples](https://docs.python.org/2/library/collections.html#namedtuple-factory-function-for-tuples-with-named-fields)
  in Python 2. (cloudpipe/cloudpickle#113)
- Cloudpickle now preserves the `__module__` attribute of functions (cloudpipe/cloudpickle#118).
- Fixed a crash when pickling modules that don't have a `__package__` attribute (cloudpipe/cloudpickle#116).

0.4.0
=====

* Fix functions with empty cells
* Allow pickling Logger objects
* Fix crash when pickling dynamic class cycles
* Ignore "None" modules added to sys.modules
* Support WeakSets and ABCMeta instances
* Remove non-standard `__transient__` support
* Catch exception from `pickle.whichmodule()`

0.3.1
=====

* Fix version information and ship a changelog

0.3.0
=====

* Import submodules accessed by pickled functions
* Support recursive functions inside closures
* Fix `ResourceWarnings` and `DeprecationWarnings`
* Assume modules with `__file__` attribute are not dynamic

0.2.2
=====

* Support Python 3.6
* Support Tornado Coroutines
* Support builtin methods