Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-32436: Implement PEP 567 #5027

Merged
merged 24 commits into from
Jan 23, 2018
Merged

bpo-32436: Implement PEP 567 #5027

merged 24 commits into from
Jan 23, 2018

Conversation

1st1
Copy link
Member

@1st1 1st1 commented Dec 28, 2017

The PR implements PEP 567. It consists of a few major blocks:

  1. New public APIs. Files of interest:

    • Includes/context.h: the public C API;
    • Includes/internal/context.h: structs layout;
    • Python/context.c: implementation;
    • Modules/_contextvarsmodule.c and Lib/contextvars.py: the contextvars module.
  2. Changes in asyncio (including Modules/_asynciomodule.c).

  3. HAMT immutable dict implementation. Files of interest: Python/hamt.c and Include/internal/hamt.h.

    • HAMT APIs are private, and are not exported to Python.h.

    • I stress-tested HAMT with millions of random key/value insertions/deletions/modifications. I also used the llvm tooling to ensure that tests cover 99% of the C code. I'm pretty confident that there are no memory leaks and that the implementation is stable.

  4. Few minor changes throughout the codebase, including pystate.h and pystate.c, and modifications of build/make scripts.

Tests in test_context and test_asyncio pass the refleaks test mode.

https://bugs.python.org/issue32436

Python/hamt.c Outdated
@@ -251,6 +251,29 @@ Further Reading
3. Clojure's PersistentHashMap implementation:
https://github.com/clojure/clojure/blob/master/src/jvm/
clojure/lang/PersistentHashMap.java
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think writing URL is time to break 79-columns rule.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

Python/hamt.c Outdated

#ifdef Py_DEBUG
assert(iter->i_level >= 0);
iter->i
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyTuple_Pack()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I completely forgot about it.

Python/hamt.c Outdated

#ifdef Py_DEBUG
assert(iter->i_level >= 0);
iter->i
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subclassing Hamt is allowed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's not even exposed to Python.

Python/hamt.c Outdated
return (uint32_t)__builtin_popcountl(i);
#elif defined(__clang__) && (__clang_major__ > 3)
return (uint32_t)__builtin_popcountl(i);
#else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Python/context.c Outdated
{
/* Take hash of the name and XOR it with the default object hash.
We do this to ensure that even sequentially allocated ContextVar
objects have drastically different hashes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dictobject.c makes a big deal about how Python hashes don't need to be randomly allocated. I assume the issue here is that we know these hashes will be used in HAMTs, and compared to regular dicts, HAMTs are more sensitive to equidistribution? Probably it's worth mentioning this motivation in the comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Python/hamt.c Outdated
static inline Py_ssize_t
hamt_node_bitmap_count(PyHamtNode_Bitmap *node)
{
return Py_SIZE(node) >> 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using >> 1 instead of / 2 is 20th century's hack.
All compilers can do this optimization nowdays.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a few places.

@1st1 1st1 force-pushed the pep567 branch 4 times, most recently from 23b9f67 to 62fedcc Compare January 16, 2018 17:44
@1st1
Copy link
Member Author

1st1 commented Jan 17, 2018

@methane Thanks for the initial review, Inada-san. Do you have any other comments/suggestions?

Copy link
Member

@methane methane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GCC produce these errors. I think we have macro for it, but I don't remember it.

Python/hamt.c: In function ‘hamt_node_collision_assoc’:
Python/hamt.c:1432:1: warning: control reaches end of non-void function [-Wreturn-type]
 }                                                                                                                                                                                             ^
Python/hamt.c: In function ‘_PyHamt_Without’:
Python/hamt.c:2365:1: warning: control reaches end of non-void function [-Wreturn-type]
 }                                                                                                                                                                                             ^
Python/hamt.c: In function ‘hamt_node_bitmap_without’:
Python/hamt.c:1093:1: warning: control reaches end of non-void function [-Wreturn-type]
 }                                                                                                                                                                                             ^
Python/hamt.c: In function ‘hamt_node_array_without’:
Python/hamt.c:1911:1: warning: control reaches end of non-void function [-Wreturn-type]
 }                                                                                                                                                                                             ^
Python/hamt.c: In function ‘hamt_node_collision_without’:
Python/hamt.c:1526:1: warning: control reaches end of non-void function [-Wreturn-type]
 }                                                                                                                                                                                             ^
Python/hamt.c: In function ‘_PyHamt_Find’:
Python/hamt.c:2395:1: warning: control reaches end of non-void function [-Wreturn-type]
 }                                                                                                                                                                                             ^
Python/hamt.c: In function ‘hamt_baseiter_tp_iternext’:
Python/hamt.c:2548:1: warning: control reaches end of non-void function [-Wreturn-type]
 }                                                                                                                                                                                             ^
Python/hamt.c: In function ‘hamt_py_get’:                                                                                                                                                     Python/hamt.c:2801:1: warning: control reaches end of non-void function [-Wreturn-type]
 }                                                                                                                                                                                             ^
Python/hamt.c: In function ‘hamt_tp_subscript’:
Python/hamt.c:2749:1: warning: control reaches end of non-void function [-Wreturn-type]
 }                                                                                                                                                                                             ^



/* This method is exposed only for CPython tests. Don not use it. */
PyAPI_FUNC(PyObject *) _PyContext_NewHamtForTests(void);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be moved to internal header?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because we're using it in _testcapimodule.c to test the HAMT datatype directly in Lib/test/test_context.py.

Copy link
Member Author

@1st1 1st1 Jan 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(unless I'm missing something and it's possible to expose the HAMT type to Python somehow without making a public C API method)

}

int res = _PyHamt_Eq(
((PyContext *)v)->ctx_vars, ((PyContext *)w)->ctx_vars);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what context equality should be.
How is it used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has to be implemented because per PEP 567 Context implements collections.abc.Mapping, which has a __eq__ method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And since Context behaves like a Mapping in all its methods, Context.__eq__ should simply compare if another Context object has the same contents or not.

Python/hamt.c Outdated
return NULL;
}
if (comp_err == 1) { /* key == key_or_null */
comp_err = PyObject_RichCompareBool(val, val_or_node, Py_EQ);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use val_or_node == val instead of RichCompareBool.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>>> from _testcapi import hamt
>>> h = hamt()
>>> h2 = h.set('a', [])
>>> h3 = h2.set('a', [])
>>> h3.get('a').append(42)
>>> h2.get('a')
[42]

I tried to reproduce it with contextvars, but I faced fatal error. (Sorry, I used non debug build)
https://gist.github.com/methane/2ce5eda468a4091be60c8233080fe265

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is a real bug, thanks so much for finding it!

Funny enough, I initially got this right, and was using val_or_node == val. At sometime I decided to review/document hamt.c and changed it to PyObject_RichCompareBool without giving it too much thought.

Anyways, I'll fix the HAMT and it will also fix the fatal error you saw. That error is interesting on its own. What happened is that PyContextVar_Set caches a borrowed ref to the last value set. Because you were setting another empty list object to the variable, it wasn't really stored (because of the bug in hamt.c), so we stored a borrowed reference to an object that was about to be GCed.

@methane
Copy link
Member

methane commented Jan 18, 2018

Py_UNREACHABLE() is macro for unreachable flow.

@1st1
Copy link
Member Author

1st1 commented Jan 22, 2018

Thank you, Inada-san.

@1st1
Copy link
Member Author

1st1 commented Jan 23, 2018

@gvanrossum has just approved the PEP so I guess it's time to merge this!

@1st1 1st1 merged commit f23746a into python:master Jan 23, 2018
@1st1 1st1 deleted the pep567 branch January 23, 2018 00:11
@1st1 1st1 restored the pep567 branch January 23, 2018 00:11
@smurfix
Copy link

smurfix commented Feb 15, 2018

Is there a pure-Python version of contextvars?

@vltr
Copy link

vltr commented Apr 17, 2018

@smurfix see here

@thehesiod
Copy link
Contributor

perhaps I'm missing something, but it seems the fact that only async tasks inherit a copy of their parent contexts is not mentioned in the docs: https://docs.python.org/3/library/contextvars.html It was surprising to me because threading.local did not behave this way by default, and in fact it makes the API behave differently between async and threading worlds per:

import asyncio
import contextvars
import threading

var = contextvars.ContextVar('var')
var.set('spam')


def doit():
    print(f"thread: {var.get()}")


async def async_sub_doit():
    print(f"async sub: {var.get()}")


async def async_doit():
    print(f"async: {var.get()}")
    var.set('span2')
    print(f"async: {var.get()}")
    await asyncio.create_task(async_sub_doit())

print(f"main: {var.get()}")

try:
    thread = threading.Thread(target=doit)
    thread.start()
    thread.join()
except Exception as e:
    print(f"Thread exception: {e}")

asyncio.run(async_doit())

which results in the spawned thread not inheriting the contextvars from the parent, whereas the spawned async tasks do. I think this is an important distinction to make to developers. Personally looking at the API docs I didn't expect it to make a copy given it seems like it's supposed to be a fresh context per task. I think it's neat, but unexpected and certainly looks like magic when looking at code like: https://github.com/DataDog/dd-trace-py/blob/v0.36.1/ddtrace/contrib/asyncio/wrappers.py#L52 which sets the current task's contextvar value, then calls the underlying create task, then swaps back. It seemed to me it wouldn't work because you had to execute under the context of the new task, but thanks to some trickery it ends up working :). TLDR would be nice if documented and clarified behavior between the two modes.

@asvetlov
Copy link
Contributor

Yes, a task has a copy of contextvars by design.
A thread doesn't copy vars but has a fresh context for the sake of backward compatibility; otherwise, the behavior of libraries like decimal will be broken in a backward-incompatible way.
I can imagine an argument to threading.Thread to optionally pass a copy of the current thread's context vars into a spawned thread, maybe it is a nice improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants