-
-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-32436: Implement PEP 567 #5027
Conversation
384c38b
to
e439870
Compare
Python/hamt.c
Outdated
@@ -251,6 +251,29 @@ Further Reading | |||
3. Clojure's PersistentHashMap implementation: | |||
https://github.com/clojure/clojure/blob/master/src/jvm/ | |||
clojure/lang/PersistentHashMap.java |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think writing URL is time to break 79-columns rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
Python/hamt.c
Outdated
|
||
#ifdef Py_DEBUG | ||
assert(iter->i_level >= 0); | ||
iter->i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyTuple_Pack()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I completely forgot about it.
Python/hamt.c
Outdated
|
||
#ifdef Py_DEBUG | ||
assert(iter->i_level >= 0); | ||
iter->i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subclassing Hamt is allowed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's not even exposed to Python.
Python/hamt.c
Outdated
return (uint32_t)__builtin_popcountl(i); | ||
#elif defined(__clang__) && (__clang_major__ > 3) | ||
return (uint32_t)__builtin_popcountl(i); | ||
#else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
Python/context.c
Outdated
{ | ||
/* Take hash of the name and XOR it with the default object hash. | ||
We do this to ensure that even sequentially allocated ContextVar | ||
objects have drastically different hashes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dictobject.c
makes a big deal about how Python hashes don't need to be randomly allocated. I assume the issue here is that we know these hashes will be used in HAMTs, and compared to regular dicts, HAMTs are more sensitive to equidistribution? Probably it's worth mentioning this motivation in the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
Python/hamt.c
Outdated
static inline Py_ssize_t | ||
hamt_node_bitmap_count(PyHamtNode_Bitmap *node) | ||
{ | ||
return Py_SIZE(node) >> 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using >> 1
instead of / 2
is 20th century's hack.
All compilers can do this optimization nowdays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in a few places.
23b9f67
to
62fedcc
Compare
@methane Thanks for the initial review, Inada-san. Do you have any other comments/suggestions? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GCC produce these errors. I think we have macro for it, but I don't remember it.
Python/hamt.c: In function ‘hamt_node_collision_assoc’:
Python/hamt.c:1432:1: warning: control reaches end of non-void function [-Wreturn-type]
} ^
Python/hamt.c: In function ‘_PyHamt_Without’:
Python/hamt.c:2365:1: warning: control reaches end of non-void function [-Wreturn-type]
} ^
Python/hamt.c: In function ‘hamt_node_bitmap_without’:
Python/hamt.c:1093:1: warning: control reaches end of non-void function [-Wreturn-type]
} ^
Python/hamt.c: In function ‘hamt_node_array_without’:
Python/hamt.c:1911:1: warning: control reaches end of non-void function [-Wreturn-type]
} ^
Python/hamt.c: In function ‘hamt_node_collision_without’:
Python/hamt.c:1526:1: warning: control reaches end of non-void function [-Wreturn-type]
} ^
Python/hamt.c: In function ‘_PyHamt_Find’:
Python/hamt.c:2395:1: warning: control reaches end of non-void function [-Wreturn-type]
} ^
Python/hamt.c: In function ‘hamt_baseiter_tp_iternext’:
Python/hamt.c:2548:1: warning: control reaches end of non-void function [-Wreturn-type]
} ^
Python/hamt.c: In function ‘hamt_py_get’: Python/hamt.c:2801:1: warning: control reaches end of non-void function [-Wreturn-type]
} ^
Python/hamt.c: In function ‘hamt_tp_subscript’:
Python/hamt.c:2749:1: warning: control reaches end of non-void function [-Wreturn-type]
} ^
|
||
|
||
/* This method is exposed only for CPython tests. Don not use it. */ | ||
PyAPI_FUNC(PyObject *) _PyContext_NewHamtForTests(void); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can it be moved to internal header?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because we're using it in _testcapimodule.c
to test the HAMT datatype directly in Lib/test/test_context.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(unless I'm missing something and it's possible to expose the HAMT type to Python somehow without making a public C API method)
} | ||
|
||
int res = _PyHamt_Eq( | ||
((PyContext *)v)->ctx_vars, ((PyContext *)w)->ctx_vars); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what context equality should be.
How is it used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has to be implemented because per PEP 567 Context
implements collections.abc.Mapping
, which has a __eq__
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And since Context
behaves like a Mapping in all its methods, Context.__eq__
should simply compare if another Context object has the same contents or not.
Python/hamt.c
Outdated
return NULL; | ||
} | ||
if (comp_err == 1) { /* key == key_or_null */ | ||
comp_err = PyObject_RichCompareBool(val, val_or_node, Py_EQ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use val_or_node == val
instead of RichCompareBool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>>> from _testcapi import hamt
>>> h = hamt()
>>> h2 = h.set('a', [])
>>> h3 = h2.set('a', [])
>>> h3.get('a').append(42)
>>> h2.get('a')
[42]
I tried to reproduce it with contextvars, but I faced fatal error. (Sorry, I used non debug build)
https://gist.github.com/methane/2ce5eda468a4091be60c8233080fe265
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is a real bug, thanks so much for finding it!
Funny enough, I initially got this right, and was using val_or_node == val
. At sometime I decided to review/document hamt.c and changed it to PyObject_RichCompareBool
without giving it too much thought.
Anyways, I'll fix the HAMT and it will also fix the fatal error you saw. That error is interesting on its own. What happened is that PyContextVar_Set
caches a borrowed ref to the last value set. Because you were setting another empty list object to the variable, it wasn't really stored (because of the bug in hamt.c
), so we stored a borrowed reference to an object that was about to be GCed.
|
Thank you, Inada-san. |
@gvanrossum has just approved the PEP so I guess it's time to merge this! |
Is there a pure-Python version of |
perhaps I'm missing something, but it seems the fact that only async tasks inherit a copy of their parent contexts is not mentioned in the docs: https://docs.python.org/3/library/contextvars.html It was surprising to me because threading.local did not behave this way by default, and in fact it makes the API behave differently between async and threading worlds per: import asyncio
import contextvars
import threading
var = contextvars.ContextVar('var')
var.set('spam')
def doit():
print(f"thread: {var.get()}")
async def async_sub_doit():
print(f"async sub: {var.get()}")
async def async_doit():
print(f"async: {var.get()}")
var.set('span2')
print(f"async: {var.get()}")
await asyncio.create_task(async_sub_doit())
print(f"main: {var.get()}")
try:
thread = threading.Thread(target=doit)
thread.start()
thread.join()
except Exception as e:
print(f"Thread exception: {e}")
asyncio.run(async_doit()) which results in the spawned thread not inheriting the contextvars from the parent, whereas the spawned async tasks do. I think this is an important distinction to make to developers. Personally looking at the API docs I didn't expect it to make a copy given it seems like it's supposed to be a fresh context per task. I think it's neat, but unexpected and certainly looks like magic when looking at code like: https://github.com/DataDog/dd-trace-py/blob/v0.36.1/ddtrace/contrib/asyncio/wrappers.py#L52 which sets the current task's contextvar value, then calls the underlying create task, then swaps back. It seemed to me it wouldn't work because you had to execute under the context of the new task, but thanks to some trickery it ends up working :). TLDR would be nice if documented and clarified behavior between the two modes. |
Yes, a task has a copy of contextvars by design. |
The PR implements PEP 567. It consists of a few major blocks:
New public APIs. Files of interest:
Includes/context.h
: the public C API;Includes/internal/context.h
: structs layout;Python/context.c
: implementation;Modules/_contextvarsmodule.c
andLib/contextvars.py
: the contextvars module.Changes in asyncio (including
Modules/_asynciomodule.c
).HAMT immutable dict implementation. Files of interest:
Python/hamt.c
andInclude/internal/hamt.h
.HAMT APIs are private, and are not exported to
Python.h
.I stress-tested HAMT with millions of random key/value insertions/deletions/modifications. I also used the llvm tooling to ensure that tests cover 99% of the C code. I'm pretty confident that there are no memory leaks and that the implementation is stable.
Few minor changes throughout the codebase, including
pystate.h
andpystate.c
, and modifications of build/make scripts.Tests in
test_context
andtest_asyncio
pass the refleaks test mode.https://bugs.python.org/issue32436