Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-111545: Add Py_HashPointer() function #112096

Merged
merged 3 commits into from
Dec 6, 2023
Merged

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Nov 15, 2023

  • Keep _Py_HashPointer() function as an alias to PyHash_Pointer().
  • Add tests on PyHash_Pointer() to test_capi.test_hash.
  • Remove _Py_HashPointerRaw() function: inline code in _Py_hashtable_hash_ptr().

📚 Documentation preview 📚: https://cpython-previews--112096.org.readthedocs.build/

@vstinner
Copy link
Member Author

I checked how _Py_rotateright_uintptr() is compiled on x86-64:

  • clang -O3 with __builtin_rotateright64(): ror rax, cl
  • gcc -O3 with (x >> bits) | (x << (8 * SIZEOF_UINTPTR_T - bits)): ror rbx,cl

As expected, the ROR instruction is used in both cases.


I used gdb: in Python, I run import _testinternalcapi, and then in gdb I type disassemble rotateright_uintptr.

For that, I made the following change. The check_rotateright_uintptr() cannot be used to check the machine code: since check_rotateright_uintptr() arguments are constant, GCC and clang just pre-compute the function result at build. The funciton is no longer called at runtime.

Change:

diff --git a/Modules/_testinternalcapi.c b/Modules/_testinternalcapi.c
index 2d888e0ae2c..a08b3dfb168 100644
--- a/Modules/_testinternalcapi.c
+++ b/Modules/_testinternalcapi.c
@@ -290,6 +290,23 @@ test_rotateright_uintptr(PyObject *self, PyObject *Py_UNUSED(args))
 }
 
 
+static PyObject*
+rotateright_uintptr(PyObject *self, PyObject *args)
+{
+    PyObject *obj;
+    int bits;
+    if (!PyArg_ParseTuple(args, "O!i", &PyLong_Type, &obj, &bits)) {
+        return NULL;
+    }
+    unsigned long long x = PyLong_AsUnsignedLongLong(obj);
+    if (x == (unsigned long long)-1 && PyErr_Occurred()) {
+        return NULL;
+    }
+    uintptr_t y = _Py_rotateright_uintptr((uintptr_t)x, bits);
+    return PyLong_FromUnsignedLongLong(y);
+}
+
+
 #define TO_PTR(ch) ((void*)(uintptr_t)ch)
 #define FROM_PTR(ptr) ((uintptr_t)ptr)
 #define VALUE(key) (1 + ((int)(key) - 'a'))
@@ -1689,6 +1706,7 @@ static PyMethodDef module_functions[] = {
     {"test_popcount", test_popcount, METH_NOARGS},
     {"test_bit_length", test_bit_length, METH_NOARGS},
     {"test_rotateright_uintptr", test_rotateright_uintptr, METH_NOARGS},
+    {"rotateright_uintptr", rotateright_uintptr, METH_VARARGS},
     {"test_hashtable", test_hashtable, METH_NOARGS},
     {"get_config", test_get_config, METH_NOARGS},
     {"set_config", test_set_config, METH_O},

@vstinner vstinner marked this pull request as ready for review November 15, 2023 11:12
@vstinner vstinner requested review from a team and tiran as code owners November 15, 2023 11:12
@vstinner vstinner force-pushed the hash_pointer branch 4 times, most recently from 992c39e to f63b726 Compare November 15, 2023 11:33
@vstinner
Copy link
Member Author

vstinner commented Nov 15, 2023

I checked how _Py_rotateright_uintptr() is compiled on x86-64

Adding _Py_rotateright_uintptr() static inline function added too much code, and it's not needed: current code is already compiled to the expected most efficient x86-64 ROR instruction. I removed the _Py_rotateright_uintptr() function.

Doc/c-api/hash.rst Outdated Show resolved Hide resolved
Python/pyhash.c Outdated Show resolved Hide resolved
PC/pyconfig.h Outdated Show resolved Hide resolved
@vstinner
Copy link
Member Author

vstinner commented Dec 1, 2023

Changes:

@vstinner
Copy link
Member Author

vstinner commented Dec 1, 2023

@encukou: Please review the updated PR.

@vstinner vstinner changed the title gh-111545: Add PyHash_Pointer() function gh-111545: Add Py_HashPointer() function Dec 1, 2023
Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me, thank you for adding it!

I hope Serhiy is OK with this iteration. (And that the rest of the C-API WG won't have major objections, but Idon't think that's likely.)

Include/internal/pycore_pyhash.h Outdated Show resolved Hide resolved
@vstinner
Copy link
Member Author

vstinner commented Dec 1, 2023

@serhiy-storchaka: Do you want to review this change?

@vstinner
Copy link
Member Author

vstinner commented Dec 1, 2023

I rebased the PR on main to get a NoGIL fix.

@vstinner
Copy link
Member Author

vstinner commented Dec 1, 2023

@encukou:

(And that the rest of the C-API WG won't have major objections, but Idon't think that's likely.)

PEP 731 says:

The working group may operate primarily through reviews of GitHub issues and PRs.

Are you suggesting that the C API cannot be modified anymore unless whole C API Working Group approve a change? Or are you suggesting that the C API Working Group must review all C API additions? I don't think that was the plan. Well, the plan is not well defined.

@encukou
Copy link
Member

encukou commented Dec 4, 2023

Are you suggesting that the C API cannot be modified anymore unless whole C API Working Group approve a change?

No, I'm saying that the others might disagree -- and that I don't think it's very likely that they will disagree.
I'd optimistically merge this (edit: after checking with Serhiy), but be prepared to fix it up or revert.

Copy link
Member

@zooba zooba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No concerns from me, other than the lack of docs.

Adding new C API should include the WG up until we've developed and agreed upon the guidelines for adding new C API. Without those, yeah, we should all get pinged for any new APIs. (And if that takes up all our time and we don't get a chance to develop guidelines, so be it...)


.. c:function:: Py_hash_t Py_HashPointer(const void *ptr)

Hash a pointer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a bit more text to clarify that it ensures the pointer itself is usable as a hash value, and does not even attempt to access the memory the pointer refers to (let alone trying to call __hash__ or similar)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completed the doc. Is it more explicit? I clarified that only the pointer value is hashed.

* Implement _Py_HashPointerRaw() as a static inline function.
* Add Py_HashPointer() tests to test_capi.test_hash.
* Keep _Py_HashPointer() function as an alias to Py_HashPointer().
@vstinner
Copy link
Member Author

vstinner commented Dec 5, 2023

Adding new C API should include the WG up until we've developed and agreed upon the guidelines for adding new C API.

@encukou created capi-workgroup/decisions#1

``uintptr_t`` internally). The pointer is not dereferenced.

The function cannot fail: it cannot return ``-1``.

Copy link
Member

@iritkatriel iritkatriel Dec 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing some mention of the use case for this (not necessarily here). The issue talks about _Py_HashDouble, there is no mention of Py_HashPointer.

@vstinner
Copy link
Member Author

vstinner commented Dec 6, 2023

@iritkatriel:

I'm missing some mention of the use case for this (not necessarily here). The issue talks about _Py_HashDouble, there is no mention of Py_HashPointer.

Proposed Py_HashDouble() API requires the caller to call Py_HashPointer(obj) for NaN: see #112449

Moreover, I removed the private function _Py_HashPointer() in Python 3.13 alpha1 and it broke a few projects. A code search on PyPI top 5,000 projects found 4 projects:

  • cffi (1.16.0)
  • gmpy2 (2.1.5)
  • numba (0.58.1)
  • tree_sitter (0.20.4)

Finally, igraph-0.11.2 contains a copy of the private _Py_HashPointer() function in src/_igraph/pyhelpers.c.

@vstinner vstinner merged commit 828451d into python:main Dec 6, 2023
32 checks passed
@vstinner vstinner deleted the hash_pointer branch December 6, 2023 14:09
@vstinner
Copy link
Member Author

vstinner commented Dec 6, 2023

I merged my PR. The API was approved by the C API Working Group: capi-workgroup/decisions#1

aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
* Implement _Py_HashPointerRaw() as a static inline function.
* Add Py_HashPointer() tests to test_capi.test_hash.
* Keep _Py_HashPointer() function as an alias to Py_HashPointer().
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
* Implement _Py_HashPointerRaw() as a static inline function.
* Add Py_HashPointer() tests to test_capi.test_hash.
* Keep _Py_HashPointer() function as an alias to Py_HashPointer().
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants