Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add format_descriptor<> & npy_format_descriptor<> PyObject * specializations. #4674

Merged
merged 24 commits into from
May 23, 2023

Conversation

rwgk
Copy link
Collaborator

@rwgk rwgk commented May 17, 2023

Description

Fixes a curious omission: the O format and dtype.object have been part of numpy ~forever (and therefore also of the buffer protocol, for all practical purposes).

The npy_format_descriptor<PyObject *> specialization enables py::array_t<PyObject *> to/from-python conversions (see added tests).

As a side-effect of adding the npy_format_descriptor<PyObject *> specialization, the runtime behavior of the type_casters for Eigen types changes (see below). To guard against accidents, static_assert()s are systematically added to all those type_casters.

Also add buffer_info::item_type_is_equivalent_to<T>() to make the existing detail::compare_buffer_info<T>::compare() more visible & accessible.

Note that the (small) change in pybind11/numpy.h fixes a long-standing (and inconsequential) bug: pybind11_fail("Unsupported buffer format!") called while PyErr_Occurred() is true.

BTW: This PR turned into a much bigger project than anticipated, but the functional production code changes are actually really small. To see that, focus only on:

  • include/pybind11/detail/common.h
  • include/pybind11/numpy.h
  • include/pybind11/buffer_info.h

Everything else is just tests, safety guards, and boilerplate noise.


Example of the runtime behavior of the type_casters for Eigen types before this PR:

    def test_pass_mat_pyobject_ptr():
        mat = np.array([[0, 1, 2], [3, 4, 5]], dtype=object)
>       m.pass_mat_pyobject_ptr(mat)
E       RuntimeError: NumPy type info missing for P7_object

mat        = array([[0, 1, 2],
       [3, 4, 5]], dtype=object)

test_eigen_matrix.py:812: RuntimeError

After this PR, the code used to produce that output does not compile anymore (static_assert() failure). Without the guards, there would be a buffer overrun (likely to trigger a segfault).

Suggested changelog entry:

1. ``format_descriptor<>`` & ``npy_format_descriptor<>`` ``PyObject *`` specializations were added. The latter enables ``py::array_t<PyObject *>`` to/from-python conversions.
2. ``buffer_info`` gained an ``item_type_is_equivalent_to<T>()`` member function.

Ralf W. Grosse-Kunstleve added 2 commits May 17, 2023 15:36
@Skylion007
Copy link
Collaborator

Skylion007 commented May 17, 2023

@rwgk There are a bunch of old PRs / issues that touch on this. #1152 and #647

@rwgk
Copy link
Collaborator Author

rwgk commented May 17, 2023

@rwgk There are a bunch of old PRs / issues that touch on this. #1152 and #647

I kind of stumbled into this unintentionally, TBH. I don't want to spend too much extra time.

I'm still working on what I really need.

Do you have concrete suggestions for changes here? (The production code change is actually very small and even fixes a(n inconsequential) bug: pybind11_fail() called while PyErr_Occurred() is true.)

@Skylion007
Copy link
Collaborator

@EricCousineau-TRI has given a lot of thought to this. #2259 Can you take a look?

/// Return dtype for the given typenum (one of the NPY_TYPES).
/// https://numpy.org/devdocs/reference/c-api/array.html#c.PyArray_DescrFromType
static dtype from_typenum(int typenum) {
auto *ptr = detail::npy_api::get().PyArray_DescrFromType_(typenum);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought dtype already have an explicit int ctor for this. Probably should have made it a static method in retrospect, but it's already there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out, changed. The explicit ctor is fine IMO, I just didn't look around at all.

@rwgk
Copy link
Collaborator Author

rwgk commented May 17, 2023

@EricCousineau-TRI has given a lot of thought to this. #2259 Can you take a look?

#1152 is mainly going into the eigen code, it looks like a lot of work. What I have is much more limited and really very simple. I don't want to expand the scope here.

Ralf W. Grosse-Kunstleve added 2 commits May 17, 2023 16:41
Trivial addition, but still in search for a meaningful test.
@rwgk
Copy link
Collaborator Author

rwgk commented May 17, 2023

I just pushed 50eaa3a, which is what I really need.

I thought getting into the numpy code will be a good way to add a meaningful test, but instead I got far off track.

Does someone have an idea for a meaningful minimal test for the format_descriptor<PyObject *> addition?

My original use case, which I forced through by making a small mess that I'm trying to clean up now:

https://github.com/pybind/pybind11_abseil/blob/a8fed7557747fa33e5a04844738baf5f6a1c8d1b/pybind11_abseil/absl_casters.h#L442

@rwgk
Copy link
Collaborator Author

rwgk commented May 18, 2023

Not knowing any better (not being familiar with the buffer interface and numpy code), I decided to simply add a unit test that is 1. exactly focused on what I need, but 2. also comprehensive in that domain: 82ce80f

@rwgk rwgk changed the title Add npy_format_descriptor<PyObject *> to enable py::array_t<PyObject *> to/from-python conversions. Add format_descriptor<> & npy_format_descriptor<> PyObject * support. May 18, 2023
@rwgk rwgk changed the title Add format_descriptor<> & npy_format_descriptor<> PyObject * support. Add format_descriptor<> & npy_format_descriptor<> PyObject * specializations. May 18, 2023
@rwgk rwgk marked this pull request as ready for review May 18, 2023 17:45
@rwgk rwgk requested a review from EthanSteinberg May 18, 2023 17:46
copybara-service bot pushed a commit to pybind/pybind11_abseil that referenced this pull request May 18, 2023
…ters.h

While working on pybind/pybind11#4674 I came to realize that this include is not needed here.

Note however that the pybind11/cast.h changes under pybind/pybind11#4601 are still needed, therefore pybind11_abseil still requires current pybind11 master.

PiperOrigin-RevId: 533182724
Copy link
Collaborator

@EthanSteinberg EthanSteinberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this! A couple of comments for minor possible improvements.

@@ -287,6 +287,8 @@ handle eigen_encapsulate(Type *src) {
template <typename Type>
struct type_caster<Type, enable_if_t<is_eigen_dense_plain<Type>::value>> {
using Scalar = typename Type::Scalar;
static_assert(!std::is_pointer<Scalar>::value,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might be able to move these asserts to EigenProps, which would reduce the amount of redundant code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was ambivalent before and still am a little bit, but decided to keep the 5 static_assert():

Cons: 2-3 x 2 more lines of code.

Pros:

The static_assert()s appear at or near the top of each of the type_casters, which makes them essentially a kind of documentation.

In case a static_assert() fails, it will be much more obvious that the message we want to send is: the type_caster does not support pointer types as scalar types.

// Common message for `static_assert()`s, which are useful to easily preempt much less obvious
// errors in code that does not support `format_descriptor<PyObject *>`.
#define PYBIND11_MESSAGE_POINTER_TYPES_ARE_NOT_SUPPORTED \
"Pointer types (in particular `PyObject *`) are not supported."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make this message Eigen specific

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

I introduced pybind11/eigen/common.h to have central location for the message. (Previously I shied away from doing that, that's why the message wasn't specific. But that was really just taking a shortcut. What we have now this is definitely better.)

@@ -164,6 +164,8 @@ PYBIND11_WARNING_POP

template <typename Type>
struct type_caster<Type, typename eigen_tensor_helper<Type>::ValidType> {
static_assert(!std::is_pointer<typename Type::Scalar>::value,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to not add the assert to eigen_tensor_helper instead to avoid duplicate lines?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

#include <pybind11/stl.h>

#include "constructor_stats.h"
#include "pybind11_tests.h"

TEST_SUBMODULE(buffers, m) {

#define PYBIND11_LOCAL_DEF(...) \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's worth changing, but I tend to not like macros that return values.

I would have written this as:

std::map<std::string, std::string> values;

#define ASSIGN_HELPER(...)
    values[#__VA_ARGS__] = return py::format_descriptor<__VA_ARGS__>::format();

ASSIGN_HELPER(bool);
ASSIGN_HELPER(PyObject*);

return values[cpp_name];

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

],
)
def test_format_descriptor_format(cpp_name, expected_codes):
assert m.format_descriptor_format(cpp_name) in expected_codes
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add an assert that the format descriptor is a valid numpy format descriptor

https://numpy.org/doc/stable/reference/generated/numpy.format_parser.html#numpy-format-parser

assert np. format_parser(blah).dtype is not None

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done-ish. See my comments in the test code.

What a lucky man I was to never have seen #1908 before.

I see Windows isn't happy about my latest version of the test. Fixing...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add an assert that the format descriptor is a valid numpy format descriptor

@lalaland Thanks a lot for nudging me in that direction!

Using np. format_parser() didn't work out, but via d432ce7 I got on the right track, and after that taught me what I needed to look for, I quickly found it here:

template <typename T, typename SFINAE = void>
struct compare_buffer_info {
static bool compare(const buffer_info &b) {
return b.format == format_descriptor<T>::format() && b.itemsize == (ssize_t) sizeof(T);
}
};
template <typename T>
struct compare_buffer_info<T, detail::enable_if_t<std::is_integral<T>::value>> {
static bool compare(const buffer_info &b) {
return (size_t) b.itemsize == sizeof(T)
&& (b.format == format_descriptor<T>::value
|| ((sizeof(T) == sizeof(long))
&& b.format == (std::is_unsigned<T>::value ? "L" : "l"))
|| ((sizeof(T) == sizeof(size_t))
&& b.format == (std::is_unsigned<T>::value ? "N" : "n")));
}
};

From there it was only a small step to add a public (not in namespace detail) interface for the existing functionality (029b157), and then your validation idea basically fell into place.

The new public interface is also exactly what I need for a clean solution here:

https://github.com/pybind/pybind11_abseil/blob/a8fed7557747fa33e5a04844738baf5f6a1c8d1b/pybind11_abseil/absl_casters.h#L442

PyObject **data = arr_from_list.mutable_data();
for (py::size_t i = 0; i < arr_size; i++) {
assert(data[i] == nullptr);
data[i] = py::cast<PyObject *>(objs[i].attr("value"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a silly question, but does this appropriately increase the reference count?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a silly question, this was something I was struggling with quite a bit:

// Note that `cast<PyObject *>(obj)` increments the reference count of `obj`.
// This is necessary for the case that `obj` is a temporary, and could
// not possibly be different, given
// 1. the established convention that the passed `handle` is borrowed, and
// 2. we don't want to force all generic code using `cast<T>()` to special-case
// handling of `T` = `PyObject *` (to increment the reference count there).
// It is the responsibility of the caller to ensure that the reference count
// is decremented.
template <typename T,
typename Handle,
detail::enable_if_t<detail::is_same_ignoring_cvref<T, PyObject *>::value
&& detail::is_same_ignoring_cvref<Handle, handle>::value,
int>
= 0>
T cast(Handle &&handle) {
return handle.inc_ref().ptr();
}

Ralf W. Grosse-Kunstleve added 2 commits May 18, 2023 14:07
…, as suggested by @lalaland. Move to new pybind11/eigen/common.h header.
…aland. Additional tests meant to ensure consistency between py::format_descriptor<>, np.array, np.format_parser turn out to be useful only to highlight long-standing inconsistencies.
@rwgk rwgk requested a review from henryiii as a code owner May 18, 2023 22:23
@rwgk
Copy link
Collaborator Author

rwgk commented May 19, 2023

@lalaland @Skylion007 this is done done now I think, could you please take another look?

I already ran this with all clang sanitizers in the Google toolchain and also initiated Google-global testing.

Copy link
Collaborator

@EthanSteinberg EthanSteinberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor comment about the compare function, which I think is poorly named, but otherwise, looks good!

I like how we now have a full unit test to verify that the C++ types and numpy types are aligned accordingly

@@ -150,6 +153,11 @@ struct buffer_info {
Py_buffer *view() const { return m_view; }
Py_buffer *&view() { return m_view; }

template <typename T>
static bool compare(const buffer_info &b) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be named has_same_format or something like that? compare sounds like a general == check, when this is only checking the format of the buffer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I totally agree, but what is a good name?

Here is my line of though that got me to the new name and source code comment:

"format" has so so many meanings already, to the uninitiated it likely means nothing. Additionally, the Py_buffer format strings may actually not be the same, because of the i/q/l ambiguity I was struggling with here.

Trying to start from the top down, what would a good source code comment be?

// Determines if the buffer item type is `T`.

Asking myself: Is that 100% accurate? What is "is" when it comes to aliases, such as long double being effectively an alias for double on some platforms? Or long effectively being an alias for int on some platforms, and long long an alias for long on others? (I think I actually stumbled over all of these ambiguities at various points in time far back.)

// Determines if the buffer item type is equivalent to `T`.

Asking myself for completeness: Do we have to worry about endianness? Searching for "endia", "most", "least", "signi" under https://docs.python.org/3/c-api/buffer.html doesn't pull up anything, which I take it means that we don't have to worry. It appears to be assumed that endianness conventions are never mixed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the precise semantics here are the same as https://en.cppreference.com/w/cpp/types/is_same

// Determines if the butter item type is the same as T (see std::is_same)

Asking myself for completeness: Do we have to worry about endianness? Searching for "endia", "most", "least", "signi" under https://docs.python.org/3/c-api/buffer.html doesn't pull up anything, which I take it means that we don't have to worry. It appears to be assumed that endianness conventions are never mixed.

Good point. Yeah, your code is actually technically incorrect as you aren't handling endianness. But it will only provide false negatives. The format strings do specify endianness. This is specified in https://docs.python.org/3/library/struct.html#byte-order-size-and-alignment

Note that if there is no endian specifier, the default is "native", so your code won't return any false positives.

Your current code will return false negatives. "@i" and "=i" for example are equivalent to "int", but your code will return false. "<i" will be equivalent to int on little-endian platforms. ">i" and "!i" on big-endian platforms. Etc, etc.

Not sure if this is worth handling.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why is this a static method and not a member function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the precise semantics here are the same as https://en.cppreference.com/w/cpp/types/is_same

Not sure: last night I saw that std::is_same<double, long double>::value with MSVC is false, which is why I had to change my code to sizeof(long double) == sizeof(double) (b09e75b), because that's what we really want to know.

Not sure if this is worth handling.

I'm pretty sure no: https://docs.python.org/3/c-api/buffer.html isn't explicit about it (unfortunately), but it's strongly implied that endianness is assumed not to be mixed. Therefore I think going into that in the context of buffer_info will not help anything, on the contrary.

The situation is very different for serializing/deserializing arrays, then it has to be assumed endianness could be different between the producer and consumer, which could run on totally different platforms.

But the buffer interface is all about in-memory access. Hopefully we won't have one process (one address space) that mixes endianness. (It sounds like an idea from hell.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why is this a static method and not a member function?

Then you can take a simple function pointer, vs a member function pointer. I think that's easier to work with in generic code.

Other than that, it doesn't really matter?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than that, it doesn't really matter?

It's rather uncommon to see a static member function that could be a member function since member functions have a better interface for general use. I actually can't think of a single example in the standard library where a static member function is used like this.

It's more of a style thing though, not major.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll try to change it to a member function asap. (I figure it's now or never, i.e. important to get right now.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: ef34d29

I agree the buffer_info API looks better that way. The only trouble was/is, the syntax for dereferencing member function pointers badly rubs me the wrong way. (buffer.request().*((*equiv_table)[cpp_name]))(), what a beauty!

@@ -595,3 +595,74 @@ def test_round_trip_float():
arr = np.zeros((), np.float64)
arr[()] = 37.2
assert m.round_trip_float(arr) == 37.2


# HINT: An easy and robust way (although only manual unfortunately) to check for
Copy link
Collaborator

@EthanSteinberg EthanSteinberg May 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be possible to do this automatically using weakrefs: https://docs.python.org/3/library/weakref.html#weakref.ref

I'm not sure it's worth changing for this PR.

Copy link
Collaborator Author

@rwgk rwgk May 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be possible to do this automatically using weakrefs:

Yes, but: That only works when making strong assumptions about the reference-counting and garbage collection behavior. In practice, when targeting C Python only, that's usually fine. Fundamentally though such tests are creating tech debt that might get in the way of core interpreter developments in the future. Therefore I generally write tests involving weakref and sys.getrefcount() only as a last resort.

What would be great to have: a pytest feature, e.g. @pytest.mark.empirical_leak_check(), with options like --empirical_leak_check=quick and --empirical_leak_check=thorough, that run each marked test in a loop for a few seconds and monitor RES programmatically, with heuristics to decide how long to try (to deal with noise) and when to flag a potential leak or not. (It will probably be a bit of black magic to get to a useful stable implementation.)

…e_is_equivalent_to`. Add comment defining "equivalent" by example.
@rwgk
Copy link
Collaborator Author

rwgk commented May 20, 2023

Just for the record: global testing passed as expected (internal test id OCL:533474823:BASE:533577337:1684538450117:2fa7449).

I still need to work on the static-to-member-function change, but retesting globally will not be needed.

@rwgk rwgk merged commit 8e1f9d5 into pybind:master May 23, 2023
@rwgk rwgk deleted the numpy_pyobject_ptr_master branch May 23, 2023 17:49
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label May 23, 2023
copybara-service bot pushed a commit to pybind/pybind11_abseil that referenced this pull request May 24, 2023
The new member function was added with pybind/pybind11#4674

PiperOrigin-RevId: 534952040
@rwgk rwgk removed the needs changelog Possibly needs a changelog entry label Jul 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants