Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FFI module cleanup #1289

Open
2 of 3 tasks
davidhewitt opened this issue Nov 20, 2020 · 12 comments
Open
2 of 3 tasks

FFI module cleanup #1289

davidhewitt opened this issue Nov 20, 2020 · 12 comments

Comments

@davidhewitt
Copy link
Member

davidhewitt commented Nov 20, 2020

The modules in src/ffi have become a bit out-of-date, and also quite disorganised so it's kinda hard to figure out what's missing.

Since Python 3.9 the cpython include files have been split so that (roughly) the limited api is in cpython/Include/ and the unlimited api is in cpython/Include/cpython/. I propose for simplicity in maintaining the pyo3::ffi module we should reorganise code to match that structure.

I suggest that we do the following things:

  • reorganise ffi/mod.rs to be alphabetical and have // skipped foo.h comments for each header present in cpython master which is not present in our bindings at the moment.
  • add ffi/cpython/mod.rs to contain the unlimited api headers, and do the same as with ffi/mod.rs.
  • for each file in ffi/, reorganise the definitions in it appropriately into ffi and ffi/cpython parts which match the cpython master definitions. This should include:
    • Put the existing definitions in the same order as upstream cpython.
    • Either:
      • Add // skipped foo for each definition foo we don't currently have in our definitions.
      • Add new definition for each we don't currently have in our definitions.

The idea with each // skipped foo being in alphabetical order is that it's then hopefully easier for us to compare against upstream in the future.

With the new cpython folder for the unlimited api, it might be reduce a lot of #[cfg(Py_LIMITED_API)] chatter as we'll only need that once on the mod cpython definition.

Also, it could be interesting to not have pub use self::cpython::* in ffi/mod.rs, so that users are forced to use unlimited api symbols as ffi::cpython::foo. But that's probably quite breaking, so I can go either way on it.

Perhaps in general the above guidelines of how to organise our ffi modules should go in CONTRIBUTING.md ?

@nw0
Copy link
Contributor

nw0 commented Dec 23, 2020

I've started implementing this, and ran into a couple of questions:

  1. Do we want to mark the Py_DEPRECATED functions in any way? I'm not sure whether this is useful or desired in PyO3, but this would be a convenient time to do it.
  2. There are a good number of ifndef Py_LIMITED_API blocks in the cpython/Include/*.h headers. I'm inclined to leave them in place for now: is this desired? The best thing to do might be to file PRs against cpython to move them where possible, I guess.

@davidhewitt
Copy link
Member Author

@nw0 thanks so much for being willing to help us out! In response to your questions:

  1. With Py_DEPRECATED functions I think:

    • any we currently have ffi bindings for, we should add #[deprecated()] also.
    • any that we are missing, let's skip adding.
  2. With #ifndef Py_LIMITED_API inside the cpython/Include/*.h headers - that's quite suprising as I thought that those headers are only #include-d if it's not Py_LIMITED_API. I guess we'll have to match anything that cpython does with those definitions. If you want to file upstream PRs against cpython, the whole ecosystem will be extremely grateful for you helping us to keep things tidy!

P.S. I can imagine that this whole ffi cleanup will be quite a big PR. For ease of reviewing, and maybe also so that I and others can join in with the cleanup effort, I would suggest you open PRs with just a few files in it at a time; that way we can merge things as we go.

Again, thank you so much! ❤️

@nw0
Copy link
Contributor

nw0 commented Dec 27, 2020

A number of the comments read:

needs adjustment for Python 3.3 and 3.5

mod unicodeobject; // TODO supports PEP-384 only; needs adjustment for Python 3.3 and 3.5

As we don't support Python 3.5 any longer (let alone 3.3), shall I remove those? Asking as I'm not sure what it means.

@kngwyu
Copy link
Member

kngwyu commented Dec 27, 2020

I was thinking of removing those comments and appreciate your help. 👍🏼

@jeet-parekh
Copy link

I've been exploring the code under src/ffi, and also the CPython header files. I have a question.

I noticed that sometimes the #defines from the C header files are implemented in src/ffi, and sometimes they aren't. And sometimes if they are implemented, then they're not being used anywhere.
The typedefs are mostly implemented. If not, they are commented as skipped.

My question is, what's the thought process behind what should and shouldn't be added to the FFI? I'm mostly confused by the #defines. The typedefs are pretty clear to me.
I am a complete beginner to rust, so apologies if I may have missed something obvious.

@davidhewitt
Copy link
Member Author

Great question. In general we need to include all the #defines from the C header files in src/ffi. As they're C macros they won't be exported in the libpython symbol table; instead we would to re-implement them.

The code in src/ffi is for the whole ecosystem, not just PyO3 - so even if the PyO3 crate doesn't use them, downstream crates might. (Once day I hope to have good test coverage of all of src/ffi, but I think that's a long way off.)

@jeet-parekh
Copy link

Thanks! I guess implementing the structs would be a good way to get started. And if they use any macros or other constant #defines, then I include that as well. The other macros that are not being used might be hard to test. So I'll skip those. Any thoughts?

@davidhewitt
Copy link
Member Author

Sounds great! If you're looking for examples to refer back to, this series of PRs by @nw0 which has been some prior work in this area are excellent: https://github.com/PyO3/pyo3/pulls?q=is%3Apr+author%3Anw0+is%3Aclosed

@jeet-parekh
Copy link

Still exploring the FFI code slowly. I have a few more questions. Would appreciate some pointers.

  1. Some functions are implemented as unsafe.

    pyo3/src/ffi/datetime.rs

    Lines 235 to 237 in b365969

    pub unsafe fn PyDateTime_TIME_GET_SECOND(o: *mut PyObject) -> c_int {
    _PyDateTime_GET_SECOND!((o as *mut PyDateTime_Time), 0)
    }

    And some are not unsafe.

    pyo3/src/ffi/datetime.rs

    Lines 317 to 324 in b365969

    extern "C" {
    // skipped _PyDateTime_HAS_TZINFO (not in PyPy)
    #[link_name = "PyPyDateTime_GET_YEAR"]
    pub fn PyDateTime_GET_YEAR(o: *mut PyObject) -> c_int;
    #[link_name = "PyPyDateTime_GET_MONTH"]
    pub fn PyDateTime_GET_MONTH(o: *mut PyObject) -> c_int;
    #[link_name = "PyPyDateTime_GET_DAY"]
    pub fn PyDateTime_GET_DAY(o: *mut PyObject) -> c_int;

    What's the thought process behind that? I have two guesses - (1) C macros are implemented as unsafe in rust, or (2) the functions which use pyobject are unsafe. Or is it something that I have completely missed?

  2. Why are these functions inside Option? Is it simply because they're function pointers?

    pyo3/src/ffi/pymem.rs

    Lines 39 to 47 in b365969

    pub struct PyMemAllocatorEx {
    pub ctx: *mut c_void,
    pub malloc: Option<extern "C" fn(ctx: *mut c_void, size: size_t) -> *mut c_void>,
    pub calloc:
    Option<extern "C" fn(ctx: *mut c_void, nelem: size_t, elsize: size_t) -> *mut c_void>,
    pub realloc:
    Option<extern "C" fn(ctx: *mut c_void, ptr: *mut c_void, new_size: size_t) -> *mut c_void>,
    pub free: Option<extern "C" fn(ctx: *mut c_void, ptr: *mut c_void)>,
    }

@davidhewitt
Copy link
Member Author

Some functions are implemented as unsafe. [...] And some are not unsafe.

So in general all of these ffi methods are unsafe to use, for two main reasons:

  • They deal in raw pointers
  • They assume that the GIL is acquired

The functions you note in your second example are actually still unsafe, because they're inside an extern "C" block. The Rust compiler assumes all functions implemented in C are unsafe 😄

Why are these functions inside Option? Is it simply because they're function pointers?

Precisely - function pointers in C are nullable, however Rust's fn () is not. Note that this is quite important to get right - I've written bugs in the past where I forgot to wrap in Option. This allowed the optimizer to assume that the value could never be null, which led to some nasty UB!

@deantvv
Copy link
Contributor

deantvv commented Oct 17, 2021

Here is something I wish I knew when I first started here and hope sharing here will help new contributor start more easily.

  1. For PyPy, you can check the link name with nm -D ${PYPY_INSTALL_BASE}/lib/libpypy3-c.so | grep 'Pyxxxxx'
    And add the link name corresponding to what you find in nm. It is usually something like PyPyxxxxx
    Example:
nm -D $PYPY_INSTALL_BASE/lib/libpypy3-c.so | grep PyDict_New
# 00000000018ef450 T PyPyDict_New
extern "C" {
    #[cfg_attr(PyPy, link_name = "PyPyDict_New")]
    pub fn PyDict_New() -> *mut PyObject;
}
  1. For macro in C, we usually rewrite this as a #[inline] function
  2. Only add #[cfg_attr(windows, link(name = "pythonXY"))] to pub static mut Pyxxxx_Type declaration

@nw0
Copy link
Contributor

nw0 commented Oct 18, 2021

I don't have time to make a PR right now, but wanted to note that last week some of the header files have moved in CPython again.

There are a few changes, but this one in particular makes some definitions "internal-only": python/cpython#28957

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants