Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reintroduce vectorcall optimization with new PyCallArgs trait #4768

Merged
merged 1 commit into from
Dec 24, 2024

Conversation

Icxolu
Copy link
Contributor

@Icxolu Icxolu commented Dec 5, 2024

This reintroduces the vectorcall optimization we removed temporarily in #4653 using a new PyCallArgs trait as was discussed during the initial implementation in #4456.

This slightly reduces the number of types that can be passed to PyAnyMethods::call and friends. With this only Rust tuples (including unit), Bound<PyTuple> and Py<PyTuple> are allowed (instead of any type that can be converted into a PyTuple via IntoPyObject as before). Inside pyo3 there is no such type, but there could be user types (for example #[derive(IntoPyObject)] tuple structs) that won't work anymore. The work around is to simply convert at the call site, so I don't think that's a huge blocker.

There are still possibilities left open with regards to kwargs. One idea could be a PyCallArgs type with a builder like API to set both, positional and keyword args which can than be transformed as needed depending on the calling convention (maybe with something like smallvec or even heapless::Vec to avoid lots of small allocation). I might explore that as a followup.

Closes #4656

@Icxolu Icxolu requested a review from Copilot December 5, 2024 19:44
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 suggestion.

Comments skipped due to low confidence (1)

src/call.rs:1

  • The TODO comment should be replaced with meaningful documentation.
//! TODO

src/call.rs Outdated Show resolved Hide resolved
Copy link

codspeed-hq bot commented Dec 5, 2024

CodSpeed Performance Report

Merging #4768 will improve performances by 57.92%

Comparing Icxolu:pycallargs (ec67980) with main (9e63b34)

Summary

⚡ 5 improvements
✅ 79 untouched benchmarks

Benchmarks breakdown

Benchmark main Icxolu:pycallargs Change
call 756.2 µs 620.5 µs +21.87%
call_1 380.1 µs 271 µs +40.27%
call_method_1 1,008.9 µs 641.4 µs +57.29%
call_method_one_arg 965.8 µs 611.6 µs +57.92%
call_one_arg 347.7 µs 242.1 µs +43.64%

@Icxolu
Copy link
Contributor Author

Icxolu commented Dec 5, 2024

Looks like it worked! If we're roughly happy with this, I'll work on the docs.

@Icxolu
Copy link
Contributor Author

Icxolu commented Dec 5, 2024

It looks like we can even enable some of it on the limited api after 3.12: https://docs.python.org/3/c-api/call.html#c.PyObject_Vectorcall, but I don't think we have the ffi definitions yet.

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Always so satisfying to see such big perf wins. 😂

I actually have the FFI bindings in #4379 but I haven't completed splitting that PR up.

src/call.rs Show resolved Hide resolved
src/types/tuple.rs Outdated Show resolved Hide resolved

/// TODO
pub trait PyCallArgs<'py>: Sized + private::Sealed {
#[doc(hidden)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a good idea to have all the methods hidden and the type sealed; we can explain the situation with docs 👍.

Especially if in the long term we might have something like #4414 to solve keyword args (although much design experimentation still needed to make sure we're all happy with whatever we come up with).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a first pass on the docs, let we know what you think.

@Icxolu Icxolu force-pushed the pycallargs branch 2 times, most recently from 86bbe39 to 0b4a277 Compare December 10, 2024 20:24
@Icxolu Icxolu marked this pull request as ready for review December 10, 2024 20:24
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks super, thanks. Sorry for the long delays, I am beginning to pick off the long backlog now 🫣

src/call.rs Outdated Show resolved Hide resolved
@Icxolu
Copy link
Contributor Author

Icxolu commented Dec 24, 2024

No worries 😄, and happy holidays🎄

@Icxolu Icxolu enabled auto-merge December 24, 2024 12:14
@Icxolu Icxolu added this pull request to the merge queue Dec 24, 2024
@davidhewitt
Copy link
Member

Thanks, happy holidays to you too! 🎉 🎄

Merged via the queue into PyO3:main with commit 3965f5f Dec 24, 2024
45 of 46 checks passed
@Icxolu Icxolu deleted the pycallargs branch December 24, 2024 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reintroduce vectorcall specialization for Python call API.
2 participants