Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize sequence conversion for list and tuple #2944

Merged
merged 2 commits into from
Feb 16, 2023

Conversation

davidhewitt
Copy link
Member

closes #2943

Avoid using PyObject_IsInstance for checking if lists or tuples are sequences, as we know they are always sequences.

@davidhewitt
Copy link
Member Author

@ritchie46 would you be willing to run your benchmark suite built against this branch to verify this resolves the issue?

If so, we can prepare 0.18.2 patch release with this in a week or two.

@ritchie46
Copy link

@ritchie46 would you be willing to run your benchmark suite built against this branch to verify this resolves the issue?

If so, we can prepare 0.18.2 patch release with this in a week or two.

Yeap, seems much better; 👍

  • 0.18: best of 5: 359 nsec per loop
  • this_branch: best of 5: 193 nsec per loop

@davidhewitt
Copy link
Member Author

Great - so I agree with you @adamreichold that it won't solve the general case (and all the discussion in #2943 about not needing PySequence at all for this particular case in polars).

I still think this optimization seems worth it for a common case?

Copy link
Member

@adamreichold adamreichold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this optimization seems worth it for a common case?

Agreed. (I have not yet messaged bors to give the others a change to weigh in.)

@mejrs
Copy link
Member

mejrs commented Feb 11, 2023

Do we have a benchmark for this? We should catch similar regressions in the future.

@adamreichold
Copy link
Member

Do we have a benchmark for this? We should catch similar regressions in the future.

I agree that a benchmark would be helpful but I don't think this is an unexpected regression. I think everybody involved in the review expected this to decrease performance with the memoization of the collection.abc.Sequence type object being a partial mitigation due to that. From my perspective, the original report is rather about very specific use cases which are unable or unwilling to tolerate even the intentional change in performance.

@davidhewitt
Copy link
Member Author

👍 I'll push a benchmark within a day or two.

@davidhewitt
Copy link
Member Author

Benchmark pushed. On my machine these drop from the 100-200ns range to about 2ns 😄

@ritchie46
Copy link

Benchmark pushed. On my machine these drop from the 100-200ns range to about 2ns smile

Wow... That that's 1-3 cpu cycles 🙈

@davidhewitt
Copy link
Member Author

Yes, checking for lists and tuples (and subclasses) can be done by checking for a bit set on the type object.

I should probably push a similar PR for dicts & mapping.

@davidhewitt davidhewitt force-pushed the optimize-sequence-conversion branch from b64200e to 4a41917 Compare February 15, 2023 08:28
bors bot added a commit that referenced this pull request Feb 15, 2023
2954: optimize mapping conversion for dict r=davidhewitt a=davidhewitt

Equivalent of #2944 for dicts -> mapping.

The benchmark diff is not as significant as in #2944. Still something like 80ns -> 2ns on my machine.

Co-authored-by: David Hewitt <1939362+davidhewitt@users.noreply.github.com>
@davidhewitt
Copy link
Member Author

Think we've given others a few days, I'd like to prep the next patch release soon.

bors r=adamreichold

@bors
Copy link
Contributor

bors bot commented Feb 16, 2023

Build succeeded:

@bors bors bot merged commit c858ced into PyO3:main Feb 16, 2023
@jakob-keller
Copy link

Sounds fantastic, can't wait to give it a spin! Thanks to everyone involved!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2.5/3x Performance regression with PySequence as PyTryFrom since 0.16.
5 participants