Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create benchmark for groupby #5772

Merged
merged 4 commits into from
Sep 20, 2021
Merged

Create benchmark for groupby #5772

merged 4 commits into from
Sep 20, 2021

Conversation

Illviljan
Copy link
Contributor

@Illviljan Illviljan commented Sep 7, 2021

@pep8speaks
Copy link

pep8speaks commented Sep 7, 2021

Hello @Illviljan! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-09-14 02:19:50 UTC

@github-actions
Copy link
Contributor

github-actions bot commented Sep 7, 2021

Unit Test Results

         6 files  ±0           6 suites  ±0   56m 50s ⏱️ ±0s
16 227 tests ±0  14 492 ✔️ ±0  1 735 💤 ±0  0 ❌ ±0 
90 558 runs  ±0  82 383 ✔️ ±0  8 175 💤 ±0  0 ❌ ±0 

Results for commit 7a65d59. ± Comparison against base commit 7a65d59.

♻️ This comment has been updated with latest results.

@Illviljan
Copy link
Contributor Author

Illviljan commented Sep 7, 2021

Someone who understands asv is free to try it out. Every time I managed to get asv running my entire pc froze and I had to force restart it.

@max-sixty
Copy link
Collaborator

This looks great, thanks @Illviljan

We may want to make the array much bigger, given the overhead.

And (could be a TODO), parameterize the cardinality.

Sorry to hear re asv. Does it install the environment successfully? That can take a while. But maybe not worth exploring more if it brings down the PC... 😬

@Illviljan
Copy link
Contributor Author

I agree we should have more array sizes. I was thinking grouping small, medium, large, 1d, 2d, Nd arrays.
But that's for a future PR, I lost interest when I couldn't see the results.

It froze during the tests, for example I explicitly tested repr.py because I thought it would be an easy one. PC still froze.

self.ds = self.ds.chunk({"dim_0": 50})


class GroupByDataFrame(GroupBy):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we really need the dataframe tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good to have some kind of "competitor" comparison, because performance complaints will always be relative to them anyway. It might not be needed for more than just a few of the (future) tests though.

@dcherian
Copy link
Contributor

Thanks @Illviljan

@dcherian dcherian merged commit 7a65d59 into pydata:main Sep 20, 2021
@Illviljan Illviljan deleted the groupby_asv branch August 12, 2022 09:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants