perf: Make dt.truncate 1.5x faster when every is just a single duration (and not an expression) #16666

Merged
merged 2 commits into from
Jun 3, 2024

MarcoGorelli
Collaborator

@MarcoGorelli MarcoGorelli commented Jun 2, 2024

broadcast_try_binary_elementwise simplifies things here, but does seem to introduce a perf hit for the base case (when every isn't an expression): #15768 (comment)

this makes a noticeable difference for the single-every case https://www.kaggle.com/code/marcogorelli/polars-timing?scriptVersionId=181122731:

  • here: 0.05906398600003134
  • main: 0.09365042766664071

@github-actions github-actions bot added performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars labels Jun 2, 2024

codecov bot commented Jun 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.49%. Comparing base (ef64730) to head (61f5d18).
Report is 9 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16666      +/-   ##
==========================================
- Coverage   81.51%   81.49%   -0.02%     
==========================================
  Files        1414     1415       +1     
  Lines      186392   186609     +217     
  Branches     3014     3014              
==========================================
+ Hits       151942   152083     +141     
- Misses      33921    33996      +75     
- Partials      529      530       +1     


@MarcoGorelli MarcoGorelli changed the title perf: improve truncate performance when every is just a single duration (and not an expression) perf: make truncate 1.5x faster when every is just a single duration (and not an expression) Jun 2, 2024
Comment on lines -65 to -71
// TODO: optimize the code below, so it does the following:
// - convert to naive
// - truncate all naively
// - localize, preserving the fold of the original datetime.
// The last step is the non-trivial one. But it should be worth it,
// and faster than the current approach of truncating everything
// as tz-aware.
Collaborator Author

i've made an issue for this #16617

@MarcoGorelli MarcoGorelli force-pushed the truncate-single-every-perf branch from 5425011 to 33b7347 Compare June 2, 2024 19:58
@ritchie46
Member

What does the separate implementation do that we cannot do in the broadcast_ function? Could we improve that helper function?

@MarcoGorelli
Collaborator Author

In the broadcast one, the function:

  • takes a single lhs_opt and a single rhs_opt
  • performs some logic with them

If rhs_opt is the same each time, then some of that logic may get duplicated. Here, for example, this part would get repeated each time:

    let every =
        *duration_cache.get_or_insert_with(every, |every| Duration::parse(every));
    if every.negative {
        polars_bail!(ComputeError: "cannot truncate a Datetime to a negative duration")
    }

The cache avoids the cost of Duration::parse on each element, but each element still pays for looking up every in the cache and for checking whether every is negative. Doing this each time adds up.

If every.len() == 1, then every just needs parsing as a duration once, and then it's just a matter of running w.truncate_*s on each element.
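
To make the difference concrete, here is a minimal standalone sketch of the two shapes. It is not the Polars code: ToyDuration, parse_toy_duration, truncate_one, truncate_elementwise and truncate_single_every are all hypothetical names made up for illustration, timestamps are plain i64 milliseconds, and the toy parser only understands strings like "500ms".

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parsed duration (NOT the Polars `Duration` type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct ToyDuration {
    millis: i64,
    negative: bool,
}

/// Toy parser (assumption): accepts only strings such as "500ms" or "-500ms".
fn parse_toy_duration(s: &str) -> ToyDuration {
    let (negative, rest) = match s.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, s),
    };
    let millis = rest
        .trim_end_matches("ms")
        .parse::<i64>()
        .expect("expected a string like \"500ms\"");
    ToyDuration { millis, negative }
}

/// Floor `t` to a multiple of `every.millis`.
/// Assumes `every.millis > 0`; `rem_euclid` keeps negative timestamps correct.
fn truncate_one(t: i64, every: ToyDuration) -> i64 {
    t - t.rem_euclid(every.millis)
}

/// General shape: one `every` string per element.
/// The cache avoids re-parsing, but every element still pays for a
/// hash lookup and a negativity check.
fn truncate_elementwise(ts: &[i64], every: &[&str]) -> Result<Vec<i64>, String> {
    let mut cache: HashMap<&str, ToyDuration> = HashMap::new();
    ts.iter()
        .zip(every.iter())
        .map(|(&t, &e)| {
            let d = *cache.entry(e).or_insert_with(|| parse_toy_duration(e));
            if d.negative {
                return Err("cannot truncate to a negative duration".to_string());
            }
            Ok(truncate_one(t, d))
        })
        .collect()
}

/// Single-`every` shape: parse and validate once, then run a tight loop
/// that only does the truncation itself.
fn truncate_single_every(ts: &[i64], every: &str) -> Result<Vec<i64>, String> {
    let d = parse_toy_duration(every);
    if d.negative {
        return Err("cannot truncate to a negative duration".to_string());
    }
    Ok(ts.iter().map(|&t| truncate_one(t, d)).collect())
}
```

Both functions return the same values when every is constant; the second one just moves the parse and the negativity check out of the per-element loop, which is the idea behind the fast path here.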

I think that the broadcast_ probably can be improved, but that ultimately just keeping it simple will always be fastest?

@MarcoGorelli MarcoGorelli marked this pull request as ready for review June 3, 2024 08:18
@ritchie46
Member

but that ultimately just keeping it simple will always be fastest?

Yes.. Thanks for the explanation. 👍

@ritchie46 ritchie46 merged commit 57a5046 into pola-rs:main Jun 3, 2024
26 checks passed
@stinodego stinodego changed the title perf: make truncate 1.5x faster when every is just a single duration (and not an expression) perf: Make truncate 1.5x faster when every is just a single duration (and not an expression) Jun 7, 2024
@stinodego stinodego changed the title perf: Make truncate 1.5x faster when every is just a single duration (and not an expression) perf: Make dt.truncate 1.5x faster when every is just a single duration (and not an expression) Jun 7, 2024
@cmdlineluser cmdlineluser mentioned this pull request Jun 7, 2024
2 tasks