perf: Make `dt.truncate` 1.5x faster when `every` is just a single duration (and not an expression) #16666

MarcoGorelli · 2024-06-02T17:44:55Z

broadcast_try_binary_elementwise simplifies things here, but does seem to introduce a perf hit for the base case (when every isn't an expression): #15768 (comment)

this makes a noticeable difference for the single-every case https://www.kaggle.com/code/marcogorelli/polars-timing?scriptVersionId=181122731:

here: 0.05906398600003134
main: 0.09365042766664071

codecov · 2024-06-02T18:07:57Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.49%. Comparing base (ef64730) to head (61f5d18).
Report is 9 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #16666      +/-   ##
==========================================
- Coverage   81.51%   81.49%   -0.02%     
==========================================
  Files        1414     1415       +1     
  Lines      186392   186609     +217     
  Branches     3014     3014              
==========================================
+ Hits       151942   152083     +141     
- Misses      33921    33996      +75     
- Partials      529      530       +1



MarcoGorelli · 2024-06-02T19:57:14Z

crates/polars-time/src/truncate.rs

-        // TODO: optimize the code below, so it does the following:
-        //       - convert to naive
-        //       - truncate all naively
-        //       - localize, preserving the fold of the original datetime.
-        //       The last step is the non-trivial one. But it should be worth it,
-        //       and faster than the current approach of truncating everything
-        //       as tz-aware.


i've made an issue for this #16617

ritchie46 · 2024-06-03T06:27:39Z

What does the separate implementation do that we cannot do in the broadcast_ function? Could we improve that helper function?

MarcoGorelli · 2024-06-03T08:15:31Z

In the broadcast one, the function:

- takes a single lhs_opt and a single rhs_opt
- performs some logic with them

If rhs_opt is the same each time, then some of that logic may get duplicated. Here, for example, this part would get repeated each time:

                let every =
                    *duration_cache.get_or_insert_with(every, |every| Duration::parse(every));
                if every.negative {
                    polars_bail!(ComputeError: "cannot truncate a Datetime to a negative duration")
                }

The cache avoids the cost of Duration::parse on each element, but every element still pays for a cache lookup of every and a check of whether every is negative. Doing this for each element adds up.

If every.len() == 1, then every just needs parsing as a duration once, and then it's just a matter of running w.truncate_*s on each element

I think that the broadcast_ probably can be improved, but that ultimately just keeping it simple will always be fastest?
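To make the difference concrete, here is a minimal sketch using hypothetical stand-ins, not the actual polars code: `Every`, `parse_every`, `truncate`, `truncate_elementwise` and `truncate_single_every` are all invented for illustration, timestamps are plain `i64` values, and the duration is assumed to be a non-zero integer number of units. The elementwise version does a cache lookup and a sign check per element; the single-`every` version does both once, before the loop.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parsed duration (not polars' `Duration`).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Every {
    units: i64,
    negative: bool,
}

/// Hypothetical parser for strings such as "5" or "-5".
/// Assumes the magnitude is a non-zero integer.
fn parse_every(s: &str) -> Every {
    let negative = s.starts_with('-');
    let units = s
        .trim_start_matches('-')
        .parse::<i64>()
        .expect("expected an integer duration");
    Every { units, negative }
}

/// Floor `t` to the nearest lower multiple of `every.units`.
fn truncate(t: i64, every: Every) -> i64 {
    t - t.rem_euclid(every.units)
}

/// General path: one `every` per element.
/// The cache avoids re-parsing, but the lookup and the sign check
/// still run for every element.
fn truncate_elementwise(ts: &[i64], every: &[&str]) -> Result<Vec<i64>, String> {
    let mut cache: HashMap<&str, Every> = HashMap::new();
    ts.iter()
        .zip(every)
        .map(|(&t, &e)| {
            let every = *cache.entry(e).or_insert_with(|| parse_every(e));
            if every.negative {
                return Err("cannot truncate to a negative duration".to_string());
            }
            Ok(truncate(t, every))
        })
        .collect()
}

/// Fast path: a single `every` is parsed and validated once,
/// so the loop body is only the truncation itself.
fn truncate_single_every(ts: &[i64], every: &str) -> Result<Vec<i64>, String> {
    let every = parse_every(every);
    if every.negative {
        return Err("cannot truncate to a negative duration".to_string());
    }
    Ok(ts.iter().map(|&t| truncate(t, every)).collect())
}
```

Both functions return the same values when every element uses the same duration; the second one simply moves the per-call work out of the per-element loop.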

ritchie46 · 2024-06-03T10:58:01Z

> but that ultimately just keeping it simple will always be fastest?

Yes.. Thanks for the explanation. 👍

github-actions bot added the performance, python, and rust labels Jun 2, 2024

MarcoGorelli changed the title ~~perf: improve truncate performance when every is just a single duration (and not an expression)~~ perf: make truncate 1.5x faster when every is just a single duration (and not an expression) Jun 2, 2024

perf: improve truncate performance when every is just a single duration (and not an expression)

33b7347


MarcoGorelli force-pushed the truncate-single-every-perf branch from 5425011 to 33b7347 Compare June 2, 2024 19:58

MarcoGorelli marked this pull request as ready for review June 3, 2024 08:18

MarcoGorelli requested review from ritchie46, stinodego, c-peters, alexander-beedie, reswqa and orlp as code owners June 3, 2024 08:18

extra test coverage for good measure

61f5d18

ritchie46 merged commit 57a5046 into pola-rs:main Jun 3, 2024
26 checks passed

MarcoGorelli mentioned this pull request Jun 4, 2024

Alternative method 10x faster than dt.offset_by() #16722

Closed

2 tasks

stinodego changed the title ~~perf: make truncate 1.5x faster when every is just a single duration (and not an expression)~~ perf: Make truncate 1.5x faster when every is just a single duration (and not an expression) Jun 7, 2024

stinodego changed the title ~~perf: Make truncate 1.5x faster when every is just a single duration (and not an expression)~~ perf: Make dt.truncate 1.5x faster when every is just a single duration (and not an expression) Jun 7, 2024

cmdlineluser mentioned this pull request Jun 7, 2024

dt.truncate is slow #13157

Closed

2 tasks

MarcoGorelli mentioned this pull request Jul 11, 2024

perf: Add fastpath for when rounding by single constant durations #17580

Merged
