fix: Fix performance regression for eager join_where
#21308
@@ -181,7 +193,7 @@ More information on the new streaming engine: https://github.com/pola-rs/polars/
    }

    // Make sure it is after predicate pushdown
    if opt_flags.collapse_joins() && members.has_filter_with_join_input {
During IR conversion, a `join_where` is converted to a cross join with subsequent filters, and we rely on `collapse_joins` to convert it to more performant joins. The issue is that during the optimization step, `members: MemberCollector` was never initialized if `opt_flags` contained `EAGER`, so `collapse_joins` was skipped despite being enabled. This caused the performance regression, as we ended up materializing an entire cross join.
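To illustrate why skipping the collapse step is costly, here is a minimal sketch, not the Polars implementation. It assumes a plain equality predicate on integer keys, and the function names `cross_then_filter` and `equi_join` are hypothetical. The first function materializes every pair before filtering, while the second produces only matching pairs.

```rust
use std::collections::HashMap;

// Hypothetical illustration: cross join followed by a filter.
// Every (l, r) pair is materialized before the predicate is applied.
fn cross_then_filter(left: &[i64], right: &[i64]) -> (usize, Vec<(i64, i64)>) {
    let pairs: Vec<(i64, i64)> = left
        .iter()
        .flat_map(|&l| right.iter().map(move |&r| (l, r)))
        .collect();
    let materialized = pairs.len();
    let kept = pairs.into_iter().filter(|(l, r)| l == r).collect();
    (materialized, kept)
}

// Hypothetical illustration: the predicate is used as the join key,
// so only matching pairs are ever produced.
fn equi_join(left: &[i64], right: &[i64]) -> Vec<(i64, i64)> {
    // Count occurrences of each key on the right side.
    let mut index: HashMap<i64, usize> = HashMap::new();
    for &r in right {
        *index.entry(r).or_insert(0) += 1;
    }
    let mut out = Vec::new();
    for &l in left {
        if let Some(&n) = index.get(&l) {
            for _ in 0..n {
                out.push((l, l));
            }
        }
    }
    out
}
```

Both functions return the same matches, but the first builds `left.len() * right.len()` intermediate rows, which is the cost the regression reintroduced.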
The fix for the linked issue is just this one line. The other changes make `members` initialize as needed, rather than based on combinations of flags, to help avoid similar mistakes.
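The difference between flag-gated and as-needed initialization can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual Polars code: the `OptFlags` and `MemberCollector` types here are hypothetical stand-ins, the plan is modelled as a root-to-leaf list of node names, and `should_collapse_buggy` and `should_collapse_lazy` are invented for illustration.

```rust
// Hypothetical stand-in for the real collector; only one field is modelled.
#[derive(Default)]
struct MemberCollector {
    has_filter_with_join_input: bool,
}

impl MemberCollector {
    // Assumption: the plan is a root-to-leaf list of node names, and a
    // "filter" directly followed by a "join" is a filter with a join input.
    fn collect(plan: &[&str]) -> Self {
        MemberCollector {
            has_filter_with_join_input: plan
                .windows(2)
                .any(|w| w[0] == "filter" && w[1] == "join"),
        }
    }
}

// Hypothetical stand-in for the optimizer flags.
#[derive(Clone, Copy)]
struct OptFlags {
    eager: bool,
    collapse_joins: bool,
}

// Buggy shape: collection is gated on a flag combination, so in eager
// mode `members` keeps its default values and the check is always false.
fn should_collapse_buggy(flags: OptFlags, plan: &[&str]) -> bool {
    let mut members = MemberCollector::default();
    if !flags.eager {
        members = MemberCollector::collect(plan);
    }
    flags.collapse_joins && members.has_filter_with_join_input
}

// As-needed shape: the collector is built the first time a pass asks for
// it, independent of which other flags are set.
fn should_collapse_lazy(flags: OptFlags, plan: &[&str]) -> bool {
    let mut members: Option<MemberCollector> = None;
    flags.collapse_joins
        && members
            .get_or_insert_with(|| MemberCollector::collect(plan))
            .has_filter_with_join_input
}
```

With as-needed initialization, a pass that reads `members` cannot observe an uninitialized collector just because an unrelated flag such as `eager` is set.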
Ouch.. that's a painful one.
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #21308      +/-   ##
==========================================
+ Coverage   79.90%   79.91%   +0.01%
==========================================
  Files        1596     1596
  Lines      228580   228593     +13
  Branches     2608     2608
==========================================
+ Hits       182644   182690     +46
+ Misses      45340    45307     -33
  Partials      596      596
join_where in 1.19 #21145