Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix plan cost runner regression script #11129

Merged
merged 6 commits into from
Mar 20, 2024

Conversation

Tmonster
Copy link
Contributor

Thijs had some comments on the old PR https://github.com/duckdb/duckdb/pull/10585/files. I've been messing around in the Join Order Optimizer and realized that I old got my equalities incorrect and the regression / improvement reporting was flipped 🙈 .

This PR also removes comparing build side and probe side for plans and reporting a regression if any of these values increase. If the total plan cost changes, these values more than likely change as well, and I want to avoid being eager about reporting regressions especially if the execution time has improved. The plan now is only to report changes if the total plan cost has increased and the build side has increased. When this is not the case a regression is reported if there is a significant (3%) difference in execution time. It's possible that build side and probe side cardinalities have changed, but execution time has improved. For example, there may be more intermediate tuples in the join tree, but these tuples might all be in the probe side. This won't add much overhead since the tuples will be in flight most of the time. If the tuples were on the build side, that would add a overhead since they will need to be included in the hash table.

I ran this new version plan_cost_runner.py between v0.9.2 and the current main and there are no regressions for tpch and imdb.

@Tmonster Tmonster requested a review from lnkuiper March 13, 2024 10:30
@github-actions github-actions bot marked this pull request as draft March 14, 2024 14:47
@Tmonster Tmonster force-pushed the fix_plan_cost_runner_regression branch from 936fa1d to 2b45c30 Compare March 14, 2024 14:56
@Tmonster Tmonster marked this pull request as ready for review March 14, 2024 14:58
Copy link
Contributor

@lnkuiper lnkuiper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@Mytherin Mytherin merged commit ac97a91 into duckdb:main Mar 20, 2024
48 checks passed
@Mytherin
Copy link
Collaborator

Thanks!

github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 21, 2024
Merge pull request duckdb/duckdb#11276 from pdet/error_handler_csv
Merge pull request duckdb/duckdb#11129 from Tmonster/fix_plan_cost_runner_regression
krlmlr added a commit to duckdb/duckdb-r that referenced this pull request Mar 23, 2024
Merge pull request duckdb/duckdb#11276 from pdet/error_handler_csv
Merge pull request duckdb/duckdb#11129 from Tmonster/fix_plan_cost_runner_regression
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 28, 2024
Merge pull request duckdb/duckdb#11276 from pdet/error_handler_csv
Merge pull request duckdb/duckdb#11129 from Tmonster/fix_plan_cost_runner_regression
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants