planner: variable tidb_opt_enable_hash_join to skip hash join #46575

Merged on Sep 26, 2023 (17 commits)

@coderplay coderplay commented Aug 31, 2023

What problem does this PR solve?

This PR adds a global/session variable that disables hash join, in order to prevent query plan regressions.
Use cases:

  • As an application developer, I am confident that my queries do not benefit from hash join, yet at times bad plans were generated in which a hash join was selected. I will configure the connection pool to initialize each connection with set tidb_opt_enable_hash_join=off.
  • As a DBA, I know the application is an OLTP workload that works properly on MySQL ~5.7, and I am sure its SQL statements do not need hash join. In the past I have seen bad plans in which a hash join was selected. After confirming with the application developers, I will disable hash join at the cluster level with set global tidb_opt_enable_hash_join=off.

Issue Number: close #46695

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Add a new global/session variable `tidb_opt_enable_hash_join` to control whether hash join is enabled or not.

@coderplay coderplay requested a review from a team as a code owner August 31, 2023 16:57
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. labels Aug 31, 2023

sre-bot commented Aug 31, 2023

CLA assistant check
All committers have signed the CLA.

ti-chi-bot bot commented Aug 31, 2023

Welcome @coderplay!

It looks like this is your first PR to pingcap/tidb 🎉.

I'm the bot that helps you request reviewers, add labels, and more. See available commands.

We want to make sure your contribution gets all the attention it needs!



Thank you, and welcome to pingcap/tidb. 😃

ti-chi-bot bot commented Aug 31, 2023

Hi @coderplay. Thanks for your PR.

I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Aug 31, 2023

tiprow bot commented Aug 31, 2023

Hi @coderplay. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo, meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

coderplay commented

/review default

ti-chi-bot bot commented Aug 31, 2023

@coderplay:

Potential Problems and Suggestions

Problem 1

In the PR description, there is a placeholder Issue Number: close #xxx. The issue number should be provided to link the relevant issue.

Suggestion:

Replace xxx with the actual issue number.

Problem 2

There are no tests to check if setting the global variable tidb_opt_enable_hash_join to no actually disables hash join at the cluster level.

Suggestion:

Add a test case to verify that setting the global variable tidb_opt_enable_hash_join to no actually disables hash join at the cluster level.

Problem 3

There is no documentation update for the newly added global/session variable tidb_opt_enable_hash_join.

Suggestion:

Update the documentation to include the new global/session variable tidb_opt_enable_hash_join, its purpose, and its usage.

Problem 4

In the PR description, the release note is set to None. However, this PR introduces a new global/session variable which might affect user behaviors.

Suggestion:

Update the release note to:

Add a new global/session variable `tidb_opt_enable_hash_join` to control whether hash join is enabled or not.

Overall, the PR looks good. Just a few minor issues and suggestions to make it even better.

In response to this:

/review default

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-tests-checked release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-linked-issue labels Sep 5, 2023
coderplay commented

/retest

ti-chi-bot bot commented Sep 6, 2023

@coderplay: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

coderplay commented

"Warning": null
},
{
"SQL": "select /*+ leading(t4, t3, t2, t1) */ * from t1, t2, t3, t4",

Why can't hash join be disabled in this case?

Contributor commented

I'll take a look at this.

qw4990 commented Sep 8, 2023

no is not a valid value for a bool variable in TiDB; only on, off, 1, and 0 are accepted. After setting it to off, it works:
image
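
The rule in the comment above can be illustrated with a minimal Go sketch. The function name parseBoolSysVar is hypothetical and this is not TiDB's actual validation code; case-insensitive matching and whitespace trimming are assumptions made for the illustration. Only the four accepted spellings (on, off, 1, 0) come from the comment.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBoolSysVar is a hypothetical helper, not TiDB's real validator.
// It accepts only on, off, 1, and 0 (case-insensitively, by assumption)
// and returns an error for anything else, such as "no".
func parseBoolSysVar(raw string) (bool, error) {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "on", "1":
		return true, nil
	case "off", "0":
		return false, nil
	default:
		return false, fmt.Errorf("invalid bool value %q: want on, off, 1, or 0", raw)
	}
}

func main() {
	for _, v := range []string{"off", "0", "ON", "no"} {
		enabled, err := parseBoolSysVar(v)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("%q -> hash join enabled: %v\n", v, enabled)
	}
}
```

In this sketch, a value of no is rejected rather than silently treated as false, which is why a test that sets the variable to no would not disable hash join.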

coderplay commented Sep 12, 2023

@qw4990 What I originally meant was that, even in your PR, hash join is not disabled when a leading hint is applied along with a no_hash_join hint.

See my comment on #45538.

qw4990 commented Sep 14, 2023

In my case, no_hash_join only works for t2 and t3. The top Hash Join is applied to t1 and the result of (t2, t3, t4) rather than to t2 itself, so no_hash_join(t2) cannot take effect on it.
image
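
The explanation above can be modeled with a toy Go sketch. This is a simplified illustration of the reasoning in the comment, not TiDB's planner code: the node type, the hintApplies function, and the exact shape of the (t2, t3, t4) subtree are assumptions. In this model a table-level hint constrains a join only when the named table is a direct input of that join.

```go
package main

import "fmt"

// node is a toy plan node: a leaf names a base table,
// an inner node joins its two children.
type node struct {
	table       string // non-empty for leaves only
	left, right *node
}

func leaf(name string) *node { return &node{table: name} }
func join(l, r *node) *node  { return &node{left: l, right: r} }

// hintApplies reports whether a hypothetical no_hash_join(table) hint
// constrains join j in this simplified model: it does so only when one
// direct input of j is the named base table itself.
func hintApplies(j *node, table string) bool {
	if j == nil || j.left == nil || j.right == nil {
		return false // leaves are not joins
	}
	return j.left.table == table || j.right.table == table
}

func main() {
	// Assumed shape: ((t4 join t3) join t2) joined with t1 at the top.
	sub := join(join(leaf("t4"), leaf("t3")), leaf("t2"))
	top := join(leaf("t1"), sub)

	fmt.Println("hint reaches the join that reads t2 directly:", hintApplies(sub, "t2"))
	fmt.Println("hint reaches the top join:", hintApplies(top, "t2"))
}
```

Under these assumptions the top join sees t1 and an intermediate result, not t2, so the hint has nothing to attach to there.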

qw4990 commented Sep 25, 2023

Also, do we need to cherry-pick this PR to v7.1? @coderplay @songrijie

codecov bot commented Sep 25, 2023

Codecov Report

Merging #46575 (a6893ef) into master (ae11336) will decrease coverage by 0.1177%.
Report is 49 commits behind head on master.
The diff coverage is 100.0000%.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #46575        +/-   ##
================================================
- Coverage   72.9961%   72.8785%   -0.1177%     
================================================
  Files          1338       1367        +29     
  Lines        399462     409380      +9918     
================================================
+ Hits         291592     298350      +6758     
- Misses        89019      92174      +3155     
- Partials      18851      18856         +5     
Flag Coverage Δ
integration 33.6852% <63.6363%> (?)
unit 73.1069% <100.0000%> (+0.1107%) ⬆️

Components Coverage Δ
dumpling 53.9913% <ø> (ø)
parser 84.9232% <ø> (-0.0388%) ⬇️
br 48.7605% <ø> (-4.3404%) ⬇️

qw4990 commented Sep 25, 2023

/retest-required

qw4990 commented Sep 25, 2023

/retest-required

@qw4990 qw4990 added needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. labels Sep 25, 2023
@@ -162,6 +162,7 @@ type logicalOptRule interface {
 func BuildLogicalPlanForTest(ctx context.Context, sctx sessionctx.Context, node ast.Node, infoSchema infoschema.InfoSchema) (Plan, types.NameSlice, error) {
 	sctx.GetSessionVars().PlanID.Store(0)
 	sctx.GetSessionVars().PlanColumnID.Store(0)
+	sctx.GetSessionVars().EnableHashJoin = true
Contributor commented

Too many CI failures due to the default value of this variable:
image

Can we rename the variable from EnableHashJoin to DisableHashJoin and set it to false by default?

  1. For most users, HashJoin should be allowed, so using false as the default value seems safer.
  2. In some tests, the test framework does not use DefEnableHashJoin to initialize sctx.EnableHashJoin; the field keeps Go's zero value for bool, which is false. With EnableHashJoin this causes plenty of test failures, and changing it to DisableHashJoin fixes them.
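
The zero-value problem described in point 2 can be reproduced with a small Go sketch. The sessionVars struct and newSessionVars constructor here are hypothetical stand-ins, not TiDB's real types; only the field names EnableHashJoin and DisableHashJoin come from the discussion.

```go
package main

import "fmt"

// sessionVars is a hypothetical stand-in for the real session variables.
type sessionVars struct {
	EnableHashJoin  bool // intended default: true
	DisableHashJoin bool // intended default: false
}

// newSessionVars mimics a production constructor that applies defaults.
func newSessionVars() *sessionVars {
	return &sessionVars{EnableHashJoin: true}
}

func main() {
	prod := newSessionVars()
	test := &sessionVars{} // a test that skips the constructor gets zero values

	// The "enable" flag disagrees between the two paths,
	// so hash join is silently turned off in such tests.
	fmt.Println("EnableHashJoin:", prod.EnableHashJoin, test.EnableHashJoin)

	// The "disable" flag agrees, because its intended default
	// is the same as Go's zero value for bool.
	fmt.Println("DisableHashJoin:", prod.DisableHashJoin, test.DisableHashJoin)
}
```

A flag whose intended default equals the zero value behaves the same whether or not the initialization code runs, which is the property the rename relies on.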

Contributor commented

After discussing with PM, we decided to keep it unchanged as EnableHashJoin.

qw4990 commented Sep 25, 2023

Too many CI failures... For safety, how about another solution: use DisableHashJoin internally and expose enable_hash_join to users?

@coderplay I've pushed some code directly to your PR using this approach to fix the CI failures (tomorrow is the code freeze deadline for v7.4 ...).
image
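
A minimal Go sketch of the approach described above, in which the user-facing value is stored as its negation. The sessionVars type and both method names are hypothetical and do not reflect the code that was actually pushed; only the idea of an internal DisableHashJoin behind a user-facing enable_hash_join comes from the comment.

```go
package main

import "fmt"

// sessionVars is hypothetical; only the field name DisableHashJoin
// is taken from the discussion.
type sessionVars struct {
	DisableHashJoin bool // zero value false means hash join is allowed
}

// setEnableHashJoin applies a user-facing enable_hash_join value
// by storing its negation in the internal flag.
func (s *sessionVars) setEnableHashJoin(enabled bool) {
	s.DisableHashJoin = !enabled
}

// enableHashJoin derives the user-facing value from the internal flag.
func (s *sessionVars) enableHashJoin() bool {
	return !s.DisableHashJoin
}

func main() {
	var s sessionVars // never initialized, as in a test that skips defaults
	fmt.Println("default enable_hash_join:", s.enableHashJoin())

	s.setEnableHashJoin(false) // e.g. after the user sets the variable to off
	fmt.Println("after off:", s.enableHashJoin(), "internal flag:", s.DisableHashJoin)
}
```

With this layout users still see a positively named variable, while uninitialized session variables in tests default to hash join being allowed.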

coderplay commented Sep 25, 2023

Thanks for fixing those, @qw4990! In the future, how can I detect such issues locally before submitting a PR?

qw4990 commented Sep 25, 2023

/retest-required

qw4990 commented Sep 25, 2023

@easonn7 PTAL

easonn7 commented Sep 26, 2023

/approve

ti-chi-bot bot commented Sep 26, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: easonn7, qw4990, time-and-fate

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the approved label Sep 26, 2023
@ti-chi-bot ti-chi-bot bot merged commit 95fa30c into pingcap:master Sep 26, 2023
13 of 16 checks passed
ti-chi-bot (Member) commented

In response to a cherrypick label: new pull request created to branch release-6.5: #47276.

ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Sep 26, 2023
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot (Member) commented

In response to a cherrypick label: new pull request created to branch release-7.1: #47277.

ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Sep 26, 2023
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot bot pushed a commit that referenced this pull request Oct 8, 2023
Labels
approved epic/hint lgtm needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. ok-to-test Indicates a PR is ready to be tested. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/planner SIG: Planner size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Proposal: introduce system variable to ignore hash join in TiDB
8 participants