Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: introduce system variable to ignore hash join in TiDB #46695

Closed
coderplay opened this issue Sep 5, 2023 · 0 comments · Fixed by #46575
Closed

Proposal: introduce system variable to ignore hash join in TiDB #46695

coderplay opened this issue Sep 5, 2023 · 0 comments · Fixed by #46575
Assignees
Labels
affects-6.5 affects-7.1 sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.

Comments

@coderplay
Copy link
Contributor

coderplay commented Sep 5, 2023

Background

  • For most OLTP queries, hash join is not selected as it normally takes longer to fetch very few rows.
  • MySQL doesn’t support hash join before MySQL 8.0.
  • In 8.0 or higher, MySQL provides an optimizer switch () to disable in session/global level.
  • There is undocumented parameter (_hash_join_enabled) in Oracle to disable hash join.

It makes sense we introduce the mechanism to control this optimizer behavior at session or cluster level.

User case

  • As an application developer, I’m very confident the queries don’t benefit from hash join. However, at times, bad plans were generated in which a hash join was selected. I will config the connection pool, and initialize the connections with “set tidb_opt_enable_hash_join=no”.
  • As a DBA, I’m aware the application is OLTP workload, which works properly in MySQL ~5.7. I’m very sure the SQLs don’t need hash join. I’ve seen cases in the past. bad plans were generated in which a hash join was selected. After confirming with application developers, I will disable hash join in cluster level with “set global tidb_opt_enable_hash_join=no”

Functional Specification

Introduce variable to disable hash join in session/global

  • Name: tidb_opt_enable_hash_join
  • Scope: session / global
  • Applies to set_var(): YES
  • Default value: YES
  • Possible values: YES /NO

Behavior

  • With tidb_opt_enable_hash_join=NO, Optimizer ignores hash join when generating execution plan.
  • Optimizer hints overrides the behavior of the variable. If tidb_opt_enable_hash_join is set to “NO”, hash join is still enforced for the queries with hint “HASH_JOIN()”, “HASH_JOIN_PROBE()” or “HASH_JOIN_BUILD()”. In this case, lower level control makes exceptions possible. (The same behavior as MySQL)
  • It’s possible queries come with hints which could lead to no join type available. For example
set tidb_opt_enable_hash_join=NO 
select /*+ no_merge_join(), no_index_join(), no_index_hash_join()/ … 

It’s not acceptable SQL fails caused by optimizer variable changes, as the scope of variables is wide. We have to avoid no available plan because of this variable.

In this case, we shall keep the hints working and ignore the behavior change caused by this variable. Query still generates a plan with hash_join, and appends messages to warnings.

variable tidb_opt_enable_hash_join=NO is ignored due to no join type available.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-6.5 affects-7.1 sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants