Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

explain: reformat the explain info of hash joins #12985

Closed
fzhedu opened this issue Oct 29, 2019 · 5 comments
Closed

explain: reformat the explain info of hash joins #12985

fzhedu opened this issue Oct 29, 2019 · 5 comments
Assignees
Labels
type/enhancement The issue or PR belongs to an enhancement. type/usability

Comments

@fzhedu
Copy link
Contributor

fzhedu commented Oct 29, 2019

Feature Request

Is your feature request related to a problem? Please describe:

Related to the outer hash join issue #6868 and the index hash join #12139, #12349 .

The current explain info of hash joins cannot directly label which side builds a hash table. Instead, it just outputs the inner side.
For example:
T (left/right outer) join S on T.a=S.a
(1) if T is not larger than S, so T builds a hash table
inner: T
(2) otherwise:
inner: S

Describe the feature you'd like:

Clearly output the internal information of (index) hash joins:

  1. output the inner side, as well as the outer side;
  2. clarify whether the inner or the outer builds a hash table.

For example:
T (left/right outer) join S on T.a=S.a
(1) if T is not larger than S, so T builds a hash table
inner (build): T, outer: S
(2) otherwise:
inner: T, outer (build): S

Describe alternatives you've considered:

if you have suggestions, pleas write it here.
Teachability, Documentation, Adoption, Migration Strategy:

@fzhedu fzhedu added type/enhancement The issue or PR belongs to an enhancement. type/suggestion type/usability labels Oct 29, 2019
@XuHuaiyu
Copy link
Contributor

An explain result example:

Only the inner side is printed in the explain result. And the it's the default build side of hashjoin.

HashLeftJoin_23	4166.67	root	left outer join, inner:TableReader_33, equal:[eq(Column#2, Column#4)]
├─TableReader_33	3333.33	root	data:TableScan_32
│ └─TableScan_32	3333.33	cop[tikv]	table:t1, range:(1,+inf], keep order:false, stats:pseudo
└─TableReader_36	9990.00	root	data:Selection_35
  └─Selection_35	9990.00	cop[tikv]	not(isnull(Column#4))
    └─TableScan_34	10000.00	cop[tikv]	table:t2, range:[-inf,+inf], keep order:false, stats:pseudo

We can refine the explain result to:

HashLeftJoin_23	4166.67	root	left outer join, inner(build):TableReader_33, outer: TableReader_36, equal:[eq(Column#2, Column#4)]
├─TableReader_33	3333.33	root	data:TableScan_32
│ └─TableScan_32	3333.33	cop[tikv]	table:t1, range:(1,+inf], keep order:false, stats:pseudo
└─TableReader_36	9990.00	root	data:Selection_35
  └─Selection_35	9990.00	cop[tikv]	not(isnull(Column#4))
    └─TableScan_34	10000.00	cop[tikv]	table:t2, range:[-inf,+inf], keep order:false, stats:pseudo

@XuHuaiyu
Copy link
Contributor

And not only the explain result of HashJoin needs to mark both the outer and inner side. But also IndexJoin and MergeJoin. It will make the readers EASY and FAST to locate the two children of a join.

@winoros
Copy link
Member

winoros commented Oct 29, 2019

Use hashed instead of build?

@SunRunAway
Copy link
Contributor

I think build is better. See https://en.wikipedia.org/wiki/Hash_join

This relation is called the build side of the join.

@LittleFall
Copy link
Contributor

After pr/14838, explain will show build and probe side.

HashLeftJoin_23	4166.67	root	left outer join, inner:TableReader_33 (REVERSED), equal:[eq(test.t1.c2, test.t2.c1)]
├─TableReader_33(Build)	3333.33	root	data:TableRangeScan_32
│ └─TableRangeScan_32	3333.33	cop[tikv]	table:t1, range:(1,+inf], keep order:false, stats:pseudo
└─TableReader_36(Probe)	9990.00	root	data:Selection_35
  └─Selection_35	9990.00	cop[tikv]	not(isnull(test.t2.c1))
    └─TableFullScan_34	10000.00	cop[tikv]	table:t2, keep order:false, stats:pseudo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement The issue or PR belongs to an enhancement. type/usability
Projects
None yet
Development

No branches or pull requests

5 participants