Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: HashJoin error with limit after pushing down TopN to the coprocessor in mockTiKV #15767

Closed
fzhedu opened this issue Mar 27, 2020 · 6 comments · Fixed by #15898
Closed
Assignees
Labels
severity/moderate sig/execution SIG execution type/bug The issue is confirmed as a bug.

Comments

@fzhedu
Copy link
Contributor

fzhedu commented Mar 27, 2020

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. What did you do?

download the file with all DDLs
randgen_index_merge_join.txt

source randgen_index_merge_join.txt
 explain SELECT   /*+INL_JOIN(table1)*/ table2 . `col_int` AS field1 , table2 . `col_int_key` AS field2 FROM  B AS table1  RIGHT  JOIN Y AS table2 ON  table1 . `col_varchar_1024_utf8` =  table2 . `col_varchar_10_latin1_key`  WHERE ( table2 . `pk` <> 9 AND table2 . `pk` NOT IN (6) )   ORDER BY field1, field2 DESC LIMIT 50 OFFSET 50 ;  

2. What did you expect to see?

+--------+--------+
| field1 | field2 |
+--------+--------+
|   NULL |      7 |
|   NULL |      7 |
|   NULL |      7 |
|   NULL |      7 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
+--------+--------+
50 rows in set, 1 warning (0.02 sec)

3. What did you see instead?

+--------+--------+
| field1 | field2 |
+--------+--------+
|   NULL |      8 |
|   NULL |      7 |
|   NULL |      7 |
|   NULL |      7 |
|   NULL |      7 |
|   NULL |      7 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      6 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
|   NULL |      5 |
+--------+--------+

4. What version of TiDB are you using? (tidb-server -V or run select tidb_version(); on TiDB)

5.7.25-TiDB-v4.0.0-beta-523-g7eba696bb

@fzhedu fzhedu added the type/bug The issue is confirmed as a bug. label Mar 27, 2020
@fzhedu fzhedu added the sig/execution SIG execution label Mar 27, 2020
@XuHuaiyu
Copy link
Contributor

  1. We do not need to set tidb_max_chunk_size and tidb_init_chunk_size
  2. works fine on tikv

@XuHuaiyu
Copy link
Contributor

XuHuaiyu commented Mar 27, 2020

tidb commit: 5268094 on master

on mocktikv, HashJoin generates 288 rows:

	diff --git a/planner/core/exhaust_physical_plans.go b/planner/core/exhaust_physical_plans.go
index 6cd9b397b..4e772369a 100644
--- a/planner/core/exhaust_physical_plans.go
+++ b/planner/core/exhaust_physical_plans.go
@@ -309,7 +309,7 @@ func (p *LogicalJoin) getHashJoins(prop *property.PhysicalProperty) []PhysicalPl
                        joins = append(joins, p.getHashJoin(prop, 0, true))
                } else {
                        joins = append(joins, p.getHashJoin(prop, 0, false))
-                       joins = append(joins, p.getHashJoin(prop, 0, true))
+                       //joins = append(joins, p.getHashJoin(prop, 0, true))
                }
mysql> explain analyze SELECT   table2 . `col_int` AS field1 , table2 . `col_int_key` AS field2 FROM  B AS table1  RIGHT  JOIN Y AS table2 ON  table1 . `col_varchar_1024_utf8` =  table2 . `col_varchar_10_latin1_key`  WHERE ( table2 . `pk` <> 9 AND table2 . `pk` NOT IN (6) )   ORDER BY field1, field2 DESC LIMIT 50 OFFSET 50 ;
+-----------------------------------+----------+---------+-----------+--------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------+-----------------+---------+
| id                                | estRows  | actRows | task      | operator info                                                                                                                        | execution info                                                                         | memory          | disk    |
+-----------------------------------+----------+---------+-----------+--------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------+-----------------+---------+
| Projection_10                     | 50.00    | 50      | root      | randgen_index_merge_join.y.col_int, randgen_index_merge_join.y.col_int_key                                                           | time:16.225421ms, loops:2, rows:50, Concurrency:OFF                                    | 1.234375 KB     | N/A     |
| └─TopN_13                         | 50.00    | 50      | root      | randgen_index_merge_join.y.col_int:asc, randgen_index_merge_join.y.col_int_key:desc, offset:50, count:50                             | time:16.2173ms, loops:2, rows:50                                                       | 7.09375 KB      | N/A     |
|   └─HashRightJoin_17              | 125.00   | 288     | root      | right outer join, equal:[eq(randgen_index_merge_join.b.col_varchar_1024_utf8, randgen_index_merge_join.y.col_varchar_10_latin1_key)] | time:15.989001ms, loops:2, rows:288, Concurrency:5, probe collision:0, build:51.957µs  | 12.23046875 KB  | 0 Bytes |
|     ├─TableReader_19(Build)       | 10000.00 | 100     | root      | data:TableFullScan_18                                                                                                                | time:2.523989ms, loops:2, rows:100, rpc num: 1, rpc time:2.734379ms, proc keys:0       | 1.0478515625 KB | N/A     |
|     │ └─TableFullScan_18          | 10000.00 | 100     | cop[tikv] | table:table1, keep order:false, stats:pseudo                                                                                         | time:2.539933ms, loops:101, rows:100                                                   | N/A             | N/A     |
|     └─TopN_20(Probe)              | 100.00   | 100     | root      | randgen_index_merge_join.y.col_int:asc, randgen_index_merge_join.y.col_int_key:desc, offset:0, count:100                             | time:15.591369ms, loops:2, rows:100                                                    | 5.0859375 KB    | N/A     |
|       └─TableReader_28            | 100.00   | 100     | root      | data:TopN_27                                                                                                                         | time:15.372291ms, loops:2, rows:100, rpc num: 1, rpc time:15.457931ms, proc keys:0     | 3.466796875 KB  | N/A     |
|         └─TopN_27                 | 100.00   | 100     | cop[tikv] | randgen_index_merge_join.y.col_int:asc, randgen_index_merge_join.y.col_int_key:desc, offset:0, count:100                             | time:15.283199ms, loops:101, rows:100                                                  | N/A             | N/A     |
|           └─TableRangeScan_26     | 6669.67  | 498     | cop[tikv] | table:table2, range:[-inf,6), (6,9), (9,+inf], keep order:false, stats:pseudo                                                        | time:10.460969ms, loops:499, rows:498                                                  | N/A             | N/A     |
+-----------------------------------+----------+---------+-----------+--------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------+-----------------+---------+
9 rows in set (0.02 sec)

on tikv, HashJoin generates 295 rows:

tidb> explain analyze SELECT   table2 . `col_int` AS field1 , table2 . `col_int_key` AS field2 FROM  B AS table1  RIGHT  JOIN Y AS table2 ON  table1 . `col_varchar_1024_utf8` =  table2 . `col_varchar_10_latin1_key`  WHERE ( table2 . `pk` <> 9 AND table2 . `pk` NOT IN (6) )   ORDER BY field1, field2 DESC LIMIT 50 OFFSET 50 ;
+-----------------------------------+---------+---------+-----------+--------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------+-----------------+---------+
| id                                | estRows | actRows | task      | operator info                                                                                                                        | execution info                                                                        | memory          | disk    |
+-----------------------------------+---------+---------+-----------+--------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------+-----------------+---------+
| Projection_10                     | 50.00   | 50      | root      | randgen_index_merge_join.y.col_int, randgen_index_merge_join.y.col_int_key                                                           | time:4.875228ms, loops:2, rows:50, Concurrency:OFF                                    | 1.234375 KB     | N/A     |
| └─TopN_13                         | 50.00   | 50      | root      | randgen_index_merge_join.y.col_int:asc, randgen_index_merge_join.y.col_int_key:desc, offset:50, count:50                             | time:4.862697ms, loops:2, rows:50                                                     | 8.6484375 KB    | N/A     |
|   └─HashRightJoin_17              | 100.00  | 295     | root      | right outer join, equal:[eq(randgen_index_merge_join.b.col_varchar_1024_utf8, randgen_index_merge_join.y.col_varchar_10_latin1_key)] | time:4.483809ms, loops:2, rows:295, Concurrency:5, probe collision:0, build:48.333µs  | 12.23046875 KB  | 0 Bytes |
|     ├─TableReader_20(Build)       | 100.00  | 100     | root      | data:TableFullScan_19                                                                                                                | time:1.255139ms, loops:2, rows:100, rpc num: 1, rpc time:1.342984ms, proc keys:100    | 1.087890625 KB  | N/A     |
|     │ └─TableFullScan_19          | 100.00  | 100     | cop[tikv] | table:table1, keep order:false, stats:pseudo                                                                                         | time:0s, loops:3, rows:100                                                            | N/A             | N/A     |
|     └─TopN_21(Probe)              | 100.00  | 100     | root      | randgen_index_merge_join.y.col_int:asc, randgen_index_merge_join.y.col_int_key:desc, offset:0, count:100                             | time:3.836429ms, loops:2, rows:100                                                    | 4.0771484375 KB | N/A     |
|       └─TableReader_29            | 100.00  | 100     | root      | data:TopN_28                                                                                                                         | time:3.676556ms, loops:2, rows:100, rpc num: 1, rpc time:3.636999ms, proc keys:498    | 3.4541015625 KB | N/A     |
|         └─TopN_28                 | 100.00  | 100     | cop[tikv] | randgen_index_merge_join.y.col_int:asc, randgen_index_merge_join.y.col_int_key:desc, offset:0, count:100                             | time:2ms, loops:1, rows:100                                                           | N/A             | N/A     |
|           └─TableRangeScan_27     | 336.33  | 498     | cop[tikv] | table:table2, range:[-inf,6), (6,9), (9,+inf], keep order:false, stats:pseudo                                                        | time:2ms, loops:1, rows:498                                                           | N/A             | N/A     |
+-----------------------------------+---------+---------+-----------+--------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------+-----------------+---------+
9 rows in set (0.00 sec)

@XuHuaiyu
Copy link
Contributor

After disable topn_push_down, we can get correct result on mocktikv:

insert into mysql.opt_rule_blacklist value("topn_push_down");
admin reload opt_rule_blacklist;

@XuHuaiyu
Copy link
Contributor

This not a bug. The result is expected to be unstable.

We fetch data from table2 order by table2.col_int, table2 . col_int_key, but use table2. col_varchar_10_latin1_key as join key. The order of col_varchar_10_latin1_key can not be promised, thus we get a different result from MySQL.

If we use order by table2.col_int, table2 . col_int_key,table2. col_varchar_10_latin1_key , the result will be correct.

@XuHuaiyu XuHuaiyu added type/question The issue belongs to a question. type/bug The issue is confirmed as a bug. and removed type/bug The issue is confirmed as a bug. type/question The issue belongs to a question. labels Mar 27, 2020
@XuHuaiyu
Copy link
Contributor

XuHuaiyu commented Mar 27, 2020

After discussing with @fzhedu , this is bug.
TopN should be executed after join done, it should not be pushed down as the child of join.

@fzhedu
Copy link
Contributor Author

fzhedu commented Mar 31, 2020

This bug is caused by the TopN execution on mockTiKV.
after insert into mysql.expr_pushdown_blacklist values ('<>'); admin reload expr_pushdown_blacklist;, the topN is not pushed down to the coprocessor, the result is ok.
Besides, the SQL runs correctly on distributed TiKV.
This highlights the TopN causes errors in mockTiKV.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/moderate sig/execution SIG execution type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants