Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

select count(distinct col_bit(64)_undef_signed, col_enum_undef_signed) get wrong result #17241

Closed
ChenPeng2013 opened this issue May 15, 2020 · 3 comments
Assignees
Labels
severity/critical sig/execution SIG execution type/bug The issue is confirmed as a bug.
Milestone

Comments

@ChenPeng2013
Copy link
Contributor

ChenPeng2013 commented May 15, 2020

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

use test;
source multi_column.txt;

SELECT count(distinct `col_bit(64)_undef_signed`, `col_enum_undef_signed`) FROM table_190000_utf8_undef;
+---------------------------------------------------------------------+
| count(distinct `col_bit(64)_undef_signed`, `col_enum_undef_signed`) |
+---------------------------------------------------------------------+
|                                                               43860 |
+---------------------------------------------------------------------+
1 row in set (10.59 sec)

2. What did you expect to see? (Required)

in mysql 8.0.17

mysql> SELECT count(distinct `col_bit(64)_undef_signed`, `col_enum_undef_signed`) FROM table_190000_utf8_undef;
+---------------------------------------------------------------------+
| count(distinct `col_bit(64)_undef_signed`, `col_enum_undef_signed`) |
+---------------------------------------------------------------------+
|                                                               28748 |
+---------------------------------------------------------------------+
1 row in set (7.72 sec)

I'm not sure if it has the same reason as #17216

3. Affected version (Required)

Release Version: v4.0.0-rc.1-79-g1934f34c9
Edition: Community
Git Commit Hash: 1934f34c94a173a492506489ccfb2fc931cd2200
Git Branch: release-4.0
UTC Build Time: 2020-05-14 07:10:09
GoVersion: go1.14.1
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false

4. Root Cause Analysis

@ChenPeng2013 ChenPeng2013 added the type/bug The issue is confirmed as a bug. label May 15, 2020
@wjhuang2016 wjhuang2016 added the sig/execution SIG execution label May 17, 2020
@ghost ghost added the correctness label Aug 6, 2020
@SunRunAway SunRunAway modified the milestones: v4.0.6, v4.0.5 Aug 12, 2020
@ichn-hu
Copy link
Contributor

ichn-hu commented Aug 12, 2020

@ChenPeng2013 It looks like a bug of MySQL after some investigaion.

If you swap the two field in the count distinct, the result on MySQL will match TiDB's.

mysql> SELECT count(distinct `col_bit(64)_undef_signed`, `col_enum_undef_signed`) FROM table_190000_utf8_undef;
+---------------------------------------------------------------------+
| count(distinct `col_bit(64)_undef_signed`, `col_enum_undef_signed`) |
+---------------------------------------------------------------------+
|                                                               28748 |
+---------------------------------------------------------------------+
1 row in set (0.09 sec)

mysql> SELECT count(distinct `col_enum_undef_signed`, `col_bit(64)_undef_signed`) FROM table_190000_utf8_undef;
+---------------------------------------------------------------------+
| count(distinct `col_enum_undef_signed`, `col_bit(64)_undef_signed`) |
+---------------------------------------------------------------------+
|                                                               43860 |
+---------------------------------------------------------------------+
1 row in set (0.10 sec)

mysql> show variables like '%version%';
+--------------------------+-------------------------------+
| Variable_name            | Value                         |
+--------------------------+-------------------------------+
| admin_tls_version        | TLSv1,TLSv1.1,TLSv1.2,TLSv1.3 |
| immediate_server_version | 999999                        |
| innodb_version           | 8.0.21                        |
| original_server_version  | 999999                        |
| protocol_version         | 10                            |
| slave_type_conversions   |                               |
| tls_version              | TLSv1,TLSv1.1,TLSv1.2,TLSv1.3 |
| version                  | 8.0.21                        |
| version_comment          | Source distribution           |
| version_compile_machine  | x86_64                        |
| version_compile_os       | Linux                         |
| version_compile_zlib     | 1.2.11                        |
+--------------------------+-------------------------------+
12 rows in set (0.01 sec)

Additionally, I used a simple python script to calculate the distinct value pairs and it is indeed 43860.

the following file is obtained by extract rows from the database.

mysql_17241.txt

and

╭─ichn@ichn-arch-pc ~/tmp/fix_count_distinct 
╰─$ mysql -h 127.0.0.1 -u root -P 3306 -p -D test -e 'SELECT cast(`col_bit(64)_undef_signed` as signed), `col_enum_undef_signed` FROM table_190000_utf8_undef;' > mysql_17241.txt         
Enter password: 
╭─ichn@ichn-arch-pc ~/tmp/fix_count_distinct 
╰─$ cat calc.py                                                                                                                                                                  
out = open('mysql_17241.txt').read().split('\n')[1:-1]
print(len(set(filter(lambda x: x[0] != 'NULL' and x[1] != 'NULL', [tuple(l.split('\t')) for l in out]))))
╭─ichn@ichn-arch-pc ~/tmp/fix_count_distinct 
╰─$ python calc.py                                                                                                                                                               
43860

@ichn-hu
Copy link
Contributor

ichn-hu commented Aug 13, 2020

This test case has been reported to MySQL, https://bugs.mysql.com/bug.php?id=100504 , and it has been verified by MySQL team, we might just close this issue, @ChenPeng2013 could you please verify this as well? @SunRunAway FYI.

@ichn-hu
Copy link
Contributor

ichn-hu commented Aug 17, 2020

@ChenPeng2013 , @SunRunAway Shall we close this issue?

@jebter jebter closed this as completed Aug 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/critical sig/execution SIG execution type/bug The issue is confirmed as a bug.
Projects
None yet
Development

No branches or pull requests

7 participants