Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect row count estimation in TiDB 2.1.1 #8921

Closed
Guru107 opened this issue Jan 3, 2019 · 17 comments
Closed

Incorrect row count estimation in TiDB 2.1.1 #8921

Guru107 opened this issue Jan 3, 2019 · 17 comments
Assignees
Labels
sig/planner SIG: Planner type/bug The issue is confirmed as a bug.

Comments

@Guru107
Copy link

Guru107 commented Jan 3, 2019

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do?

    Create a table with a composite primary key.

CREATE TABLE `table1` (
`column_1` bigint(20) NOT NULL,
`column_2` int(11) NOT NULL,
`column_3` varchar(10) DEFAULT NULL,
`column_4` bigint(20) DEFAULT NULL,
`column_5` varchar(20) DEFAULT NULL,
`column_6` varchar(10) DEFAULT NULL,
`column_7` int(11) DEFAULT NULL,
`column_8` varchar(4) DEFAULT NULL,
`column_9` int(11) DEFAULT NULL,
`column_10` varchar(50) DEFAULT NULL,
`column_11` smallint(6) DEFAULT NULL,
`column_12` varchar(4000) DEFAULT NULL,
`column_13` decimal(10,2) DEFAULT NULL,
`column_14` decimal(10,2) DEFAULT NULL,
`column_15` decimal(10,2) DEFAULT NULL,
`column_16` varchar(1) DEFAULT NULL,
`column_17` datetime NOT NULL,
`column_18` varchar(200) DEFAULT NULL,
`column_19` varchar(2) DEFAULT NULL,
`column_20` text DEFAULT NULL,
`column_21` varchar(255) DEFAULT NULL,
`column_22` text DEFAULT NULL,
`column_23` varchar(50) DEFAULT NULL,
`column_24` varchar(20) DEFAULT NULL,
`column_25` bigint(20) DEFAULT NULL,
`column_26` varchar(1500) DEFAULT NULL,
`column_27` int(11) DEFAULT NULL,
`column_28` varchar(1000) DEFAULT NULL,
`column_29` text DEFAULT NULL,
`column_30` text DEFAULT NULL,
`column_31` decimal(10,2) DEFAULT NULL,
`column_32` decimal(10,2) DEFAULT NULL,
`column_33` decimal(10,2) DEFAULT NULL,
`column_34` decimal(10,2) DEFAULT NULL,
`column_35` datetime DEFAULT NULL,
`column_36` text DEFAULT NULL,
`column_37` varchar(5) DEFAULT NULL,
`column_38` int(11) DEFAULT NULL,
`column_39` varchar(1) DEFAULT NULL,
`column_40` varchar(1) DEFAULT NULL,
`column_41` varchar(5) DEFAULT NULL,
`column_42` int(11) DEFAULT NULL,
`column_43` varchar(5) DEFAULT NULL,
`column_44` decimal(10,2) DEFAULT NULL,
`column_45` decimal(10,2) DEFAULT NULL,
`column_46` int(11) DEFAULT NULL,
`column_47` decimal(10,2) DEFAULT NULL,
`column_48` varchar(5) DEFAULT NULL,
`column_49` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`column_50` decimal(10,2) DEFAULT NULL,
`column_51` decimal(10,2) DEFAULT NULL,
`column_52` decimal(10,2) DEFAULT NULL,
`column_53` decimal(10,2) DEFAULT NULL,
`column_54` varchar(2) DEFAULT NULL,
`column_55` varchar(50) DEFAULT NULL,
`column_56` datetime DEFAULT NULL,
`column_57` varchar(500) DEFAULT NULL,
`column_58` varchar(5) DEFAULT NULL,
`column_59` varchar(100) DEFAULT NULL,
`column_60` varchar(10) DEFAULT NULL,
`column_61` datetime DEFAULT NULL,
`column_62` varchar(100) DEFAULT NULL,
`column_63` varchar(20) DEFAULT NULL,
`column_64` varchar(5) DEFAULT NULL,
`column_65` varchar(10) DEFAULT NULL,
`column_66` varchar(10) DEFAULT NULL,
`column_67` decimal(10,2) DEFAULT NULL,
`column_68` decimal(10,2) DEFAULT NULL,
`column_69` decimal(10,2) DEFAULT NULL,
`column_70` decimal(10,2) DEFAULT NULL,
`column_71` decimal(10,0) DEFAULT NULL,
`column_72` decimal(10,0) DEFAULT NULL,
`column_73` decimal(10,0) DEFAULT NULL,
`column_74` varchar(10) DEFAULT NULL,
`column_75` int(11) DEFAULT NULL,
`column_76` varchar(30) DEFAULT NULL,
`column_77` int(11) DEFAULT NULL,
`column_78` decimal(10,2) DEFAULT NULL,
  PRIMARY KEY (`column_1`,`column_2`),
  KEY `column_17` (`column_17`),
  KEY `column_49` (`column_49`),
  KEY `column_1` (`column_1`),
  KEY `column_40` (`column_40`),
  KEY `column_9` (`column_9`),
  KEY `column_43` (`column_43`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

Run query
explain select column1, column2 from `table1` where (column1=01092213 AND column2=1) OR (column1=01091904 AND column2=2) OR (column1=01091970 AND column2=2) OR (column1=01092706 AND column2=1) OR (column1=01092355 AND column2=1) OR (column1=01092578 AND column2=2) OR (column1=01092623 AND column2=4) OR (column1=01092651 AND column2=1) OR (column1=01092369 AND column2=2) OR (column1=01092365 AND column2=1) OR (column1=01092238 AND column2=2) OR (column1=01092799 AND column2=1) OR (column1=01092770 AND column2=3) OR (column1=01092363 AND column2=2) OR (column1=01092365 AND column2=2) OR (column1=01093579 AND column2=1) OR (column1=01093051 AND column2=2) OR (column1=01093032 AND column2=1) OR (column1=01093100 AND column2=2) OR (column1=01093534 AND column2=1) OR (column1=01093432 AND column2=1) OR (column1=01092914 AND column2=1) OR (column1=01092861 AND column2=2) OR (column1=01093054 AND column2=2) OR (column1=01093298 AND column2=1) OR (column1=01093293 AND column2=3);

  1. What did you expect to see?
    Running explain on TiDB 2.0.8 returns

screen shot 2019-01-03 at 12 13 10 pm

  1. What did you see instead?
    Running explain on TiDB 2.1.1 returns a full table scan

screen shot 2019-01-03 at 12 16 05 pm

  1. What version of TiDB are you using (tidb-server -V or run select tidb_version(); on TiDB)?

    TiDB v2.1.1

@eurekaka
Copy link
Contributor

eurekaka commented Jan 3, 2019

It is actually using index scan according to the pictures you pasted? Please confirm the plan, and provide statistics of table if it is really table scan by using:

curl -G "http://${tidb-server-ip}:${tidb-server-status-port}/stats/dump/${db_name}/${table_name}" > ${table_name}_stats.json

@eurekaka eurekaka added type/bug The issue is confirmed as a bug. sig/planner SIG: Planner labels Jan 3, 2019
@Guru107
Copy link
Author

Guru107 commented Jan 3, 2019

@eurekaka Yes it is showing Index scan but it is scanning all rows in 2.1.1 as seen in count column. But in 2.0.8 if you see the count column it is only 40.

@eurekaka
Copy link
Contributor

eurekaka commented Jan 3, 2019

@Guru107 Does the table have changes between these 2 explains? seems like 2.1.1 uses column_1 index instead of primary index now. Please provide the statistics information I mentioned above, and complete result of explain analyze of the above query. Since you can provide the SQL query, I guess there should be no sensitive information need to hide in the explain output?

@Guru107
Copy link
Author

Guru107 commented Jan 3, 2019

@eurekaka actually the data is sensitive. I had changed the column names in the sql query so that's why I am not able to share. But it can be reproduced on any test table.

Here are the steps I did to reproduce the issue.

  1. Created a test table with 100 rows using the following schema in TiDB 2.0.8
CREATE TABLE `myTable` (
  `column1` mediumint(9) NOT NULL,
  `column2` mediumint(9) NOT NULL,
  `column3` varchar(255) DEFAULT NULL,
  `column4` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`column1`,`column2`),
  KEY `column1` (`column1`)
);

  1. Ran the following query on TiDB 2.0.8
explain select column1,column2 from myTable where (column1=0 and column2=1) or (column1=1 and column2=3) or (column1=2 and column2=5) or (column1=3 and column2=7) or (column1=4 and column2=9) or (column1=5 and column2=11) or (column1=6 and column2=13) or (column1=7 and column2=15) or (column1=8 and column2=17) or (column1=9 and column2=19) or (column1=10 and column2=21) or (column1=11 and column2=23) or (column1=12 and column2=25) or (column1=13 and column2=27) or (column1=14 and column2=29) or (column1=15 and column2=31) or (column1=16 and column2=33) or (column1=17 and column2=35) or (column1=18 and column2=37) or (column1=19 and column2=39) or (column1=20 and column2=41) or (column1=21 and column2=43) or (column1=22 and column2=45) or (column1=23 and column2=47) or (column1=24 and column2=49) or (column1=25 and column2=51) or (column1=26 and column2=53) or (column1=27 and column2=55) or (column1=28 and column2=57) or (column1=29 and column2=59) or (column1=30 and column2=61) or (column1=31 and column2=63) or (column1=32 and column2=65) or (column1=33 and column2=67) or (column1=34 and column2=69) or (column1=35 and column2=71) or (column1=36 and column2=73) or (column1=37 and column2=75) or (column1=38 and column2=77) or (column1=39 and column2=79) or (column1=40 and column2=81) or (column1=41 and column2=83) or (column1=42 and column2=85) or (column1=43 and column2=87) or (column1=44 and column2=89) or (column1=45 and column2=91) or (column1=46 and column2=93) or (column1=47 and column2=95) or (column1=48 and column2=97) or (column1=49 and column2=99)

I got the following output.

screen shot 2019-01-03 at 2 38 32 pm

  1. I updated the TiDB cluster from 2.0.8 to 2.1.1

  2. Ran the explain analyze query on TiDB 2.1.1 cluster

explain analyze select column1,column2 from myTable where (column1=0 and column2=1) or (column1=1 and column2=3) or (column1=2 and column2=5) or (column1=3 and column2=7) or (column1=4 and column2=9) or (column1=5 and column2=11) or (column1=6 and column2=13) or (column1=7 and column2=15) or (column1=8 and column2=17) or (column1=9 and column2=19) or (column1=10 and column2=21) or (column1=11 and column2=23) or (column1=12 and column2=25) or (column1=13 and column2=27) or (column1=14 and column2=29) or (column1=15 and column2=31) or (column1=16 and column2=33) or (column1=17 and column2=35) or (column1=18 and column2=37) or (column1=19 and column2=39) or (column1=20 and column2=41) or (column1=21 and column2=43) or (column1=22 and column2=45) or (column1=23 and column2=47) or (column1=24 and column2=49) or (column1=25 and column2=51) or (column1=26 and column2=53) or (column1=27 and column2=55) or (column1=28 and column2=57) or (column1=29 and column2=59) or (column1=30 and column2=61) or (column1=31 and column2=63) or (column1=32 and column2=65) or (column1=33 and column2=67) or (column1=34 and column2=69) or (column1=35 and column2=71) or (column1=36 and column2=73) or (column1=37 and column2=75) or (column1=38 and column2=77) or (column1=39 and column2=79) or (column1=40 and column2=81) or (column1=41 and column2=83) or (column1=42 and column2=85) or (column1=43 and column2=87) or (column1=44 and column2=89) or (column1=45 and column2=91) or (column1=46 and column2=93) or (column1=47 and column2=95) or (column1=48 and column2=97) or (column1=49 and column2=99)
  1. Got the following output after analyze

screen shot 2019-01-03 at 2 41 26 pm

  1. I retrived the following table statistics after upgrading to TiDB 2.1.1 ( forgot to get table stats before upgrade 😅 . Let me know if you need that. I'll have to perform the drill all over again 😅 )
{
    "database_name": "test",
    "table_name": "mytable",
    "columns": {
        "column1": {
            "histogram": {
                "ndv": 0
            },
            "cm_sketch": null,
            "null_count": 0,
            "tot_col_size": 0,
            "last_update_version": 405407327162990594
        },
        "column2": {
            "histogram": {
                "ndv": 0
            },
            "cm_sketch": null,
            "null_count": 0,
            "tot_col_size": 0,
            "last_update_version": 405407327162990594
        },
        "column3": {
            "histogram": {
                "ndv": 0
            },
            "cm_sketch": null,
            "null_count": 0,
            "tot_col_size": 582,
            "last_update_version": 405407336809889798
        },
        "column4": {
            "histogram": {
                "ndv": 0
            },
            "cm_sketch": null,
            "null_count": 0,
            "tot_col_size": 583,
            "last_update_version": 405407336809889798
        }
    },
    "indices": {
        "column1": {
            "histogram": {
                "ndv": 0
            },
            "cm_sketch": null,
            "null_count": 0,
            "tot_col_size": 0,
            "last_update_version": 405407327162990594
        },
        "primary": {
            "histogram": {
                "ndv": 0
            },
            "cm_sketch": null,
            "null_count": 0,
            "tot_col_size": 0,
            "last_update_version": 405407327162990594
        }
    },
    "count": 100,
    "modify_count": 100,
    "partitions": null
}

@winoros
Copy link
Member

winoros commented Jan 3, 2019

@Guru107 The column count in analyze statement is the count estimated by TiDB's planner. It's a value the TiDB think it should be, not the actual count.
As you can see in the column operator info, there're things like range: [0 1, 0 1], [1 3, 1 3] ... This indicates the range information that is used to seek index.
And you can see when you ran explain analyze statement, There's an extra row that contains the information of loops, time, rows. rows here is actual rows that fetched by the executor. And as you can see it 50.

You can get this information from docs-cn or docs

@eurekaka
Copy link
Contributor

eurekaka commented Jan 3, 2019

Yes, explain analyze shows 50 rows retrieved actually. BTW, the statistics seems to be empty, can you analyze the table first, and check the count again to see if it is still 100?

@Guru107
Copy link
Author

Guru107 commented Jan 3, 2019

@winoros Got it. I was confused because the count value was different in 2.0.8 and 2.1.1 versions. I am curious if there was some change in the explain logic in 2.1

@eurekaka I ran analyze table myTable.

screen shot 2019-01-03 at 3 11 33 pm

No change in the count column

screen shot 2019-01-03 at 3 14 27 pm

@eurekaka
Copy link
Contributor

eurekaka commented Jan 3, 2019

@Guru107 thanks, please provide the latest statistics information again, then we would check if the estimation count change is reasonable.

@Guru107
Copy link
Author

Guru107 commented Jan 3, 2019

@eurekaka Here's the table stats after analyze
table_stats.txt

@eurekaka
Copy link
Contributor

eurekaka commented Jan 3, 2019

@Guru107 Thanks.

@eurekaka
Copy link
Contributor

eurekaka commented Jan 3, 2019

The behavior change is caused by these lines in deriveIndexPathStats:

458     // If the `countAfterAccess` is less than `stats.RowCount`, there must be some inconsistent stats info.
459     // We prefer the `stats.RowCount` because it could use more stats info to calculate the selectivity.
460     if path.countAfterAccess < ds.stats.RowCount {
461         path.countAfterAccess = math.Min(ds.stats.RowCount/selectionFactor, float64(ds.statisticTable.Count))
462     }

this part is different in 2.0.8 and 2.1.1. @Guru107 thanks for the report, we will try to improve it.

@Guru107
Copy link
Author

Guru107 commented Jan 3, 2019

@eurekaka Does this impact query performance?

@eurekaka
Copy link
Contributor

eurekaka commented Jan 3, 2019

@Guru107 In your case, the query plan chosen is correct, so no performance impaction.

@eurekaka
Copy link
Contributor

eurekaka commented Jan 3, 2019

The change is introduced in #7233, @lamxTyler PTAL if we should make some adjustment.

@alivxxx
Copy link
Contributor

alivxxx commented Jan 3, 2019

The change is reasonable and needs no change. The root cause of the problem is the selectivity cannot process DNF now, refer https://github.com/pingcap/tidb/blob/master/statistics/selectivity.go#L249.

@eurekaka
Copy link
Contributor

eurekaka commented Jan 3, 2019

So if filter is in DNF form, selectivity may return estimation not as accurate as GetRowCountByIndexRanges because ranger handles DNF now while selectivity does not? then shall we mark whether it is DNF filter in selectivity and add a check in deriveIndexPathStats like below?

460     if path.countAfterAccess < ds.stats.RowCount && notInDNF {
461         path.countAfterAccess = math.Min(ds.stats.RowCount/selectionFactor, float64(ds.statisticTable.Count))
462     }

@alivxxx
Copy link
Contributor

alivxxx commented Jan 4, 2019

Then I think it is better to let selectivity also support DNF.

@zz-jason zz-jason changed the title Query over composite primary key does a full table scan in TiDB 2.1.1 Incorrect row count estimation in TiDB 2.1.1 Jan 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/planner SIG: Planner type/bug The issue is confirmed as a bug.
Projects
None yet
Development

No branches or pull requests

4 participants