Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: speed up the operation of "admin check table" #8572

Merged
merged 36 commits into from
Jul 31, 2019

Conversation

zimulala
Copy link
Contributor

@zimulala zimulala commented Dec 4, 2018

What problem does this PR solve?

Reduce the execution time of this operation of "admin check table".

What is changed and how it works?

compare the result of "select count() from table" and "select count() from table use index(idx)"

  • table count == index count
    We will check the records with the IndexLookUpReader method of all indexes.

  • table count > index count
    We will use the current index's checkTableRecord to check the records.

  • table count < index count
    We will use the current index's IndexLookUpReader to check the records.

Using 3 TiKV(8 Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz), 1 PD, 1 TiDB , the servers of TiKV and TiDB are not on a server.

CREATE TABLE `x` (
  `a` int(11) NOT NULL,
  `b` varchar(255) DEFAULT NULL,
  `c` blob DEFAULT NULL,
  `d` time DEFAULT NULL,
  `e` double DEFAULT NULL,
  `f` text DEFAULT NULL,
  `g` bit(10) DEFAULT NULL,
  PRIMARY KEY (`a`),
  KEY `d` (`d`),
  KEY `a_2` (`a`,`b`),
  KEY `c` (`c`(10),`d`),
  KEY `e` (`e`,`f`(100)),
  KEY `a_3` (`a`,`b`,`c`(2048)),
  KEY `d_2` (`d`,`e`,`f`(1024),`g`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |

before vs after

mysql> select count(*) from x use index(d);
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (2.30 sec)

before this PR:
mysql> admin check table x;
More than 3 hours

after this PR:
mysql> admin check table x;
Query OK, 0 rows affected (45.52 sec)

Executing this operation with old code takes too long, so the following only tests the new code

mysql> select count(*) from x use index(d);
+-----------+
| count(*)  |
+-----------+
| 100000000 |
+—————+
Query OK, 0 rows affected (4 min 27.76 sec)
CREATE TABLE `lineitem` (
  `L_ORDERKEY` int(11) NOT NULL,
  `L_PARTKEY` int(11) NOT NULL,
  `L_SUPPKEY` int(11) NOT NULL,
  `L_LINENUMBER` int(11) NOT NULL,
  `L_QUANTITY` decimal(15,2) NOT NULL,
  `L_EXTENDEDPRICE` decimal(15,2) NOT NULL,
  `L_DISCOUNT` decimal(15,2) NOT NULL,
  `L_TAX` decimal(15,2) NOT NULL,
  `L_RETURNFLAG` char(1) NOT NULL,
  `L_LINESTATUS` char(1) NOT NULL,
  `L_SHIPDATE` date NOT NULL,
  `L_COMMITDATE` date NOT NULL,
  `L_RECEIPTDATE` date NOT NULL,
  `L_SHIPINSTRUCT` char(25) NOT NULL,
  `L_SHIPMODE` char(10) NOT NULL,
  `L_COMMENT` varchar(44) NOT NULL,
  PRIMARY KEY (`L_ORDERKEY`,`L_LINENUMBER`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin 

mysql> select count(*) from TPCH10.lineitem;
+----------+
| count(*) |
+----------+
| 59986052 |
+----------+
1 row in set (4.69 sec)
tidb> admin check table TPCH10.lineitem;
Query OK, 0 rows affected (1 min 16.92 sec)

mysql> select count(*) from TPCH50.lineitem;
+-----------+
| count(*)  |
+-----------+
| 300005811 |
tidb> admin check table TPCH50.lineitem;
Query OK, 0 rows affected (5 min 50.57 sec)

Check List

Tests

  • Unit test

Side effects

  • Possible performance regression

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation
  • Need to be included in the release note

This change is Reviewable

@zimulala zimulala added type/enhancement The issue or PR belongs to an enhancement. sig/execution SIG execution labels Dec 4, 2018
@winkyao
Copy link
Contributor

winkyao commented Dec 4, 2018

Really impressive.

@winkyao
Copy link
Contributor

winkyao commented Dec 4, 2018

Please fix ci.

@zimulala
Copy link
Contributor Author

zimulala commented Dec 5, 2018

/run-all-tests

@zimulala
Copy link
Contributor Author

PTAL @winkyao @jackysp

@zimulala
Copy link
Contributor Author

/run-unit-test

@jackysp
Copy link
Member

jackysp commented Dec 11, 2018

OK, great work! I need more time to review it.

@zimulala
Copy link
Contributor Author

/run-unit-test

@zimulala
Copy link
Contributor Author

PTAL @winkyao @jackysp @crazycs520 @XuHuaiyu

@winkyao
Copy link
Contributor

winkyao commented Jul 30, 2019

@zimulala Please address comments

@zimulala
Copy link
Contributor Author

PTAL @winkyao @AilinKid

Copy link
Contributor

@AilinKid AilinKid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pr is too long, needing time to deliberate. By now it's LGTM

@zimulala
Copy link
Contributor Author

/run-all-tests

executor/distsql.go Outdated Show resolved Hide resolved
Copy link
Contributor

@winkyao winkyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@winkyao winkyao added needs-cherry-pick-3.0 status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 31, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Jul 31, 2019

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Jul 31, 2019

@zimulala merge failed.

@winkyao
Copy link
Contributor

winkyao commented Jul 31, 2019

/run-unit-test

@zimulala zimulala merged commit 59e3eb7 into pingcap:master Jul 31, 2019
@zimulala zimulala deleted the check-table-1 branch July 31, 2019 09:11
@sre-bot
Copy link
Contributor

sre-bot commented Jul 31, 2019

cherry pick to release-3.0 failed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/execution SIG execution status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants