Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ANALYZE TABLE execution time depending on number of partitions #35142

Closed
mjonss opened this issue Jun 3, 2022 · 9 comments
Closed

ANALYZE TABLE execution time depending on number of partitions #35142

mjonss opened this issue Jun 3, 2022 · 9 comments
Assignees
Labels
affects-6.4 affects-6.5 affects-6.6 sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.

Comments

@mjonss
Copy link
Contributor

mjonss commented Jun 3, 2022

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

tidb> show create table sbtest1;
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sbtest1 | CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  KEY `k_1` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin AUTO_INCREMENT=10138618
PARTITION BY RANGE (`id`)
(PARTITION `p0` VALUES LESS THAN (800000),
 PARTITION `p1` VALUES LESS THAN (1600000),
 PARTITION `p2` VALUES LESS THAN (2400000),
 PARTITION `p3` VALUES LESS THAN (3200000),
 PARTITION `p4` VALUES LESS THAN (4000000),
 PARTITION `p5` VALUES LESS THAN (4800000),
 PARTITION `p6` VALUES LESS THAN (5600000),
 PARTITION `p7` VALUES LESS THAN (6400000),
 PARTITION `p8` VALUES LESS THAN (7200000),
 PARTITION `p9` VALUES LESS THAN (8000000),
 PARTITION `p10` VALUES LESS THAN (8800000),
 PARTITION `p11` VALUES LESS THAN (9600000),
 PARTITION `p12` VALUES LESS THAN (10400000),
 PARTITION `p13` VALUES LESS THAN (11200000),
 PARTITION `pMax` VALUES LESS THAN (MAXVALUE)) |
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,00 sec)

tidb> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (0,01 sec)

tidb> analyze table sbtest1 partition p0;
Query OK, 0 rows affected, 1 warning (2,71 sec)

tidb> analyze table sbtest1 partition p0,p1;
Query OK, 0 rows affected, 2 warnings (3,42 sec)

tidb> analyze table sbtest1 partition p0,p1,p2;
Query OK, 0 rows affected, 3 warnings (4,12 sec)

tidb> analyze table sbtest1 partition p0,p1,p2,p3,p4,p5;
Query OK, 0 rows affected, 6 warnings (7,01 sec)

tidb> analyze table sbtest1;
Query OK, 0 rows affected, 15 warnings (13,23 sec)

tidb> show warnings;
+-------+------+---------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                     |
+-------+------+---------------------------------------------------------------------------------------------+
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139168 for table sbdynamic.sbtest1's partition p0   |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139042 for table sbdynamic.sbtest1's partition p1   |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.138688 for table sbdynamic.sbtest1's partition p2   |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139260 for table sbdynamic.sbtest1's partition p3   |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.138750 for table sbdynamic.sbtest1's partition p4   |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139360 for table sbdynamic.sbtest1's partition p5   |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139569 for table sbdynamic.sbtest1's partition p6   |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139443 for table sbdynamic.sbtest1's partition p7   |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139351 for table sbdynamic.sbtest1's partition p8   |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139379 for table sbdynamic.sbtest1's partition p9   |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.140082 for table sbdynamic.sbtest1's partition p10  |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139198 for table sbdynamic.sbtest1's partition p11  |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.210632 for table sbdynamic.sbtest1's partition p12  |
| Note  | 1105 | Analyze use auto adjusted sample rate 1.000000 for table sbdynamic.sbtest1's partition p13  |
| Note  | 1105 | Analyze use auto adjusted sample rate 1.000000 for table sbdynamic.sbtest1's partition pMax |
+-------+------+---------------------------------------------------------------------------------------------+
15 rows in set (0,00 sec)

tidb> select partition_name,table_rows from information_schema.partitions where table_name = 'sbtest1' and table_schema = 'sbdynamic';
+----------------+------------+
| partition_name | table_rows |
+----------------+------------+
| p0             |     790410 |
| p1             |     791129 |
| p2             |     793149 |
| p3             |     789887 |
| p4             |     792791 |
| p5             |     789320 |
| p6             |     788141 |
| p7             |     788850 |
| p8             |     789375 |
| p9             |     789215 |
| p10            |     785254 |
| p11            |     790240 |
| p12            |     522239 |
| p13            |          0 |
| pMax           |          0 |
+----------------+------------+
15 rows in set (0,01 sec)

tidb> select @@tidb_partition_prune_mode;
+-----------------------------+
| @@tidb_partition_prune_mode |
+-----------------------------+
| dynamic                     |
+-----------------------------+
1 row in set (0,00 sec)

tidb> set @@tidb_partition_prune_mode='static';
Query OK, 0 rows affected (0,00 sec)

tidb> analyze table sbtest1;
Query OK, 0 rows affected, 15 warnings (11,59 sec)

tidb> analyze table sbtest1 partition p0;
Query OK, 0 rows affected, 1 warning (1,44 sec)

tidb> analyze table sbtest1 partition p0,p1,p2,p3,p4,p5;
Query OK, 0 rows affected, 6 warnings (5,62 sec)

tidb> show warnings;
+-------+------+-------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                   |
+-------+------+-------------------------------------------------------------------------------------------+
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139168 for table sbdynamic.sbtest1's partition p0 |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139042 for table sbdynamic.sbtest1's partition p1 |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.138688 for table sbdynamic.sbtest1's partition p2 |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139260 for table sbdynamic.sbtest1's partition p3 |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.138750 for table sbdynamic.sbtest1's partition p4 |
| Note  | 1105 | Analyze use auto adjusted sample rate 0.139360 for table sbdynamic.sbtest1's partition p5 |
+-------+------+-------------------------------------------------------------------------------------------+
6 rows in set (0,00 sec)

tidb> use sbnopart
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
tidb> analyze table sbtest1;
Query OK, 0 rows affected, 1 warning (2,42 sec)

tidb> show warnings;
+-------+------+---------------------------------------------------------------------------+
| Level | Code | Message                                                                   |
+-------+------+---------------------------------------------------------------------------+
| Note  | 1105 | Analyze use auto adjusted sample rate 0.011000 for table sbnopart.sbtest1 |
+-------+------+---------------------------------------------------------------------------+
1 row in set (0,01 sec)

tidb> show create table sbtest1;
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                                                                   |
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sbtest1 | CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  KEY `k_1` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin AUTO_INCREMENT=10166202 |
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,01 sec)

tidb> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (0,56 sec)

tidb> use sbdynamic;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
tidb> analyze table sbtest1 with 0.011 samplerate;
Query OK, 0 rows affected (9,44 sec)

tidb> analyze table sbtest1 with 0.11 samplerate;
Query OK, 0 rows affected (11,03 sec)

Also note that under @@tidb_partition_prune_mode = 'dynamic' it takes longer time than using static prune mode.
Also decreasing the sample rate does take less time, but not proportionally to the decrease (1/10 sample rate -> 8/10 consumed time).

2. What did you expect to see? (Required)

time to run ANALYZE TABLE not taking additional time per partition to run, resulting in too long execution time for tables with many partitions (currently we support 8192 partitions).

3. What did you see instead (Required)

Execution time depending on number of partitions

4. What is your TiDB version? (Required)

tidb_version(): Release Version: v6.1.0-alpha-482-g3ed672faa8
Edition: Community
Git Commit Hash: 3ed672faa82612d75f28bc73117febe1b53f605a
Git Branch: HEAD
UTC Build Time: 2022-05-30 12:20:38
GoVersion: go1.18
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
@mjonss mjonss added the type/bug The issue is confirmed as a bug. label Jun 3, 2022
@mjonss
Copy link
Contributor Author

mjonss commented Jun 3, 2022

Maybe only use a sample/subset of the partitions to run analyze on for dynamic prune mode?

@mjonss
Copy link
Contributor Author

mjonss commented Jun 3, 2022

Example of the same table, but with 8192 hash partitions, taking 2 1/2 hour to complete ANALYZE TABLE:

tidb> analyze table sbtest1;
Query OK, 0 rows affected, 8192 warnings (2 hours 34 min 29,78 sec)

tidb> show create table sbtest1;
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                                                                                                            |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sbtest1 | CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  KEY `k_1` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin AUTO_INCREMENT=10122187
PARTITION BY HASH (`id`) PARTITIONS 8192 |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,00 sec)

tidb> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (2,19 sec)

@xuyifangreeneyes
Copy link
Contributor

/assign

@ti-chi-bot ti-chi-bot added may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.0 may-affects-6.1 labels Jun 6, 2022
@xuyifangreeneyes
Copy link
Contributor

xuyifangreeneyes commented Jun 6, 2022

There seems to be three reasons:

  1. We use an auto-adjusted sample rate for each non-partition table and each partition, which collects roughly 100000 rows. So the more partitions, the more samples are collected.
  2. We launch one analyze job for each non-partition table and each partition. The more partitions, the more analyze jobs need to be executed. However, the number of analyze workers are limited by tidb_build_stats_concurrency(the default value is 4). Some analyze jobs are blocked until others are finished.
  3. SaveTableStatsToStorage seems slow.

@seiya-annie seiya-annie added affects-4.0 This bug affects 4.0.x versions. affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 and removed may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-6.0 may-affects-6.1 labels Jun 6, 2022
@Yisaer Yisaer added type/enhancement The issue or PR belongs to an enhancement. and removed affects-4.0 This bug affects 4.0.x versions. affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 affects-6.1 affects-6.2 affects-6.3 labels Oct 18, 2022
@chrysan
Copy link
Contributor

chrysan commented Feb 22, 2023

@Yisaer Can you help confirm with @mjonss if this issue can be closed since we have the analyze performance improved for partition table?

@Yisaer
Copy link
Contributor

Yisaer commented Feb 23, 2023

Hi, @mjonss
I tested this case by sysbench with 1 table, 1000000 rows, 1024 partitions on analyzing table which costs 35mins. I found that if the partition tables increased, the time costed on merging stats for global stats will also be increased.

We revised this processing in:
#38358
#38239
to solve this problem by merging partition stats and load partition stats in concurrency.

In our latest test, with 8 read partition stats concurrency and 16 merging partition concurrency, cost 5mins to analyze the table with 1024 partitions and cost 9min30s to analyze the table with 2048 partitions.

Thus I think the problem is solved and can be closed for now, feel free to reopen or comment if you have any questions.

@Yisaer Yisaer closed this as completed Feb 23, 2023
@seiya-annie seiya-annie removed type/bug The issue is confirmed as a bug. severity/major labels Jul 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-6.4 affects-6.5 affects-6.6 sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

7 participants