Prefix index implementation for utf8 string is incorrect #7104

Closed
birdstorm opened this issue Jul 19, 2018 · 1 comment · Fixed by #7109
Labels
type/bug The issue is confirmed as a bug.

birdstorm commented Jul 19, 2018

Please answer these questions before submitting your issue. Thanks!

  1. What did you do?
    If possible, provide a recipe for reproducing the error.
mysql> show create table t1;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                         |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (
  `name` varchar(12) DEFAULT NULL,
  KEY `pname` (`name`(12))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> insert into t1 values('借款策略集_网页');
Query OK, 1 row affected (0.14 sec)
  2. What did you expect to see?
mysql> select * from t1 where name = '借款策略集_网页';
+------------------------+
| name                   |
+------------------------+
| 借款策略集_网页        |
+------------------------+
1 row in set (0.00 sec)
  3. What did you see instead?
mysql> select * from t1 where name = '借款策略集_网页';
Empty set (0.01 sec)

mysql> alter table t1 drop index pname;
Query OK, 0 rows affected (0.33 sec)

mysql> select * from t1 where name = '借款策略集_网页';
+------------------------+
| name                   |
+------------------------+
| 借款策略集_网页        |
+------------------------+
1 row in set (0.00 sec)

mysql> create index pname on t1 (name(12));
Query OK, 0 rows affected (0.34 sec)

mysql> select * from t1 where name = '借款策略集_网页';
Empty set (0.00 sec)

mysql> select * from t1 where name < '借款策略集_网页';
+--------------+
| name         |
+--------------+
| 借款策略     |
+--------------+
1 row in set (0.01 sec)
  4. What version of TiDB are you using (tidb-server -V or run select tidb_version(); on TiDB)?
Release Version: v2.0.0
Git Commit Hash: 637e130e6a9ba2e54e158131c0466233db39a60e
Git Branch: HEAD
UTC Build Time: 2018-07-16 08:54:50
GoVersion: go version go1.10.1 darwin/amd64
TiKV Min Version: 2.0.0-rc.4.1

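In our prefix index implementation, the index prefix length is counted in bytes, so name(12) keeps only the first 12 bytes of the value.

However, MySQL measures the length of a VARCHAR in characters, not bytes. From the MySQL documentation: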

The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used. For example, utf8 characters can require up to three bytes per character, so a VARCHAR column that uses the utf8 character set can be declared to be a maximum of 21,844 characters.
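
To make the mismatch concrete, here is a minimal Go sketch of the two ways a 12-unit prefix could be cut from the value used in the reproduction. It is only an illustration and not TiDB's actual index code; the helper names `truncateBytes` and `truncateChars` are hypothetical. Each Chinese character in the value takes 3 bytes in UTF-8, so a 12-byte cut leaves only the first four characters, which matches the `借款策略` returned by the `<` query above.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateBytes keeps at most n bytes of s.
// Hypothetical helper: it mimics a prefix length counted in bytes.
func truncateBytes(s string, n int) string {
	if len(s) <= n {
		return s
	}
	return s[:n]
}

// truncateChars keeps at most n UTF-8 characters (runes) of s.
// Hypothetical helper: it mimics a prefix length counted in characters.
func truncateChars(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	// Ranging over a string yields the byte offset of each rune start.
	for i := range s {
		if count == n {
			return s[:i]
		}
		count++
	}
	return s
}

func main() {
	v := "借款策略集_网页"
	fmt.Println(len(v), utf8.RuneCountInString(v)) // byte length, character length
	fmt.Println(truncateBytes(v, 12))              // prefix cut by bytes
	fmt.Println(truncateChars(v, 12))              // prefix cut by characters
}
```

The value is 22 bytes but only 8 characters long, so a character-counted prefix of 12 keeps the whole string, while a byte-counted one stores a shorter key that no longer equals the value in the `name = '借款策略集_网页'` lookup.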
