expression: fix builtin 'CharLength' for binary string input #7410

Merged (6 commits) on Aug 20, 2018

spongedu
Contributor

What problem does this PR solve?

Currently in TiDB, the behavior of the builtin CharLength is not consistent with MySQL when dealing with binary strings: MySQL returns the length in bytes, while TiDB returns the number of characters.

In MySQL:

mysql> select char_length(binary("数 据 库"));
+------------------------------------+
| char_length(binary("数 据 库"))    |
+------------------------------------+
|                                 11 |
+------------------------------------+
1 row in set (0.00 sec)

While in TiDB:

tidb> select char_length(binary("数 据 库"));
+------------------------------------+
| char_length(binary("数 据 库"))    |
+------------------------------------+
|                                  5 |
+------------------------------------+
1 row in set (0.00 sec)
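
To see where the two numbers come from, here is a minimal Go sketch (the helper names are hypothetical and not part of this PR). It assumes the string is UTF-8 encoded, so each of the three Chinese characters takes 3 bytes and each of the two spaces takes 1 byte.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteLength is a hypothetical helper: it returns the number of bytes in s,
// which matches what MySQL reports for char_length on a binary string.
func byteLength(s string) int {
	return len(s)
}

// runeLength is a hypothetical helper: it returns the number of UTF-8
// encoded characters (runes) in s.
func runeLength(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	s := "数 据 库" // three 3-byte characters separated by two ASCII spaces

	fmt.Println(byteLength(s)) // 11, the MySQL result
	fmt.Println(runeLength(s)) // 5, the TiDB result before this fix
}
```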

What is changed and how it works?

Take special care of binary strings when creating the builtinSig: for a binary string argument, CharLength now returns the length in bytes.
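
A rough sketch of the idea, not the actual TiDB code: the implementation is chosen once, when the signature is created, based on whether the argument is a binary string. All names below (charLengthFunc, newCharLength, and so on) are hypothetical stand-ins for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// charLengthFunc is a hypothetical stand-in for a builtin function signature.
type charLengthFunc func(val string) int64

// charLengthBinary counts bytes; intended for binary string arguments.
func charLengthBinary(val string) int64 {
	return int64(len(val))
}

// charLengthUTF8 counts runes; intended for non-binary string arguments.
func charLengthUTF8(val string) int64 {
	return int64(utf8.RuneCountInString(val))
}

// newCharLength picks the implementation at construction time, so the
// per-row evaluation does not need to re-check the argument type.
func newCharLength(argIsBinary bool) charLengthFunc {
	if argIsBinary {
		return charLengthBinary
	}
	return charLengthUTF8
}

func main() {
	fmt.Println(newCharLength(true)("数 据 库"))  // 11
	fmt.Println(newCharLength(false)("数 据 库")) // 5
}
```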

Check List

Tests

  • Unit test
  • Integration test

Code changes

Side effects

Related changes

Member

@shenli left a comment

LGTM

@spongedu
Contributor Author

/run-all-tests

if isNull || err != nil {
	return 0, isNull, errors.Trace(err)
}
return int64(len([]byte(val))), false, nil
Member

Just len(val) will do.

Contributor Author

ok
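
For context on the review suggestion above: in Go, len applied to a string already returns its length in bytes, so converting to []byte first gives the same number at the cost of an extra copy. A small sketch with hypothetical helper names:

```go
package main

import "fmt"

// lenViaConversion mirrors the original line under review: it converts the
// string to a byte slice before taking the length.
func lenViaConversion(val string) int64 {
	return int64(len([]byte(val)))
}

// lenDirect mirrors the reviewer's suggestion: len on a Go string is
// already the byte count, so no conversion is needed.
func lenDirect(val string) int64 {
	return int64(len(val))
}

func main() {
	val := "数 据 库"
	fmt.Println(lenViaConversion(val) == lenDirect(val)) // true
}
```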

@shenli
Member

shenli commented Aug 17, 2018

/run-all-tests

@coocood
Member

coocood commented Aug 17, 2018

@spongedu
With this approach, if we support more charsets in the future, we may need to define a signature for each charset.

Have you considered adding a field to charLengthSig that indicates if it is binary?
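
One way to read this suggestion, as a sketch only: keep a single signature type and let a flag decide how the length is computed. charLengthSig is the name used in the comment above; the isBinary field and the eval method are assumptions made for illustration and may not match the real TiDB type.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// charLengthSig here is a simplified stand-in; the isBinary field is the
// hypothetical flag proposed in the review comment.
type charLengthSig struct {
	isBinary bool
}

// eval is a hypothetical evaluation method: it returns the byte count for
// binary strings and the rune count otherwise.
func (s charLengthSig) eval(val string) int64 {
	if s.isBinary {
		return int64(len(val))
	}
	return int64(utf8.RuneCountInString(val))
}

func main() {
	binarySig := charLengthSig{isBinary: true}
	textSig := charLengthSig{isBinary: false}

	fmt.Println(binarySig.eval("数 据 库")) // 11
	fmt.Println(textSig.eval("数 据 库"))   // 5
}
```

The trade-off is a branch on every evaluation in exchange for not adding a new signature type per charset.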

@spongedu
Contributor Author

At first sight, I think we only need to distinguish binary strings from non-binary strings, but I'm not sure about this yet. I'll look into it and run some tests today or so.

Contributor

@winkyao left a comment

LGTM

Member

@zz-jason left a comment

LGTM

@zz-jason
Member

/run-all-tests

@zz-jason added labels on Aug 20, 2018: contribution (This PR is from a community contributor.), type/bugfix (This PR fixes a bug.), component/expression, status/LGT3 (The PR has already had 3 LGTM.)
@XuHuaiyu XuHuaiyu merged commit 5754f62 into pingcap:master Aug 20, 2018
6 participants