n #1

Closed
wants to merge 14 commits into from

xiongjiwei
Owner

What problem does this PR solve?

charset: support collation utf8mb4_unicode_ci and utf8_unicode_ci

Add the utf8_unicode_ci and utf8mb4_unicode_ci collation interfaces. For now, they are implemented the same way as utf8_bin and utf8mb4_bin.
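
A minimal sketch of this placeholder approach, under stated assumptions: the `Collator` interface, the `binCollator` and `unicodeCICollator` types, and the `newCollator` lookup are hypothetical names invented for illustration and are not taken from the PR. The binary comparison is also simplified to a plain byte-wise comparison.

```go
package main

import (
	"fmt"
	"strings"
)

// Collator is a hypothetical interface used only in this sketch;
// the project's real collation interface may look different.
type Collator interface {
	// Compare returns a negative, zero, or positive value when a sorts
	// before, equal to, or after b.
	Compare(a, b string) int
}

// binCollator compares strings byte by byte (simplified).
type binCollator struct{}

func (binCollator) Compare(a, b string) int { return strings.Compare(a, b) }

// unicodeCICollator is a placeholder: it embeds binCollator, so it
// currently behaves exactly like the binary collation.
type unicodeCICollator struct{ binCollator }

// newCollator is a hypothetical lookup from collation name to collator.
// It reports false for names it does not know.
func newCollator(name string) (Collator, bool) {
	switch name {
	case "utf8_bin", "utf8mb4_bin":
		return binCollator{}, true
	case "utf8_unicode_ci", "utf8mb4_unicode_ci":
		return unicodeCICollator{}, true
	}
	return nil, false
}

func main() {
	c, _ := newCollator("utf8mb4_unicode_ci")
	// With the placeholder, "a" and "A" compare as different strings.
	// A real case-insensitive collation would treat them as equal.
	fmt.Println(c.Compare("a", "A") != 0)
}
```

Embedding the binary collator keeps the new names resolvable while leaving a single place to swap in a real comparison later.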

Unit Test Change

  • TestForbidUnsupportedCollations now tests utf8mb4_roman_ci and utf8_roman_ci, because utf8mb4_unicode_ci and utf8_unicode_ci are now supported.
  • SHOW COLLATION now lists utf8mb4_unicode_ci and utf8_unicode_ci.
  • Many other tests were added for utf8mb4_unicode_ci and utf8_unicode_ci.

Note

  • Because utf8mb4_unicode_ci and utf8_unicode_ci are implemented as utf8mb4_bin and utf8_bin, their tests are the same as those for utf8mb4_bin and utf8_bin, except where the ordering rule is used. These tests should change once the actual implementation is in place.

  • utf8mb4_unicode_ci and utf8_unicode_ci should be added to collationPriority and CollationStrictness:

    // collationPriority is the priority when infer the result collation, the priority of collation a > b iff collationPriority[a] > collationPriority[b]
    collationPriority = map[string]int{
        charset.CollationASCII:   0,
        charset.CollationLatin1:  1,
        "utf8_general_ci":        2,
        charset.CollationUTF8:    3,
        "utf8mb4_general_ci":     4,
        charset.CollationUTF8MB4: 5,
        charset.CollationBin:     6,
    }
    // CollationStrictness indicates the strictness of comparison of the collation. The unequal order in a weak collation also holds in a strict collation.
    // For example, if a < b in a weak collation(e.g. general_ci), then there must be a < b in a strict collation(e.g. _bin).
    CollationStrictness = map[string]int{
        "utf8_general_ci":        0,
        "utf8mb4_general_ci":     0,
        charset.CollationASCII:   1,
        charset.CollationLatin1:  1,
        charset.CollationUTF8:    1,
        charset.CollationUTF8MB4: 1,
        charset.CollationBin:     2,
    }
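
To illustrate how a priority map like this could drive collation inference once the new names are added, here is a rough sketch. Everything in it is an assumption for illustration: the string values standing in for the `charset` constants, the positions given to utf8_unicode_ci and utf8mb4_unicode_ci, and the `inferCollation` helper are hypothetical and not decided in this PR.

```go
package main

import "fmt"

// collationPriority mirrors the map quoted above. The charset constants
// are replaced by their assumed string values, and the two unicode_ci
// entries are placed at hypothetical positions.
var collationPriority = map[string]int{
	"ascii_bin":          0,
	"latin1_bin":         1,
	"utf8_general_ci":    2,
	"utf8_unicode_ci":    3, // hypothetical placement
	"utf8_bin":           4,
	"utf8mb4_general_ci": 5,
	"utf8mb4_unicode_ci": 6, // hypothetical placement
	"utf8mb4_bin":        7,
	"binary":             8,
}

// inferCollation is a hypothetical helper that returns whichever of a and b
// has the higher priority. It reports false if either name is not in the map.
func inferCollation(a, b string) (string, bool) {
	pa, okA := collationPriority[a]
	pb, okB := collationPriority[b]
	if !okA || !okB {
		return "", false
	}
	if pa > pb {
		return a, true
	}
	return b, true
}

func main() {
	fmt.Println(inferCollation("utf8_unicode_ci", "utf8mb4_unicode_ci"))
	fmt.Println(inferCollation("utf8mb4_unicode_ci", "utf8mb4_roman_ci"))
}
```

The second call shows why the maps need the new entries: a collation missing from the map cannot take part in inference at all. CollationStrictness would need matching entries, and their values depend on how the real comparison relates to general_ci and _bin.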

See also: "Coercibility values of collations in expressions" (表达式中排序规则的 Coercibility 值)

PS

I did not replace every utf8_bin with utf8_unicode_ci and every utf8mb4_bin with utf8mb4_unicode_ci in the tests, or change the default collation to utf8mb4_unicode_ci.
There is no need to change them all, because the collation is only used when new_collation_enabled is true.

I also wrote tests for utf8mb4_unicode_ci and utf8_unicode_ci. So that they pass for now, they mirror the tests for utf8mb4_bin and utf8_bin. (Under TDD, these tests should fail.)

xiongjiwei closed this on Jul 12, 2020
xiongjiwei changed the title from "Talent challenge" to "n" on Jan 20, 2021