Fix clustering of gc=Cf, GCB=CN codepoints (microsoft#18285)
* Previously we would mark all gc=Cf (Other, format) codepoints
  as zero-width, but that ignores that the majority of them are also
  GCB=CN (Control = does not join), which meant we ended up with
  zero-width grapheme clusters. Those cannot exist in a terminal.
  So, this PR makes all gc=Cf, GCB=CN codepoints zero-width, but also
  treats them as Extender codepoints, which mirrors `wcswidth`.
* This PR also updates the tables to Unicode 16.0.
* Finally, there's a minor code cleanup of the generator.
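
The effect of the first bullet can be illustrated with a minimal Python sketch. It is not the Windows Terminal implementation: the function names and the `format_extends` flag are hypothetical, Python's standard library does not expose the GCB property, so every gc=Cf codepoint is treated alike here, and a cluster's width is simply taken as the sum of its codepoint widths.

```python
import unicodedata


def is_format(ch: str) -> bool:
    """True for gc=Cf codepoints such as U+200B ZERO WIDTH SPACE."""
    return unicodedata.category(ch) == "Cf"


def is_zero_width(ch: str) -> bool:
    # Simplification: only format characters and combining marks are zero-width.
    return is_format(ch) or unicodedata.combining(ch) != 0


def clusters(text: str, format_extends: bool) -> list[str]:
    """Split text into clusters (hypothetical, heavily simplified rules).

    With format_extends=False a gc=Cf codepoint starts its own cluster,
    as a GCB=CN codepoint would. With format_extends=True it is appended
    to the preceding cluster, like an Extender.
    """
    out: list[str] = []
    for ch in text:
        extends = unicodedata.combining(ch) != 0 or (format_extends and is_format(ch))
        if out and extends:
            out[-1] += ch
        else:
            out.append(ch)
    return out


def cluster_width(cluster: str) -> int:
    # Simplification: sum of per-codepoint widths, ignoring wide characters.
    return sum(0 if is_zero_width(ch) else 1 for ch in cluster)


text = "a\u200bb"  # 'a', ZWSP, 'b'
old = [cluster_width(c) for c in clusters(text, format_extends=False)]
new = [cluster_width(c) for c in clusters(text, format_extends=True)]
print(old)  # [1, 0, 1]  (the ZWSP forms a zero-width cluster)
print(new)  # [1, 1]     (the ZWSP is absorbed into the 'a' cluster)
```

In this sketch a format character at the very start of the text still forms a cluster of its own, since there is nothing for it to extend.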

Closes microsoft#18267

## Validation Steps Performed
* Unit tests ✅
* Thai does not have random gaps anymore due to ZWSP ✅
lhecker authored Dec 5, 2024
1 parent aa5459d commit 0961a77
Showing 3 changed files with 822 additions and 867 deletions.