Replies: 3 comments
-
This program is another example of not taking into account zero-width characters:
Output:
This is
Note that the é is also 2 runes, but somehow 0xcc81 is handled correctly.
-
V doesn't handle grapheme clusters. I created a feature request for this: #22117
-
Thanks for reporting.
-
Describe the feature
The `.len_utf8()` function on strings gives the right number of UTF-8 glyphs, but when tone marks, other marks, or certain vowels that are displayed under or over a consonant are used, there is no way in V to find out the displayed length. There is no accounting for zero-width characters.
The `.utf8_str_visible_length()` function gives the same result as the `.len_utf8()` function, and the `.east_asian.display_width()` function (in `encoding.utf8.east_asian`) also returns the same (incorrect) result.
For example, the word `ผู้` is composed of 3 UTF-8 glyphs, but is displayed in 1 character position.
Use Case
When calculating the displayed length of words or phrases in the Thai language.
Proposed Solution
Account for zero-width iso glyphs.
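To make the request concrete, here is a minimal sketch of the idea in Python (not V, and not V's API): count only the code points that occupy a column, skipping those whose Unicode general category marks them as zero-width. The function name `display_width` and the choice of categories (`Mn`, `Me`, `Cf`) are assumptions of this sketch. It does not do full grapheme-cluster segmentation and does not account for double-width East Asian characters.

```python
import unicodedata

# Categories treated as zero-width in this sketch (an assumption):
# Mn = nonspacing mark, Me = enclosing mark, Cf = format character.
ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf"}


def display_width(text: str) -> int:
    """Count code points that take up a column, skipping zero-width ones."""
    return sum(
        1 for ch in text
        if unicodedata.category(ch) not in ZERO_WIDTH_CATEGORIES
    )


# The Thai word from the report: PHO PHUNG + SARA UU + MAI THO.
word = "\u0e1c\u0e39\u0e49"
print(len(word), display_width(word))  # prints "3 1"
```

The same approach covers the decomposed é mentioned in the replies: `display_width("e\u0301")` is 1, because U+0301 (UTF-8 bytes 0xcc 0x81) is a nonspacing mark.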
Other Information
No response
Acknowledgements
Version used
V 0.3.3 c16549b
Environment details (OS name and version, etc.)
Linux Mint 21.1