Commit
Punt to the operating system for character encodings
Without this, "may contain any Unicode characters" seemed too
ambiguous.  I wish there were cleaner references for the
{language}.{encoding} locales like en_US.UTF-8 and UTF-8.  But [1,2]
seem too glib, and I can't find a more targeted UTF-8 link than just
dropping folks into a Unicode chapter (which is what [1] does):

  The Unicode Standard, Version 6.0, §3.9 D92, §3.10 D95 (2011)

With the current v8.0 (2015-06-17), it's still §3.9 D92 and §3.10 D95.

The TR35 link is for:

  In addition, POSIX locales may also specify the character encoding,
  which requires the data to be transformed into that target encoding.

and the POSIX §6.2 link is for:

  In other locales, the presence, meaning, and representation of any
  additional characters are locale-specific.

[1]: https://en.wikipedia.org/wiki/UTF-8
[2]: https://en.wikipedia.org/wiki/Locale#POSIX_platforms

Signed-off-by: W. Trevor King <wking@tremily.us>
Reviewed-by: Jesse Butler <jeeves.butler@gmail.com>