Punt to the operating system for character encodings
Without this, "may contain any Unicode characters" seemed too
ambiguous.

I wish there were cleaner references for the {language}.{encoding}
locales like en_US.UTF-8 and UTF-8.  But [1,2] seems too glib, and I
can't find a more targeted UTF-8 link than just dropping folks into a
Unicode chapter (which is what [1] does):

  The Unicode Standard, Version 6.0, §3.9 D92, §3.10 D95 (2011)

With the current v8.0 (2015-06-17), it's still §3.9 D92 and §3.10 D95.

The TR35 link is for:

  In addition, POSIX locales may also specify the character encoding,
  which requires the data to be transformed into that target encoding.

and the POSIX §6.2 link is for:

  In other locales, the presence, meaning, and representation of any
  additional characters are locale-specific.

[1]: https://en.wikipedia.org/wiki/UTF-8
[2]: https://en.wikipedia.org/wiki/Locale#POSIX_platforms

Signed-off-by: W. Trevor King <wking@tremily.us>
Reviewed-by: Jesse Butler <jeeves.butler@gmail.com>
wking committed Dec 5, 2015
1 parent ffdd704 commit 3606bcf
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions runtime.md
@@ -12,6 +12,11 @@ $ funC [global-options] <COMMAND> [command-specific-options] <command-specific-a
None are required, but the runtime may support options that start with at least one hyphen.
Global options may take positional arguments (e.g. `--log-level debug`), but the option parsing must be such that `funC <COMMAND>` is unambiguously an invocation of `<COMMAND>` for any `<COMMAND>` that does not start with a hyphen (including commands not specified in this document).

## Character encodings

This API specification does not cover character encodings, but runtimes should conform to their native operating system.
For example, POSIX systems define [`LANG` and related environment variables][posix-lang] for [declaring][posix-locale-encoding] [locale-specific character encodings][posix-encoding], so a runtime in an `en_US.UTF-8` locale should write its [version](#version) to stdout in [UTF-8][].

## Commands

### version
@@ -141,5 +146,9 @@ $ echo $?
0
```

[posix-encoding]: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap06.html#tag_06_02
[posix-lang]: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_02
[posix-locale-encoding]: http://www.unicode.org/reports/tr35/#Bundle_vs_Item_Lookup
[standard-streams]: https://github.com/opencontainers/specs/blob/v0.1.1/runtime-linux.md#file-descriptors
[systemd-listen-fds]: http://www.freedesktop.org/software/systemd/man/sd_listen_fds.html
[UTF-8]: http://www.unicode.org/versions/Unicode8.0.0/ch03.pdf
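
The "Character encodings" paragraph added to runtime.md can be illustrated with a small sketch. It is not part of the commit and not how any particular runtime does this. It assumes the usual POSIX precedence for the character-classification category (`LC_ALL`, then `LC_CTYPE`, then `LANG`, with empty values treated as unset) and the common `language[_territory][.codeset][@modifier]` locale-name form. The helper names `locale_codeset` and `encode_for_locale`, and the ASCII fallback for locales that name no codeset, are hypothetical choices made for illustration.

```python
def locale_codeset(env):
    """Return the codeset named by the POSIX locale variables in env, or None.

    Assumed precedence: LC_ALL, then LC_CTYPE, then LANG.
    Empty values are treated as unset.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:
            break
    else:
        # None of the variables is set to a non-empty value.
        return None

    # Drop an optional "@modifier" suffix, e.g. "de_DE.ISO-8859-15@euro".
    value = value.split("@", 1)[0]

    # Locales such as "C" or "en_US" do not name a codeset.
    if "." not in value:
        return None
    return value.split(".", 1)[1]


def encode_for_locale(text, env, fallback="ascii"):
    """Encode text in the locale's codeset.

    The fallback for locales without a codeset is an assumption of this
    sketch, not something the specification requires.
    """
    codeset = locale_codeset(env) or fallback
    return text.encode(codeset)
```

A runtime could pass `os.environ` as `env` and write the resulting bytes to stdout. A program that calls `setlocale` would more typically ask the C library for the codeset (for example via `nl_langinfo(CODESET)`) rather than parse the variables by hand; the parsing here only makes the lookup order visible.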
