Include aliases in the Available operations
The Available abstract operations (e.g. AvailableCalendars) should return
all possible aliases, so that other places in the spec (e.g. the
Temporal.Calendar constructor) can use them to determine whether a given
input value is valid. This input value can subsequently be canonicalized
by another abstract operation (e.g. CanonicalizeCalendar).
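
A minimal sketch of the validate-then-canonicalize flow described above, in
JavaScript. The list contents, the alias table, and the function names
(`isAvailableCalendar`, `canonicalizeCalendar`, `resolveCalendar`) are
assumptions made for illustration; only the islamicc/islamic-civil alias
comes from this change.

```javascript
// Hypothetical stand-in for AvailableCalendars(): it includes aliases.
const AVAILABLE_CALENDARS = ["gregory", "islamic-civil", "islamicc", "iso8601"];

// Hypothetical alias table; a real one would be derived from CLDR data.
const CALENDAR_ALIASES = new Map([["islamicc", "islamic-civil"]]);

// ASCII-only lowercasing, so non-ASCII code points are left untouched.
function asciiLowercase(s) {
  return s.replace(/[A-Z]/g, (c) => c.toLowerCase());
}

// Validity check: aliases count as valid because the list contains them.
function isAvailableCalendar(id) {
  return AVAILABLE_CALENDARS.includes(asciiLowercase(id));
}

// Sketch of CanonicalizeCalendar: case-regularize, then resolve aliases.
function canonicalizeCalendar(id) {
  const normalized = asciiLowercase(id);
  return CALENDAR_ALIASES.get(normalized) ?? normalized;
}

// A caller (for example a constructor) validates first, then canonicalizes.
function resolveCalendar(input) {
  if (!isAvailableCalendar(input)) {
    throw new RangeError(`Unsupported calendar: ${input}`);
  }
  return canonicalizeCalendar(input);
}
```

With this shape, `resolveCalendar("islamicc")` is accepted because the alias
is in the available list, and the caller still ends up with the canonical
identifier.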

In Intl.supportedValuesOf(), on the other hand, we should _not_ return all
possible aliases, so we filter them out using a Canonicalize operation
before returning the list of Available codes as an array to the caller.
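
The filtering step can be sketched in JavaScript as follows. This is an
illustration under assumed data, not the spec algorithm itself: the
`filterAliases` name, the sample list, and the alias table are hypothetical.

```javascript
// Hypothetical alias table; only this one pair appears in the change itself.
const ALIASES = new Map([["islamicc", "islamic-civil"]]);

// Stand-in for a Canonicalize operation on already case-regularized input.
function canonicalize(value) {
  return ALIASES.get(value) ?? value;
}

// Map every available value to its canonical form and keep each canonical
// form only once, preserving the order of first appearance.
function filterAliases(available) {
  const result = [];
  for (const value of available) {
    const canonical = canonicalize(value);
    if (!result.includes(canonical)) {
      result.push(canonical);
    }
  }
  return result;
}
```

Given `["gregory", "islamic-civil", "islamicc", "iso8601"]`, the alias
collapses into its canonical form and the caller sees three entries.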

Not all of the kinds of codes here have aliases, and not even all of them
have a concept of "canonical". A quick investigation shows:

- Calendar: aliased; case-regularized; limited values "available" but any
  well-formed value accepted, unknown values coerced to the locale's
  default

- Collation: not aliased; case-regularized; limited values "available" but
  any well-formed value accepted, unknown values coerced to the locale's
  default

- Currency: not sure if it is aliased because I don't have a copy of
  ISO 4217; case-regularized; limited values "available" but any
  well-formed value accepted and used

- Numbering system: not aliased; not case-regularized; limited values
  "available" but any well-formed value accepted, unknown values coerced
  to the locale's default

- Time zone: aliased; case-regularized

- Unit of measurement: not aliased; not case-regularized; limited values
  "available" but simple combinations of core values also accepted and
  used

So, I conclude that we need Canonicalize operations for calendars, time
zones, and possibly currency units.

An alternative approach would be to write Canonicalize operations for all
of the kinds of codes and have them perform the case-regularization. For
numbering systems and units of measurement, which are not
case-regularized, they would be no-ops.
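
A rough JavaScript sketch of that alternative, with one canonicalizer per
kind of code. The table layout, the helper names, and the alias data are
assumptions for illustration (the time zone entries in particular are a tiny
illustrative sample, not real implementation data).

```javascript
// ASCII-only case helpers, mirroring ASCII-lowercase / ASCII-uppercase.
const asciiLower = (s) => s.replace(/[A-Z]/g, (c) => c.toLowerCase());
const asciiUpper = (s) => s.replace(/[a-z]/g, (c) => c.toUpperCase());
const identity = (s) => s;

// Hypothetical alias data.
const CALENDAR_ALIASES = new Map([["islamicc", "islamic-civil"]]);
const TIME_ZONE_CANONICAL = new Map([
  ["utc", "UTC"],
  ["asia/kolkata", "Asia/Kolkata"],
  ["asia/calcutta", "Asia/Kolkata"],
]);

// One canonicalizer per kind; the not case-regularized kinds are no-ops.
const CANONICALIZERS = new Map([
  ["calendar", (id) => {
    const normalized = asciiLower(id);
    return CALENDAR_ALIASES.get(normalized) ?? normalized;
  }],
  ["collation", asciiLower],
  ["currency", asciiUpper],
  ["numberingSystem", identity],
  ["timeZone", (name) => TIME_ZONE_CANONICAL.get(asciiLower(name)) ?? name],
  ["unit", identity],
]);

function canonicalizeCode(key, value) {
  const fn = CANONICALIZERS.get(key);
  if (fn === undefined) {
    throw new RangeError(`Unknown key: ${key}`);
  }
  return fn(value);
}
```

The trade-off is uniformity at the call site against a few operations that
do nothing, which is why the change above only adds Canonicalize operations
where aliasing is known or suspected.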

Closes: tc39#37
ptomato committed Jul 16, 2022
1 parent 95efe24 commit 3485c77
Showing 6 changed files with 158 additions and 12 deletions.
19 changes: 17 additions & 2 deletions biblio.json
@@ -2,6 +2,16 @@
{
"location": "https://tc39.es/ecma402/",
"entries": [
{
"type": "op",
"aoid": "CanonicalizeLocaleList",
"id": "sec-canonicalizelocalelist"
},
{
"type": "op",
"aoid": "CanonicalizeUnicodeLocaleId",
"id": "sec-canonicalizeunicodelocaleid"
},
{
"type": "op",
"aoid": "GetOptionsObject",
@@ -14,8 +24,13 @@
 },
 {
 "type": "op",
-"aoid": "CanonicalizeLocaleList",
-"id": "sec-canonicalizelocalelist"
+"aoid": "IsStructurallyValidLanguageTag",
+"id": "sec-isstructurallyvalidlanguagetag"
+},
+{
+"type": "op",
+"aoid": "IsValidDateTimeFieldCode",
+"id": "sec-isvaliddatetimefieldcode"
 },
 {
 "type": "op",
41 changes: 41 additions & 0 deletions displaynames.html
@@ -0,0 +1,41 @@
<emu-clause id="intl-displaynames-objects">
<h1>DisplayNames Objects</h1>

<p>...</p>

<emu-clause id="sec-intl-displaynames-abstracts">
<h1>Abstract Operations for DisplayNames Objects</h1>

<emu-clause id="sec-canonicalcodefordisplaynames" aoid="CanonicalCodeForDisplayNames">
<h1>CanonicalCodeForDisplayNames ( _type_, _code_ )</h1>
<p>
The CanonicalCodeForDisplayNames abstract operation takes arguments _type_ (a String) and _code_ (a String). It verifies that the _code_ argument represents a well-formed code according to the _type_ argument and returns the case-regularized form of the _code_. The algorithm refers to <a href="https://www.unicode.org/reports/tr35/#Identifiers">UTS 35's Unicode Language and Locale Identifiers grammar</a>. The following steps are taken:
</p>
<emu-alg>
1. If _type_ is *"language"*, then
  1. If _code_ does not match the `unicode_language_id` production, throw a *RangeError* exception.
  1. If ! IsStructurallyValidLanguageTag(_code_) is *false*, throw a *RangeError* exception.
  1. Return ! CanonicalizeUnicodeLocaleId(_code_).
1. If _type_ is *"region"*, then
  1. If _code_ does not match the `unicode_region_subtag` production, throw a *RangeError* exception.
  1. Return the ASCII-uppercase of _code_.
1. If _type_ is *"script"*, then
  1. If _code_ does not match the `unicode_script_subtag` production, throw a *RangeError* exception.
  1. Assert: The length of _code_ is 4, and every code unit of _code_ represents an ASCII letter (0x0041 through 0x005A and 0x0061 through 0x007A, both inclusive).
  1. Let _first_ be the ASCII-uppercase of the substring of _code_ from 0 to 1.
  1. Let _rest_ be the ASCII-lowercase of the substring of _code_ from 1.
  1. Return the string-concatenation of _first_ and _rest_.
1. If _type_ is *"calendar"*, then
  1. If _code_ does not match the Unicode Locale Identifier `type` nonterminal, throw a *RangeError* exception.
  1. If _code_ uses any of the backwards compatibility syntax described in <a href="https://unicode.org/reports/tr35/#BCP_47_Conformance">Unicode Technical Standard #35 LDML § 3.3 BCP 47 Conformance</a>, throw a *RangeError* exception.
  1. Return the ASCII-lowercase of _code_.
1. If _type_ is *"dateTimeField"*, then
  1. If the result of IsValidDateTimeFieldCode(_code_) is *false*, throw a *RangeError* exception.
  1. Return _code_.
1. Assert: _type_ is *"currency"*.
1. If ! IsWellFormedCurrencyCode(_code_) is *false*, throw a *RangeError* exception.
1. Return <del>the ASCII-uppercase of _code_</del><ins>CanonicalizeCurrency(_code_)</ins>.
</emu-alg>
</emu-clause>
</emu-clause>
</emu-clause>
16 changes: 15 additions & 1 deletion intl.html
@@ -127,7 +127,21 @@ <h1>Intl.supportedValuesOf ( _key_ )</h1>
   1. Let _list_ be AvailableUnits( ).
 1. Else,
   1. Throw a *RangeError* exception.
-1. Return CreateArrayFromList( _list_ ).
+1. Let _result_ be a new empty List.
+1. For each element _value_ of _list_, do
+  1. If _key_ is *"calendar"*, then
+    1. Let _canonical_ be CanonicalizeCalendar(_value_).
+  1. Else if _key_ is *"currency"*, then
+    1. Let _canonical_ be CanonicalizeCurrency(_value_).
+    1. <mark>NOTE: If CanonicalizeCurrency only does a case-transformation, then the above line is a no-op and can be removed.</mark>
+  1. Else if _key_ is *"timeZone"*, then
+    1. Assert: IsValidTimeZoneName(_value_) is *true*.
+    1. Let _canonical_ be CanonicalizeTimeZoneName(_value_).
+  1. Else,
+    1. Let _canonical_ be _value_.
+  1. If _result_ does not contain an element equal to _canonical_, then
+    1. Append _canonical_ to the end of _result_.
+1. Return CreateArrayFromList( _result_ ).
 </emu-alg>
 </emu-clause>
 </ins>
48 changes: 39 additions & 9 deletions locales-currencies-tz.html
@@ -57,6 +57,23 @@ <h1>IsWellFormedCurrencyCode ( _currency_ )</h1>
</emu-clause>

<ins class="block">
<emu-clause id="sec-canonicalizecurrency" type="abstract operation">
<h1>
CanonicalizeCurrency (
_currency_: a String that is a well-formed currency code as verified by IsWellFormedCurrencyCode,
): a String
</h1>
<dl class="header">
<dt>description</dt>
<dd>The returned String is the well-formed and upper case canonicalized 3-letter ISO 4217 currency code corresponding to _currency_.</dd>
</dl>
<emu-alg>
1. Let _normalized_ be the ASCII-uppercase of _currency_.
1. <mark>NOTE: Apply any other aliasing transformations here; I'm not sure if ISO 4217 defines any because I don't have a copy of it. If not, then this abstract operation can be removed.</mark>
1. Return _normalized_.
</emu-alg>
</emu-clause>

<emu-clause id="sec-availablecurrencies" type="implementation-defined abstract operation">
<h1>AvailableCurrencies (
): a List of Strings</h1>
@@ -135,15 +152,11 @@ <h1>
 </dl>

 <emu-alg>
-1. Let _names_ be a List of all supported Zone and Link names in the IANA Time Zone Database.
-1. Let _result_ be a new empty List.
-1. For each element _name_ of _names_, do
-  1. Assert: ! IsValidTimeZoneName( _name_ ) is *true*.
-  1. Let _canonical_ be ! CanonicalizeTimeZoneName( _name_ ).
-  1. If _result_ does not contain an element equal to _canonical_, then
-    1. Append _canonical_ to the end of _result_.
-1. Sort _result_ in order as if an Array of the same values had been sorted using %Array.prototype.sort% using *undefined* as _comparefn_.
-1. Return _result_.
+1. Let _timeZones_ be a List containing the String value of each Zone or Link name in the IANA Time Zone Database that is supported by the implementation.
+1. Assert: _timeZones_ contains *"UTC"*.
+1. Assert: _timeZones_ does not contain any element that does not identify a Zone or Link name in the IANA Time Zone Database.
+1. Sort _timeZones_ in order as if an Array of the same values had been sorted using %Array.prototype.sort% with *undefined* as _comparefn_.
+1. Return _timeZones_.
 </emu-alg>
 </emu-clause>
 </ins>
@@ -304,6 +317,23 @@ <h1>Calendar Types</h1>
The ECMAScript 2023 Internationalization API Specification identifies calendars using a <em>calendar type</em> as defined by <a href="https://unicode.org/reports/tr35/tr35-dates.html#Calendar_Elements">Unicode Technical Standard #35, Part 4, Section 2</a>. Their canonical form is a string containing all lower case letters with zero or more hyphens.
</p>

<emu-clause id="sec-canonicalizecalendar" type="abstract operation">
<h1>
CanonicalizeCalendar (
_id_: a String that is a calendar type,
): a String that is a calendar type
</h1>
<dl class="header">
<dt>description</dt>
<dd>It returns the canonical and case-regularized form of _id_.</dd>
</dl>
<emu-alg>
1. Let _normalized_ be the ASCII-lowercase of _id_.
1. If _normalized_ is an aliased calendar type, return the canonical calendar type. For example, if _normalized_ is *"islamicc"*, return *"islamic-civil"*.
1. Return _normalized_.
</emu-alg>
</emu-clause>

<emu-clause id="sec-availablecalendars" type="implementation-defined abstract operation">
<h1>AvailableCalendars (
): a List of Strings</h1>
44 changes: 44 additions & 0 deletions numberformat.html
@@ -0,0 +1,44 @@
<emu-clause id="numberformat-objects">
<h1>NumberFormat Objects</h1>

<emu-clause id="sec-intl-numberformat-constructor">
<h1>The Intl.NumberFormat Constructor</h1>

<p>...</p>

<emu-clause id="sec-setnumberformatunitoptions" aoid="SetNumberFormatUnitOptions">
<h1>SetNumberFormatUnitOptions ( _intlObj_, _options_ )</h1>
<p>
The abstract operation SetNumberFormatUnitOptions resolves the user-specified options relating to units onto the intl object.
</p>
<emu-alg>
1. Assert: Type(_intlObj_) is Object.
1. Assert: Type(_options_) is Object.
1. Let _style_ be ? GetOption(_options_, *"style"*, *"string"*, &laquo; *"decimal"*, *"percent"*, *"currency"*, *"unit"* &raquo;, *"decimal"*).
1. Set _intlObj_.[[Style]] to _style_.
1. Let _currency_ be ? GetOption(_options_, *"currency"*, *"string"*, *undefined*, *undefined*).
1. If _currency_ is *undefined*, then
  1. If _style_ is *"currency"*, throw a *TypeError* exception.
1. Else,
  1. If ! IsWellFormedCurrencyCode(_currency_) is *false*, throw a *RangeError* exception.
1. Let _currencyDisplay_ be ? GetOption(_options_, *"currencyDisplay"*, *"string"*, &laquo; *"code"*, *"symbol"*, *"narrowSymbol"*, *"name"* &raquo;, *"symbol"*).
1. Let _currencySign_ be ? GetOption(_options_, *"currencySign"*, *"string"*, &laquo; *"standard"*, *"accounting"* &raquo;, *"standard"*).
1. Let _unit_ be ? GetOption(_options_, *"unit"*, *"string"*, *undefined*, *undefined*).
1. If _unit_ is *undefined*, then
  1. If _style_ is *"unit"*, throw a *TypeError* exception.
1. Else,
  1. If ! IsWellFormedUnitIdentifier(_unit_) is *false*, throw a *RangeError* exception.
1. Let _unitDisplay_ be ? GetOption(_options_, *"unitDisplay"*, *"string"*, &laquo; *"short"*, *"narrow"*, *"long"* &raquo;, *"short"*).
1. If _style_ is *"currency"*, then
  1. Set _intlObj_.[[Currency]] to <del>the ASCII-uppercase of _currency_</del><ins>CanonicalizeCurrency(_currency_)</ins>.
  1. Set _intlObj_.[[CurrencyDisplay]] to _currencyDisplay_.
  1. Set _intlObj_.[[CurrencySign]] to _currencySign_.
1. If _style_ is *"unit"*, then
  1. Set _intlObj_.[[Unit]] to _unit_.
  1. Set _intlObj_.[[UnitDisplay]] to _unitDisplay_.
</emu-alg>
</emu-clause>
</emu-clause>

<p>...</p>
</emu-clause>
2 changes: 2 additions & 0 deletions spec.emu
@@ -17,3 +17,5 @@ contributors: Google, Ecma International
</emu-intro>
<emu-import href="./locales-currencies-tz.html"></emu-import>
<emu-import href="./intl.html"></emu-import>
<emu-import href="./displaynames.html"></emu-import>
<emu-import href="./numberformat.html"></emu-import>
