Skip to content
This repository has been archived by the owner on Jan 25, 2022. It is now read-only.

Date-time field widths not set for PartitionDateTimePattern #45

Closed
anba opened this issue May 27, 2020 · 12 comments · Fixed by #52
Closed

Date-time field widths not set for PartitionDateTimePattern #45

anba opened this issue May 27, 2020 · 12 comments · Fixed by #52
Assignees

Comments

@anba
Copy link
Contributor

anba commented May 27, 2020

DateTimeStylePattern only returns a pattern string, let's say for dateStyle the string "{day}.{month}.{year}". When this pattern string is processed in 13.1.6 PartitionDateTimePattern:

Let f be the value of dateTimeFormat's internal slot whose name is the Internal Slot column of the matching row.

We hit a problem, because the internal slots were never set in InitializeDateTimeFormat, so they still default to undefined.

Three things to fix:

@zbraniecki
Copy link
Member

@anba, I'm confused about your point:

Extend https://tc39.es/proposal-intl-datetime-style/#sec-intl.datetimeformat-internal-slots to include the date-time component widths.

The spec has:

[[LocaleData]][locale] must contain [[DateFormat]], [[TimeFormat]] and [[DateTimeFormat]] fields, the value of these fields are Records, where each of which has [[full]], [[long]], [[medium]] and [[short]] fields. These fields have string pattern values for [[DateFormat]] and [[DateTimeFormat]]; for [[TimeFormat]], they are records with [[pattern]] and [[pattern12]] fields, which are string patterns. For [[DateTimeFormat]], the field values must contain the strings "{0}" and "{1}".

Apply them in InitializeDateTimeFormat.

InitializeDateTimeFormat has 37.b:

Let pattern be DateTimeStylePattern(dateStyle, timeStyle, dataLocaleData, hc).

Modify Intl.DateTimeFormat.prototype.resolvedOptions to not output these fields when dateStyle or timeStyle is present.

In 1.4.1 we use Table 2 which has dateStyle and timeStyle and expose the results for them if those fields are set.

I'm confused about your report, can you clarify?

@anba
Copy link
Contributor Author

anba commented Jun 6, 2020

Sure. Let's take new Intl.DateTimeFormat("en", {timeStyle: "short"}). InitializeDateTimeFormat will call DateTimeStylePattern in step 37.b. The returned pattern string is going to be {hour}:{minute} {ampm} (based on the CLDR pattern string h:mm a).

Pattern strings in ECMA-402 don't include any field widths, so when the string {hour}:{minute} {ampm} is eventually processed in PartitionDateTimePattern, the correct field widths are read in step 14.c.i from the DateTimeFormat object. But as currently spec'ed, we don't set the field widths in InitializeDateTimeFormat, so the variable f in step 14.c.i will have the value undefined. (When date/timeStyle is not used, InitializeDateTimeFormat step 38.d.iii.1 sets the field widths!)

And because f has the value undefined, the PartitionDateTimePattern operation can't be executed.

So we need to fix InitializeDateTimeFormat to set the field widths when DateTimeStylePattern is used. And instead of simply storing {hour}:{minute} {ampm} as the pattern string, we need to store a record similar to:

{
  [[pattern]]: "{hour}:{minute} {ampm}",
  [[hour]]: "numeric",
  [[minute]]: "2-digit",
}

DateTimeStylePattern must then be modified to return a record, possibly by combining two records when date- and time-style parts are present.

The record returned from DateTimeStylePattern must then be applied to the DateTimeFormat instance in InitializeDateTimeFormat.

All this will ensure PartitionDateTimePattern can properly format the pattern string.

Furthermore, we will then need to adjust resolvedOptions, because the other date-time internal slots are no longer undefined when dateStyle/timeStyle is used. (That way we ensure new Intl.DateTimeFormat("en", {timeStyle: "short"}).resolvedOptions().hour === undefined, as it is already currently spec'ed.)


The overall record structure will need to be something like:

{
  [[TimeFormat]]: {
    [[full]]: ...,
    [[long]]: ...,
    [[medium]]: ...,
    [[short]]: {
      [[pattern]]: {
        [[string]]: "{hour}:{minute}",
        [[hour]]: "2-digit",
        [[minute]]: "2-digit",
      },
      [[pattern12]]: {
        [[string]]: "{hour}:{minute} {ampm}",
        [[hour]]: "numeric",
        [[minute]]: "2-digit",
      },
    },
  },
  [[DateFormat]]: ...,
  [[DateTimeFormat]]: ...,
}

We can't use:

[[short]]: {
  [[pattern]]: "{hour}:{minute}",
  [[pattern12]]: "{hour}:{minute} {ampm}",
  [[hour]]: "2-digit",
  [[minute]]: "2-digit",
}

because CLDR/ICU may give different field widths for different hour-cycles. For example, hour is "numeric" for "h12", but "2-digit" for "h23" for "en".

(Edit: See the next comment for why this may actually be okay to do for now.)

@anba
Copy link
Contributor Author

anba commented Jun 6, 2020

[...] because CLDR/ICU may give different field widths for different hour-cycles. For example, hour is "numeric" for "h12", but "2-digit" for "h23" for "en".

Hmm, this actually already a problem in the main ECMA-402 spec when compared to what's implemented in engines:

js> new Date("2020-01-01T01:02:03").toLocaleString("en-u-hc-h12")
"1/1/2020, 1:02:03 AM"
js> new Date("2020-01-01T01:02:03").toLocaleString("en-u-hc-h23") 
"1/1/2020, 01:02:03"

Per spec it is illegal to format hour once as "1" and another time as "01". So I think you can punt on the issue about different field widths for [[pattern]] and [[pattern12]], and use this record structure:

[[short]]: {
  [[pattern]]: "{hour}:{minute}",
  [[pattern12]]: "{hour}:{minute} {ampm}",
  [[hour]]: "2-digit",
  [[minute]]: "2-digit",
}

@zbraniecki
Copy link
Member

Ahh! Thank you! This is an excellent introspection into the issue, and very actionable for me!

So I think you can punt on the issue about different field widths for [[pattern]] and [[pattern12]], and use this record structure:

Am I correct that we should file it against ecma402?

@zbraniecki
Copy link
Member

@anba

I started #48 to address this, and first focused on the InitializeDateTimeFormat fix to set the field widths.
I used the simplified model pending the ecmar402 issue.

I have some questions about the other two:

Extend https://tc39.es/proposal-intl-datetime-style/#sec-intl.datetimeformat-internal-slots to include the date-time component widths.

I'm not sure how to rewrite the paragraph we're adding with this patch to match what you're asking about. We do have Table 2 fields in it already, right? What's missing then?

Modify Intl.DateTimeFormat.prototype.resolvedOptions to not output these fields when dateStyle or timeStyle is present.

I'm not sure if this is the right thing to do.

let dtf = new Intl.DateTimeFormat(undefined, {
  dateStyle: "long"
});
// Option 1
dtf.resolvedOptions(); // {dateStyle: "long"}

// Option 2
dtd.resolvedOptions(); // {dateStyle: "long", "year": "numeric", month: "long", "day: "numeric"}

Are we sure we want the former, and not he latter?

@sffc @littledan @anba?

@sffc
Copy link

sffc commented Jun 7, 2020

It might be nice to have the fields be returned in resolvedOptions, rather than undefined. I find it a bit surprising that we currently return only dateStyle but not the fields to which the dateStyle resolved. I don't have a great sense of the consequences of this decision, though.

@anba
Copy link
Contributor Author

anba commented Jun 8, 2020

I don't have a great sense of the consequences of this decision, though.

One consequence is that the resolved options can no longer be used to copy an Intl.DateTimeFormat instance, because of #43.

@anba
Copy link
Contributor Author

anba commented Jun 8, 2020

I'm not sure how to rewrite the paragraph we're adding with this patch to match what you're asking about. We do have Table 2 fields in it already, right? What's missing then?

Instead of:

[[LocaleData]][locale] must contain [[DateFormat]], [[TimeFormat]] and [[DateTimeFormat]] fields, the value of these fields are Records, where each of which has [[full]], [[long]], [[medium]] and [[short]] fields. These fields have string pattern values for [[DateFormat]] and [[DateTimeFormat]]; for [[TimeFormat]], they are records with [[pattern]] and [[pattern12]] fields, which are string patterns. For [[DateTimeFormat]], the field values must contain the strings "{0}" and "{1}".

We need something like:

[[LocaleData]][locale] must contain [[DateFormat]], [[TimeFormat]] and [[DateTimeFormat]] fields, the value of these fields are Records, where each of which has [[full]], [[long]], [[medium]] and [[short]] fields. For [[DateFormat]] and [[TimeFormat]], the value of these fields must be a record, which has a subset of the fields shown in Table 1, where each field must have one of the values specified for the field in Table 1. Each of the records must also have a pattern field, whose value is a String value that contains for each of the date and time format component fields of the record a substring starting with "{", followed by the name of the field, followed by "}". If the record has an hour field, it must also have a pattern12 field, whose value is a String value that, in addition to the substrings of the pattern field, contains a substring "{ampm}". For [[DateTimeFormat]], the field value must be a string pattern which contains the strings "{0}" and "{1}".

@anba
Copy link
Contributor Author

anba commented Jun 8, 2020

Am I correct that we should file it against ecma402?

Yes, I think that makes sense. We have to ask ourselves if it makes sense to allow that the hour-cycle option affects the pattern selection algorithm.

From what I can tell, this only affects the "hour", "minute", and "second" fields. Theoretically ICU supports the UDATPG_MATCH_ALL_FIELDS_LENGTH option to enforce a specific field width, but in practice this option can't be used due to the issue described in https://bugzilla.mozilla.org/show_bug.cgi?id=1284868#c14.

@sffc
Copy link

sffc commented Jun 8, 2020

One consequence is that the resolved options can no longer be used to copy an Intl.DateTimeFormat instance, because of #43.

We have some properties in Intl that get remapped; for example, style: "currency" causes fraction digits to be recalculated in Intl.NumberFormat.

One other approach here could be to not return dateStyle at all, and only return the fields. But, that might not roundtrip completely, due to the formatRange bug. However, hopefully that bug will be fixed soon on the CLDR side.

https://unicode-org.atlassian.net/browse/CLDR-13425

@zbraniecki
Copy link
Member

I spun off #49 and tc39/ecma402#454 so that the current scope of this issue remains just what's in #48.

Hope that works for everyone.

@sffc
Copy link

sffc commented Jun 8, 2020

Sounds good. Keep this issue for the editorial stuff.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants