Draft plan to align canonical time zone IDs across implementations #806
Comments
Firefox time zone canonicalisation always returns an IANA tzdata Zone, potentially using a Zone entry from
(Mentioning only The time zone canonicalisation overrides in Firefox don't take
js> new Intl.DateTimeFormat("en", {timeZone: "Asia/Chungking"}).resolvedOptions().timeZone
"Asia/Chongqing"
whereas V8/JSC return But we only use
js> new Intl.DateTimeFormat("en", {timeZone: "Asia/Hanoi"}).resolvedOptions().timeZone
typein:1:1 RangeError: invalid time zone in DateTimeFormat(): Asia/Hanoi |
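For anyone who wants to repeat these checks in a browser console or Node rather than the SpiderMonkey shell, here is a minimal sketch. `resolveTimeZone` is a hypothetical helper name, not part of any engine API; it only wraps the same `Intl.DateTimeFormat` call used above and turns the `RangeError` into `null`. The output depends on the engine and its tzdata/ICU version, so no expected values are shown.

```javascript
// Hypothetical helper: returns whatever ID the running engine reports for
// `id`, or null when the engine rejects the ID with a RangeError.
function resolveTimeZone(id) {
  try {
    return new Intl.DateTimeFormat("en", { timeZone: id }).resolvedOptions().timeZone;
  } catch (e) {
    if (e instanceof RangeError) return null; // ID unknown to this engine
    throw e; // anything else is unexpected, so rethrow it
  }
}

// The two IDs from the examples above; results differ between engines.
for (const id of ["Asia/Chungking", "Asia/Hanoi"]) {
  console.log(id, "->", resolveTimeZone(id));
}
```

Running the same loop in each engine and comparing the printed values is one simple way to spot the cross-engine differences discussed in this thread.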
@anba Looking at #272, it seems that the main goal of using Is that correct? If so, then would using I'm asking because |
I updated the OP with anba's info, and added some pseudocode to clarify what "use zone.tab" would mean. |
And yes, the main reason for using
That means that for example It seems like sometimes it's still necessary to look into Hmm, apropos ICU region overrides: We have to watch out how strictly we make zone.tab normative. For example TimeZonesOfLocale with strict reference to zone.tab places
|
Oops, updated now.
Yes. And, using a more populous zone as another example, it'd mean that Europe/Bratislava would be a primary ID, instead of the current state in Firefox where Europe/Bratislava is a non-primary ID that resolves to Europe/Prague.
There are two questions here: how should we determine which IDs are primary, and what time zone rules should be used? For the first question of which IDs are primary, AFAICT (although not 100% sure, so please let me know if this is a wrong assumption) that only
Will this work? Or are there cases I'm not thinking of? For the second question of which time zone rules to use, I think there are two options that we should use for all time zones:
AFAIK, all major browsers seem to use option (2), so I'd be OK to leave this as-is unless it attracts a lot of user complaints.
I don't fully understand how those ICU overrides fit into my "use
So far I've only been thinking of using Do you have an idea for how region=>Zone resolution could work if all
Agreed. Although the spec text you linked above doesn't seem to be very limiting, because it just says "of those in common use in region" without defining exactly what "in common use" means. If we can come up with a good algorithm for region=>Zone mapping, then should we make that spec text more explicit so that implementations will remain more consistent? |
I've picked
There are two issues with the proposed algorithm:
Assuming IANA tzdata files are parsed in Vanguard format. When parsing in Rearguard format, the (2) Some Links won't get resolved to the expected Zone. For example Here's a detailed list of Link names, the proposed target when using only
In addition to the aforementioned
And a list of all Links, including
And finally a list of Links, including
The only problematic new proposed target is resolving
No, I haven't yet looked into that. |
Great conversation here, thanks. FYI, CLDR is proposing to add the IDs from zone.tab into CLDR data in cases where the CLDR canonical ID (the first ID in the list) is not the one listed in zone.tab. See unicode-org/cldr#3105. I spot-checked Yoshito's work in that PR and it looks like every problematic ID has an Assuming that PR lands, do you think we should simplify the spec by simply referring to CLDR as ECMAScript's source of IDs (including which ones are primary vs. non-primary), instead of trying to define the algorithm for how we interpret the IANA time zone database? It seems like (with Yoshito's PR landed) CLDR may be closer in use cases and intent than TZDB, which seems to be diverging quite a bit from what ECMAScript wants, at least in terms of supporting the at-least-one-zone-per-country model that ECMAScript prefers. Do you know if any IDs are missing from CLDR data? (https://github.com/unicode-org/cldr/blob/main/common/bcp47/timezone.xml)
Regardless, do you have a preference for whether ECMAScript implementations should use vanguard or rearguard?
Weirdly, despite the text in the comment, in order to get the output of the TZDB makefile to include these three lines, you need to use
Agreed. In addition (much lower priority) I think that Antarctica/South_Pole should resolve to Antarctica/McMurdo not Pacific/Auckland. Note that the CLDR data in Yoshito's PR would enable this mapping too (in addition to Chuuk). |
FYI, there's now a proposed ICU API that will expose the CLDR data linked above. See https://sourceforge.net/p/icu/mailman/message/37881038/ for API details. |
Ah, right.
I don't think it matters right now, because vanguard or rearguard is mostly about supporting negative daylight saving time. And as long as we don't have a method which returns the difference from the current time zone offset to the standard time zone offset, cf. |
@justingrant - this is great. I'm hoping it will also fix Thanks. |
New ICU API: see https://github.com/unicode-org/icu/blob/main/icu4c/source/i18n/unicode/timezone.h |
My understanding of the state of this issue is:
Does that sound right @justingrant ? |
In the meantime, do we want V8 and JSC to use the new ICU API to be able to return modern IDs like Asia/Kolkata from
Yep, this sounds right. |
Okay, yep, that change seems positive because there's already an expectation in code that the system time zone is subject to change and new identifiers can be added at any time. Is there some sort of PR that can be put up to recommend this behavior in ECMA-402, split from the rest of the proposal in Temporal? |
I'm not sure this necessarily needs any spec changes. Firefox already uses the modern IDs, and @anba has argued (convincingly, IMO) that the spec already requires using the latest IDs. So I think V8 and JSC can simply start using the new ICU APIs. This won't solve all the cross-engine inconsistencies (@anba's comment highlights a few corner cases), but the most popular ones should be handled by just using ICU. Also, CLDR's data isn't necessarily complete. See https://unicode-org.atlassian.net/browse/CLDR-17111. So there will be mop-up work required, but IMO it will be a lot easier to mop up once Calcutta and Kiev are handled. One thing to watch out for is that these changes may break users who are expecting the old names, so they should be carefully rolled out in Canary before releasing to everyone. |
In the 2024-01-18 meeting of TG2, we discussed part of this issue: whether implementations should move to use newer canonical IDs (e.g. Asia/Ho_Chi_Minh, Asia/Kolkata, Europe/Kyiv) before Temporal lands. Consensus was that we should wait ~6 months to see if Temporal can land first, so that we can avoid changing things twice for users, but if Temporal is delayed then we can reconsider. Looking back up at the OP in this issue, this conclusion answered question (3), and the conclusion was for (3b). We still need to resolve (1) and (2):
My suggestion to resolve both (1) and (2) is that all engines (including Mozilla's SpiderMonkey used in Firefox) should switch to using ICU's new API that returns modern IDs. And if we're unhappy with the canonical values returned by that API, then we should fix the data upstream in CLDR rather than each engine applying its own overrides. This is unlikely to come up for popular zones like Europe/Kyiv, but there are some corner cases and smaller zones (noted earlier in this thread) where it may matter. We can discuss this in a later TG2 meeting. Not urgent. FYI @sffc |
I think this is the same as the proposal that we decided to delay until Temporal, right? Using the new ICU API would mean a user-visible change, which we should just roll out at the same time as the Temporal change. |
Not necessarily. There are differences in how V8 vs. JSC vs. SM deal with ICU vis-a-vis canonicalization. For example, AFAIK Firefox currently reports the modern IDs because SM doesn't use ICU for canonicalization at all, but instead builds the canonical mapping separately.
I think we should ask all engines to use ICU's new API for canonicalization as part of their work to support Temporal, and if there are problems with the underlying CLDR data then we should raise them with CLDR now so that they'll be fixed in time for the release of Temporal on each engine. In practical terms, this means that we'd replace the current answer to (1) in the OP with simply "Use ICU's new API, and if the results have problems then work with CLDR to fix the data".
My understanding is that the above is what V8 is planning. JSC is an interesting case where AFAIK it uses the OS's copy of ICU rather than bundling it into Safari like Chrome does. So there may be more lead time required to make that change than for the other engines. I'm not sure what the implications of this longer lead time are, although @Constellation may know.
For SpiderMonkey, there's minimal user impact to switching to ICU's API because the only canonicalizations that will change are obscure cases. But if those obscure cases are blockers for SM, then we should figure that out now so SM can retire the custom canonicalization implementation. @anba, do you think that ICU's new API is now close enough for you to use it? |
TG2 discussion: https://github.com/tc39/ecma402/blob/master/meetings/notes-2024-01-18.md#draft-plan-to-align-canonical-time-zone-ids-across-implementations-806 Conclusion (written before @justingrant's comment above): Do not make any changes right now. Wait for Temporal and make a change then. If there is a change to Temporal's timeline, then potentially revisit this. |
This issue proposes some ideas and a draft plan for how implementations can align on a common set of canonical time zone IDs, in order to fix problems like:
There are three questions to answer:
This is an early draft, so please let me know if I made mistakes below or if you see a better way to achieve the goal of using up-to-date canonical IDs in ECMAScript. Note that the plan below is complementary to, but separate from, the now-Stage 3 proposal-canonical-tz proposal.
ECMA-262 currently uses the terms "primary time zone identifier" and "non-primary time zone identifier" instead of "canonical" and "non-canonical". I'm mostly using the newer terms in this issue, but for clarity I use "canonical" when referring to ICU's output, because that's what ICU calls it.
Feedback is welcome, especially from @sffc @FrankYFTang @anba @Constellation @gibson042 @dminor.
1. Which time zone IDs should be primary?
To avoid messy geopolitical judgement calls, I recommend that we defer to the IANA Time Zone Database to decide which IDs should be canonical, using the following simple rules:
- Every ID in zone.tab should be a primary time zone identifier in ECMAScript. Because zone.tab includes at least one unique time zone for each ISO 3166-1 country code, if all zone.tab IDs are canonical then time zone changes in a country will not affect any other country.
- If CLDR considers a zone.tab ID to be non-canonical, then the zone.tab ID should be primary in ECMAScript, and CLDR's outdated canonical ID should be a non-primary time zone identifier that resolves to the zone.tab ID. This will fix cases where ICU currently returns an outdated ID like Asia/Calcutta and Europe/Kiev.
Chrome and Safari, which return ICU's canonical IDs as-is, currently have 19 IDs that use outdated ICU canonical identifiers. Firefox, which overrides ICU's canonicalization, currently has 11 non-primary IDs that resolve to another country's primary ID, like Europe/Bratislava resolving to Europe/Prague. This proposal would change those engines' behavior to follow the rules above.
In actual implementation pseudocode, what I'm proposing is this:
The JSON objects below were generated by a simple JS app using code that's similar to the pseudocode above. You can run and edit it at https://codesandbox.io/s/zone-tab-mismatches-mlf93j.
For Chrome and Safari, the object below lists IDs from zone.tab where ICU uses an outdated ID. The key is the ICU ID and the value is what should be primary.
For Firefox, the keys of the object below are zone.tab IDs that are not canonical in Firefox. Unlike Chrome/Safari discussed above, Firefox overrides ICU's canonicalization using the TZDB backward file.
These overrides solve the outdated IDs problem that Chrome and Safari have, but they introduce a new problem: some IDs merge multiple ISO 3166-1 country codes. For example, Slovakia's time zone resolves in Firefox to Europe/Prague in the Czech Republic, but Europe/Bratislava should also be primary. Using zone.tab instead of backward to power the overrides should fix this problem.
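As a rough illustration of the two kinds of mismatch described in this section, here is a minimal sketch. It is not the code behind the CodeSandbox link. It assumes zone.tab's usual layout (tab-separated columns of country code, coordinates, TZ ID and optional comments, with `#` comment lines). The names `parseZoneTab`, `classifyMismatches`, `sampleZoneTab` and `stubCanonicalize` are hypothetical, the sample rows are illustrative, and the stub stands in for an engine by mixing one ICU-style and one Firefox-style behavior mentioned above.

```javascript
// Parse zone.tab text into { country, id } entries, skipping comments/blank lines.
function parseZoneTab(text) {
  const entries = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "" || line.startsWith("#")) continue;
    const [country, , id] = line.split("\t"); // column 2 (coordinates) is unused
    entries.push({ country, id });
  }
  return entries;
}

// Compare each zone.tab ID with what `canonicalize` (an engine stand-in) returns.
function classifyMismatches(entries, canonicalize) {
  const zoneTabIds = new Set(entries.map((e) => e.id));
  const outdated = {}; // engine canonical ID -> zone.tab ID that should be primary
  const merged = {};   // zone.tab ID -> other zone.tab ID it currently resolves to
  for (const { id } of entries) {
    const canonical = canonicalize(id);
    if (canonical === id) continue; // already primary, nothing to report
    if (zoneTabIds.has(canonical)) merged[id] = canonical;
    else outdated[canonical] = id;
  }
  return { outdated, merged };
}

// Illustrative input only: three rows in zone.tab's layout.
const sampleZoneTab = [
  "# country\tcoordinates\tTZ",
  "CZ\t+5005+01426\tEurope/Prague",
  "SK\t+4809+01707\tEurope/Bratislava",
  "IN\t+2232+08822\tAsia/Kolkata",
].join("\n");

// Hypothetical engine behavior, for demonstration only.
const stubTable = new Map([
  ["Europe/Bratislava", "Europe/Prague"], // Firefox-style cross-country merge
  ["Asia/Kolkata", "Asia/Calcutta"],      // ICU-style outdated canonical ID
]);
const stubCanonicalize = (id) => stubTable.get(id) ?? id;

console.log(classifyMismatches(parseZoneTab(sampleZoneTab), stubCanonicalize));
```

In a real check, `stubCanonicalize` would be replaced by a call into the engine under test, and the input would be the full zone.tab file from the TZDB release that engine ships.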
2. How should these canonicalization changes get into implementations?
@sffc and others recommend that CLDR and ICU be the right long-term home for all time zone info, including canonicalization. Although CLDR is currently designing a solution to expose IANA canonical IDs, it's unlikely a solution in CLDR and ICU will ship until 2024 at the earliest.
For V8 and JSC, there are only 19 outdated names, and new renames are very rare: only 4 in the last 8 years. Should we hard-code these 19 mappings until CLDR and ICU deliver the long-term solution? If not, is there another way to speed up these changes?
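To make the hard-coding option concrete, here is a minimal sketch of an override table layered on top of an engine's existing canonicalization. It is only an illustration in JavaScript, not how V8 or JSC are implemented. `OVERRIDES`, `canonicalizeWithOverrides` and `fakeEngine` are hypothetical names, and the table holds only the two outdated IDs named in this issue rather than the full set of 19.

```javascript
// Illustrative subset: outdated ICU canonical ID -> ID that should be primary.
const OVERRIDES = new Map([
  ["Asia/Calcutta", "Asia/Kolkata"],
  ["Europe/Kiev", "Europe/Kyiv"],
]);

// Apply the engine's canonicalization first, then replace any outdated result.
// `engineCanonicalize` is passed in so the sketch does not depend on a real engine.
function canonicalizeWithOverrides(id, engineCanonicalize) {
  const canonical = engineCanonicalize(id);
  return OVERRIDES.get(canonical) ?? canonical;
}

// Hypothetical stand-in for an engine that still returns the outdated ID.
const fakeEngine = (id) => (id === "Asia/Kolkata" ? "Asia/Calcutta" : id);
console.log(canonicalizeWithOverrides("Asia/Kolkata", fakeEngine));
```

Because the override runs after the engine's own lookup, both the old and the new spelling end up at the same primary ID, and the table can be deleted once the upstream data catches up.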
For Firefox, the change would be to use zone.tab instead of backward.
3. When should we ship these changes?
Here are a few options for when to ship these changes. Which do you prefer?
Temporal.TimeZone ships. It'll include proposal-canonical-tz to stop canonicalizing user-inputted IDs, so less userland code should be affected by the primary ID changes.
My preference would be for (b), because it seems to carry less risk of breaking the web than (a). But I could also be convinced that (a) is OK, especially if we're able to run tests beforehand on a small % of users before it's rolled out to everyone. Do browsers have a way to do tests like that?
I'd support (d) if we're able to verify through testing of real apps that these changes would be too disruptive.
Notes
We were originally hoping to tackle this plan as part of proposal-canonical-tz, but that proposal just reached Stage 3 so we're moving the IDs plan into ECMA-402 because the scope of the proposal is now locked down.