diff --git a/meetings/notes-2024-03-28.md b/meetings/notes-2024-03-28.md new file mode 100644 index 00000000..2aea2703 --- /dev/null +++ b/meetings/notes-2024-03-28.md @@ -0,0 +1,264 @@ +# 2023-03-28 ECMA-402 Meeting + +## Logistics + +### Attendees + +- Shane Carr - Google i18n (SFC), Co-Moderator +- Ben Allen - Igalia (BAN) +- Ujjwal Sharma - Igalia (USA), Co-Moderator +- Eemeli Aro - Mozilla (EAO) +- Jesse Alama - Igalia (JMN) +- Frank Yung-Fong Tang - Google i18n, V8 (FYT) +- Louis-Aimé de Fouquières - Invited Expert (LAF) +- Richard Gibson - OpenJS Foundation (RGN) +- Yusuke Suzuki - Apple (YSZ) +- Henri Sivonen - Mozilla (HJS) (later part of meeting) + +### Standing items + +- [Discussion Board](https://github.com/tc39/ecma402/projects/2) +- [Status Wiki](https://github.com/tc39/ecma402/wiki/Proposal-and-PR-Progress-Tracking) -- please update! +- [Abbreviations](https://github.com/tc39/notes/blob/master/delegates.txt) +- [MDN Tracking](https://github.com/tc39/ecma402-mdn) +- [Meeting Calendar](https://calendar.google.com/calendar/embed?src=unicode.org_nubvqveeeol570uuu7kri513vc%40group.calendar.google.com) +- [Matrix](https://matrix.to/#/#tc39-ecma402:matrix.org) + +## Status Updates + +### Updates from the Editors + +RGN: Cleanup changes from last month, big one upcoming with aligning to BCP-47. Limited to editorial work until next plenary. + +### Updates from the MessageFormat Working Group + +EAO: Most of the current work is editorial; the release of CLDR 45, and thereby the tech preview, is upcoming. Along with the simultaneous ICU releases, ICU4J has landed and the ICU4C implementation is in last steps of review, hopefully landing this week. Other than that, we’ve been identifying matters to address during tech preview, discussions that have been left partly incomplete. + +### Updates from Implementers + +https://github.com/tc39/ecma402/wiki/Proposal-and-PR-Progress-Tracking + +USA: One new PR, 839, apart from that we’ve been quiet on normative changes, since we’re not accepting them until next edition. I will reach out to implementers to gather implementation information from JSC. + +FYT: 1st action item should be to designate whether feature has been shipped, or at least we need to know whether it’s been addressed. As a committee, we need to deal with MDN and tests. + +BAN: We've been landing DurationFormat tests – coverage for arbitrary combinations of non-numeric styles, for formatting fractional milliseconds and microseconds, for the three different ListFormat outputs depending on the DurationFormat’s base style. + +USA: The tests are sitting in Test262 review. Happy to accept help in reviewing the tests. + +### Updates from the W3C i18n Group + +EAO: No updates + +## Proposals and Discussion Topics + +https://github.com/tc39/ecma402/projects/2 + +### Drop string source support #47 + +https://github.com/tc39/proposal-intl-messageformat/pull/47 + +EAO: We did not reach a conclusion last meeting whether we as TG2 are comfortable leaving out the DSL from the first iteration of MessageFormat. Concerns raised by SFC and FYT that doing this but including the data model representation of messages, that this would potentially create a fracture in the ecosystem that would need to be remediated later. Request from Shane for participants in this group to get wider input on how different people who aren’t actively participating in this call feel about this change. On that note I have spoken with people inside Mozilla, who believe that this change will not introduce a danger of fracturing the ecosystem. Reasons: + +1. The ecosystem is already very fractured – adding one more to that mix does not make it worse +2. A key part of what we’re doing with the data model work in MessageFormat 2 is that we’re bringing in a data structured representation of messages that make it much easier to transform messages from one representation to another. As such, any blocks that currently exist in people being able to migrate from one format to another, or to a future representation, these barriers are going to be significantly reduced going forward. This gets us to advance this proposal without waiting for potentially years for steps to be taken on the syntax parsing. + +SFC: The Google position on this proposal is that we’re excited about the progress of the MessageFormat 2.0 standard, hope to see it make it into the Web platform. Making a data model based proposal without string syntax is concerning to the internationalization team. + +1. First, the primary deliverable artifact for the WG is the syntax, not the data model. The syntax is subject to a significant amount of design work, and is the primary deliverable. + +2. Second, we believe Intl.MessageFormat is only useful with a serialized form. The lack of this would result in users inventing non-standard competing serialized forms. + +3. Third, having the data model alone lends itself to JSON serialization. The verbosity of JSON form is an impediment to authoring, a concern we spent considerable time trying to prevent. + +4. Fourth, the processing of the data model format into a string is relatively easy, which doesn't meet the bar for an ECMA-402 proposal. + +5. Fifth, we are still expecting changes to the data model based on user feedback on the technical preview implementations in ICU; we have already received feedback from the ICU-TC from an initial cursory review. Stabilizing it in ECMA-402 is premature at this point in time. + +Google i18n encourages EAO to continue efforts to standardize this with the syntax. + +We don’t feel that the data model only version of the proposal is the right decision at this time. + +USA: Thank you for raising these points. We need to recognize that the status quo is a compromise, not anyone’s preference. My understanding of how closely the data model is mapped to the actual syntax is that unless there is a lot of abstraction going on, most of the serializers would develop a syntax very similar to the one we designed, just because of how it relates to the data model. Fragmentation is a risk, but MessageFormat and the ability to format messages is the key functionality, and parsing messages is incremental and can be addressed on the userland level. The point I’m making is that there is a big difference between doing everything in MessageFormat in userland, instead of just doing the parsing in userland. + +EAO: My main query here is how much is this Google position established? How much can we through discussion today or later work to change this view, or has it effectively already been established that this is how it is? + +SFC: There’s concerns on many different levels. I hear EAO and USA on the perspective that this is just a compromise, and won’t create the competition of different serialized forms. That’s one of the five concerns. We also believe that the proposal must stand on its own and have its own merits, rather than just being an incremental change. We can’t assume that some other proposal will come after this one that will complete this one, in terms of for example the message bundles syntax. Evaluating this proposal as stands on its own, these are major concerns, and it's not evident to us how they could be surmounted. None of these positions are written in stone, but we nevertheless have trouble supporting advancement to stage 2 without addressing these concerns. + +FYT: The big issue is that we’re talking about JavaScript libraries, and likely transmission to the user. Somewhere in there you have to get into the JavaScript engine, and that has to be done in some form of transport. [...] This will be on the Web, and if you have a verbose form like JSON, you’re increasing load on the internet unnecessarily. Say CNN, say, wanted to use this thing, they have to call the library from the user’s browser, then transmit a lot of data from it (if using JSON). And if they don’t use JSON, they have to invent something new. We have concerns about putting this thing in a JavaScript library – which is different than doing it on the backend, there are very small amount of transport. This is different, since it’s spreading to millions or billions of users – very different concern. + +EAO: I hear you. Some of the points that you raised we can certainly discuss, and overcome differences in views. However, the point that this fundamentally needs to be a syntax proposal cannot be addressed through this sort of dialogue. I think I need to step back from championing these proposals, because I’m not happy having them on my todo list for three, four, five years as they go forward, and to focus more on W3C. I don’t think with this situation that this is going to really move forward here. My question: how does this work in practice, given that I’m the champion for these proposals? For Intl.MessageFormat, Dan Minor is also a champion, but he hasn’t been active recently. + +USA: I hope this proposal doesn’t become abandoned; it’s the most exciting proposal in my opinion. Clarifying question for EAO: in the current design, which still includes the syntax, the data model is still exposed to the user. If we do remove the parsing, assuming that it was fair for folks at the time to have concerns about the syntax, given that it’s now in tech preview and getting more and more stable, could there just be a MessageFormat parse proposal? In the long term, would people even notice? + +EAO: The PR I have up removes 15 lines of spec text. The addition of the parser – which gets to refer to the MessageFormat 2 spec – is a very small spec change. My plan was to introduce a third MessageFormat proposal with the parser, much like the message resource proposal that is adding a resource parser as well. + +USA: I was going to propose that, worst-case scenario, we break it into its own proposal, and not all of MessageFormat is blocked on those ten lines. I’d be glad to co-champion it. I don’t think it’s diplomatic of us to either hold off the rest of the proposal or complicate it by saying “we’re okay with taking this slow” – I think it’s well within the process of TC39 and achieves the final goal. + +SFC: The syntax parser is just 12 lines of spec, but our feeling is that it fundamentally changes the proposal in a significant way. It’s no longer a MessageFormat syntax proposal, it’s a JSON-based string formatting proposal based on a fairly esoteric JSON schema which is still in review. It’s hard to support that proposal. As I said, and I want to doubly emphasize, I want to see MessageFormat get into the standard – that’s definitely something we want to see. We’re only concerned about the direction that EAO has proposed here; we remain excited about the original proposal without the removal of the syntax parser. I think there is room to continue to pursue this proposal, or at least the MessageFormat concept, and a message bundle-like solution may be worth pursuing. If EAO doesn’t think it’s possible to compromise between the Google position and the f5 position, it may make sense to put the TC39 proposal on pause for the time being. There should, though, be space in the Web platform to pursue this. + +USA: I would slightly disagree with your characterization of the situation. I agree that the experience of authoring a message might change if it is actually the data model that starts being used. My understanding of the situation is that we are splitting the meat of the function – we are keeping the prototype format and formatToParts while removing the constructor, and the constructor can come in later but the ability to format them is the fundamental ability. The existence of the syntax being included in the API that’s first shipped or not might be a small period of time depending on how it proceeds in TC39, my understanding of the situation is that it’s a minor inconvenience that can be fixed using an npm module rather than a fundamental change in the user experience. + +FYT: I don’t understand. If you don’t have the syntax, all you stuff into that thing is an object. And the data model isn’t really a fixed thing in MessageFormat, based on my conversations with colleagues. It’s still under review, even if it’s in tech preview – it’s a prototype that needs to be verified. It’s a good proposal, but there’s nothing proven yet. + +USA: I agree with your characterization of the data model. It’s not perfect, and until this change in strategy it was not the most sought-after aspect. It is an intermediate representation, which means people might have to use a 3rd party package to do the parsing, but I don’t think people would ship the data model. + +FYT: But that’s what we want to not happen. + +USA: But I think it's better than not having any message formatting. + +FYT: No, I disagree. + +EAO: It is correct to say that the syntax and data model are works in progress, in tech preview. They are likely to experience small changes before the designs are locked down. The syntax and data model that the JS API will ever support are only going to be the ones that are locked down. But the changes to the data model or syntax or overall behavior of message formatting are small enough going forward that it is entirely conceivable to do what we’ve tried to do here, which is defining and working with the JS API for working with MessageFormat 2 messages. The current state of the syntax is not relevant to determining whether this thing is ready to be used. A most significant meta-point here is that I am stepping back as a champion of these proposals. I don’t know if this means that they move back to stage 0 unless another champion comes forward. This needs to go through TG1. I am mentioning this here so that if there are people interested here in championing it, it can proceed. This appears to be a forum where there’s conflicting interests and views that it may be more effective to advance Web localization elsewhere than in TC39 at this point. This was not my plan, but this is where the situation is leading me personally. + +SFC: I would encourage EAO to, if you’re still active on the proposal, that there’s no harm in keeping it on the backburner at stage 1. + +EAO: But because of this pushback, this uncertainty that is going to take years and years, I’m not interested in having this on my backburner. + +SFC: The right thing to do is say “I’m no longer a champion of this proposal”, but record that it reached stage 1 (indicating that TC39 considers it worth considering), and someone else may pick it up in a few years. + +### Displaying era for a date: how CLDR may help + +- [Slides](https://docs.google.com/presentation/d/11wsfwbJZzTFEWrD0ZCJPD3uPhSACbGr6O37asZnzvZk/edit#slide=id.p) + +LAF: What is “era” in a date expression? If you think about it, it should always be displayed or said when you express a year, because it is the basis for the year and shows the reckoning system. In many cultures the era is generally implied, but is displayed if the year figures alone are ambiguous. If you are in the UK using the Gregorian calendar, “CE” is implied, but if you’re in the UK using the Buddhist calendar you must specify BE. And if you’re giving a year that could be either CE or BCE, you have to specify it to avoid ambiguity. + +LAF: [proposal involves including attributes in CLDR indicating the year ranges in which eras are required.] + +LAF: In France, for the Gregorian calendar, even if you generally hide the era it’s required in the ambiguous cases. But in France the era should always be displayed for the Buddhist calendar. However, if it’s known that in a given piece of content that all dates are expressed in the Buddhist calendar, it is admissible to hide the era. + +LAF: eraDisplay option has four values: +“auto”, which follows CLDR +“always”, which always displays the era +“never”, which does not ever show it +“hide”, which sets the “era” qualifier to “implied.” + + +SFC: First question, I need clarification on the differences between “auto” and “hide”. I think it’s a good idea to have this vary on locale – very useful. Another thing that your proposal does is introduce this concept of decoupling the calendar system from the reckoning system. This is a change that we’ll have to investigate in CLDR. We’ve investigated this in some areas in the sense that we allow eras to be shared between calendar systems. It’s not immediately clear to me that in addition to a calendar and an era system we also need a reckoning system? + +LAF: In “auto”, you would have the era displayed in European countries, say, if you use the Buddhist calendar. Whereas if you use “hide” with the Buddhist calendar, you have no need to show the era, since there is just one era strictly speaking. It’s a slight difference. Another example is that if you use any Muslim calendar, [...] + +SFC: So does “hide” mean “it is understood that we’re using the Buddhist calendar in this document, only display the era if it could be in a different epoch” + +LAF: Correct. + +SFC: We’ve previously suggested that the calendar that is actually used in Europe is not one of the 18 calendars, but we could add it – give it a code like “juli-greg” + +LAF: Yes! But there are differences – in the UK it changed in 1753, in France it changed in 1582 to 1585, in Germany it changed in 1700. I understand the thing is a little complicated, and putting things within CLDR could help. This is not essential, though. + +SFC: This is good material; we should continue this discussion and get it added to CLDR. once that’s added, we can move the ECMAScript proposal first. Could land in CLDR 46, so we have some time to move this forward. I may invite you, LAF, to a CLDR meeting to discuss the data model side with CLDR. + +### Introducing wrapper units like CurrencyNumber and UnitNumber to keep the options together for formatting and other operations. + +EAO: This was inspired by SFC’s concerns during the MessageFormat2 function registry discussions. We have number formatters and also DateTimeFormat that has options like unit or currency for number, for time zone for DateTimeFormat, that aren’t options for how to format a thing but instead part of the thing being formatted. So in JavaScript we’re missing a thing for containing a value and its unit, currency, time zone, etc., and I was wondering if there’d be support for a proposal to add something like this. I am mentioning it as a thing that might be interesting, but if someone’s going to champion it it’s not going to be me. + +SFC: Temporal is solving this problem with regard to time zones, and we’ve worked with them on how to do the formatting of a zoned date time. We’ve spent quite a bit of time having those discussions, so for time zones, yes. For currencies and units, it’s something that’s worth exploring. I know that with Jesse’s Decimal proposal in order to carry precision, which is something that carries semantic meaning, the Decimal proposal is going to solve that for precision of numbers. In terms of solving it for unit and currency, it’s a worthwhile area. I think the motivation is not as strong as for time zone, but units, definitely, we have a number of units proposals on the backburner – smart units and other things involving smart units – and when those go through a “unit with quantity” type would be interesting. We won’t have a currency conversion in the standard, though, because currencies change all the time. One approach might be to have the format function take the currency in an options bag, as opposed to in the constructor, but we also have to think of the impact on data loading costs. Overall, I’m supportive of this general direction if EAO or anyone else wants to pursue this. It’s independently motivated from MessageFormat. + +EAO: Much of my interest was from the MessageFormat side of things, and so although I support this advancing, I’m not going to be working on advancing it. + +### Intl.NumberFormat compact notation leaves out comma #751 + +https://github.com/tc39/ecma402/issues/751 + +SFC: Looks like this has been resolved in terms of discussion. It appears to be an engine bug. I would say that we need test coverage for this. It’s locale-dependent. + +RGN: Both looks valid – it’s a locale-dependent thing, and the formatting of a locale could change as new info comes in. it’s not clearly wrong, so I’m uncomfortable with tests asserting guarantees. + +YSZ: I think it’s possible that it’s already fixed or changed on the ICU side, or if it’s purely depending on the locale. + +SFC: I’ll just close this one. + +YSZ: V8 is also showing a comma in both cases, as well as SM. All three engines agree now. + + +### Clarification on PartitionDateTimePattern: Constructing NumberFormat instances + +https://github.com/tc39/ecma402/issues/563 + +BAN: Is efficiency really a concern for us in this context, given that spec text is written for readability rather than efficiency? + + +RGN: All else equal, it would be better to spec out an algorithm that if followed, would be more efficient. In this case, looking at FormatDateTimePattern, I see steps 4-9 are constructing NumberFormat dependent upon locale being provided in the DateTimeFormat itself, in this very long algorithm. I think it would make sense to take that construction and move it to be part of the DateTimeFormat object. This would also streamline the definition by taking steps that are currently the same every time through and moving them elsewhere. It’s purely editorial, but it would be my preference to move these steps elsewhere. + +SFC: This is exactly the type of thing that I’ve been pushing for in the Decimal proposal for reflecting the precision through the format function. Jesse’s proposal doesn’t handle minimum integer digits, because there isn’t a concept of leading zeroes. But as far as I’m concerned, this concept of minimum integer digits – I’ve seen people have to have five different formatters to handle different configurations. If they were in the format function, it wouldn’t be necessary to construct all these objects. Another potential direction could be to create a private AO on Intl.NumberFormat that’s able to use a different value of minimumIntegerDigits only during formatting. In an editorial way – we could also make a public API for doing this, but even just in the spec it’s a better approach. + +RGN: It’s against the general pattern in all of 402 that you give all the options in the constructor and you get a streamlined set of formatting functions. + +SFC: Yes, but minimum integers are a data model concern – “year 2” and “year 02” are different things. Year 2 is a long time ago, 02 could be 2002 or 1902. This is evidence for why minimum digits are semantic concerns. + +RGN: Potentially. The overall patterns for all the formatting functions is that they’re not parameterized by anything other than a value. If we were to look at changing that, I would want to do it holistically. But whether or not we do, editorially there’s refactoring that is possible here. + +### Numeric range validation is inconsistent #691 + +https://github.com/tc39/ecma402/issues/691 + +RGN: When we read in options for things like minimumIntegerDigits and maximumSignificantDigits, etc., is we are coercing. Part of that is validating against range boundaries. If what comes in coerces to a number that is non-integer and we get different behavior when we check it against the range before/after rounding. In Temporal, they landed on the approach of “range validation follows truncation” – work hard to get an integer, and only then check if it’s in or out of bounds. + +SFC: Are there cases in 262 where this comes up? It’s filed in 402, and we certainly have a lot of places in 402 where we do this range checking. + +RGN: I don’t know if it’s in 262 – I remember I looked, and if I had found anything I would have likely mentioned it. Maybe a toString? + +FYT: This checking range, what is it checking the range of? Checking if it’s from 0 to 9, something like that? + +RGN: If you look at something like SetNumberFormatDigitOptions, it reads in values that it passes to DefaultNumberOption with a range of acceptable values, minimum, maximum, fallback. Current definition of DefaultNumberOption compares the number against the range, and if it passes truncate and return result. + +FYT: And that could be floating point – oh boy. + +RGN: Similar code in ECMA-262 toFixed converts it to an integer and then checks against range. That’s different near the boundaries. Fixing this would make for more consistency between the specs, but it would be a normative change. I suspect, though, that no one is doing this and that the normative change is unlikely to affect code. + +SFC: This seems like something worth fixing. Is this something to bring to 262? Concrete suggestion: someone creates a PR against 402 fixing this, then we add a 30 minute agenda item to TG1 discussing the pull request and (if they consider it a good thing) change it in 262 if necessary. + +RGN: I like that solution a lot. Would anyone object to the normative change at all? It seems like there is no objection – we want to do this, we want to discuss it in plenary, and we want 262 editor feedback. + +BAN: +1 + +FYT: +1 + +#### Conclusion + +Someone to create a ECMA-402 PR and put it on the TG1 agenda to establish as a standard style to follow in these cases now and in the future. + +### How to show different language time formats in different countries #703 + +https://github.com/tc39/ecma402/issues/703 + + +FYT: So the issue is that if you specify the hour, it’ll just say “5” instead of “5 o’clock” or something like that. + +RGN: Do we agree that this issue is as described in the comment: resolvedOptions doesn’t round-trip for formatting behavior. Or is the issue something else? + +SFC: We could do what Anba suggested. Basically the action is that we should remap `r` to relatedYear [?] + +We should probably add test behavior here – it’s locale-dependent, but if there’s a year in the input options there should be a year in the resolvedOptions, and it’s kind of a bug if we’re not. Maybe a good action is to just introduce a unit to test262 that if you have year in the input field you should also have year in the output fields, and that this works in all calendars and all locales, with a primary focus on the case we know with Chinese formatting. + +RGN: When we’re talking about observable behavior, we’re talking about output from resolvedOptions? That’s the thing that changes based on fixing this? + +FYT: Where do we have that in the spec about related year? We do, actually. + +SFC: We have it as formatToParts + +FYT: Describing the internal pattern. How does that reflect to resolvedOptions? + +SFC: It has relatedYear and yearName, we just don’t have it in resolvedOptions and should. Anba’s solution is that if it has a related year, we add the year numeric. + +RGN: We’re not looking for the pattern, the spec is ignorant of that. + +FYT: It’s not an option! And the resolvedOption reflects what options are put in. + +RGN: In this case, what the difference between the output and the resolvedOptions? What are we actually looking at? Definitely year is dropped, which is a concrete thing we want to have fixed. It also changes minute as well, and second. That’s probably fine, though. + +SFC: Where are the slots even set? There, step 45 of CreateDateTimeFormat, they’re set using bestFormat, which is using the pattern. I think it’s a bug in the spec, and we should just fix it. In this case we should check not just for the year but also related year and year name.; + +FYT: So just a special case there? + +SFC: It could be a special case, but I’d note that there was another case I was looking at the description of the [[year]] field. We also have “{ampm}” and “{dayPeriod}”, so there might be a similar concern. + +RGN: Modification to step 45 such that… what? We’re looking at the selected best format, does not have field [[year]], but it does have field [[relatedYear]], then we would want to act as if it had field [[year]], but with what value? I don’t think the spec change is required here. I don’t like how the spec is structured, where we have dependencies, but I think it only goes the other way: anything mentioned in the pattern as a placeholder must exist as a field on the Record, but not the other way around. + +#### Conclusion + +Yes, the observed behavior is a bug: the output contains a year (relatedYear and yearName), so the resolvedOptions should also contain a year. Unclear if this is a spec bug or an implementation bug. Further investigation is warranted. + +## Review TG1 Agenda + +Reviewed. No 402 topics this cycle.