Skip to content
This repository has been archived by the owner on Jul 31, 2023. It is now read-only.

Returns array instead of iterator #23

Closed
FrankYFTang opened this issue May 17, 2021 · 20 comments
Closed

Returns array instead of iterator #23

FrankYFTang opened this issue May 17, 2021 · 20 comments

Comments

@FrankYFTang
Copy link
Collaborator

Split from #22
13. CreateSupportedValuesIterator ( list )

  • There needs to be a separate %SupportedValuesIterator% object, otherwise there's no next method to iterate over the elements.
  • I guess ? Yield(E) should be changed to ! Yield(E) given that there's no way to resume this generator with an abrupt completion. (I have no idea why ECMA-262 always uses fallible ? Yield() for all its iterators, even when the operation is infallible.)
  1. Intl.supportedValuesOf ( key [ , options ])
    I have some doubts about returning iterators instead of for example arrays, because:
  • The motivation for iterators was time zones, which has ~465 elements when called with no region. All other lists, with the exception of currencies, are significantly smaller.
  • But time zones can't be lazily retrieved from ICU, because we need to canonicalise and sort all entries.
  • So unless we account for the case where implementations cache the sorted and canonicalised list of time zones, using iterators over arrays doesn't buy us much. Especially because right now we don't yet have any good built-in Iterator support API and I don't see any real progress on https://github.com/tc39/proposal-iterator-helpers in the last 10 months.
  • And if we end up in a place where users always have to perform var array = [...Intl.supportedValuesOf(key)] to get something useful to operate upon, we may have made a design mistake.
  • The "Usage example" only shows how to use the new API, but not really when to use it. If I had a better feeling when to use this API, I could more easily say which return type is more appropriate.
    • I do see use cases for time zones and currencies, but other types may not be useful at the moment, because we don't provide means to localise the returned strings. For example currencies can be translated via Intl.DisplayNames and time zones can be translated via Intl.DateTimeFormat (and the "Extend TimeZoneName Option" proposal will even provide better localisations!), but neither calendars, nor collations, nor numbering systems, nor units can be localised right now. Let's say an application wants to provide a date control with a customisable calendar, just providing the BCP47 names to an end user won't be super useful. (Later: I totally forgot about the "Intl.DisplayNames v2" proposal, so "unit" is a better example for missing localisation atm.)
  • Please note: I can't say with 100% certainty which is the correct way to go forward, I'm just always a bit hesitant when new patterns are added, because that may also shape future APIs and because I think we should aim to have a consistent overall API approach.
  • Also: The "Intl Locale Info Proposal" proposal uses arrays for its return values, so it might be prudent to align the design decisions across these two proposals.
@FrankYFTang FrankYFTang changed the title Return array instead of iterator Returns array instead of iterator May 17, 2021
@FrankYFTang
Copy link
Collaborator Author

@anba @sffc

@ljharb
Copy link
Member

ljharb commented May 17, 2021

Arrays seem like they're much nicer to work with, that's for sure.

@sffc
Copy link
Collaborator

sffc commented May 18, 2021

I'm hesitant to spec an API that involves allocating an array of hundreds or thousands of elements.

@anba
Copy link
Collaborator

anba commented May 19, 2021

I'm hesitant to spec an API that involves allocating an array of hundreds or thousands of elements.

Only a couple of hundreds. 😄

  • ~465 strings for time zones
  • ~300 strings for currencies
  • 66 strings for numbering systems
  • 43 strings for units
  • 19 strings for collations
  • 18 strings for calendars

There are two angles here:

  1. The implementation aspect, for example the concern whether it's too costly to allocate a couple of hundred strings at once.
  2. The specification resp. user aspect, which depends on how we expect these functions to be used in practice.

Do you have concerns for both issues or only one of them?

@sffc
Copy link
Collaborator

sffc commented May 19, 2021

Do you have concerns for both issues or only one of them?

A bit of both.

I expect the common use case for these strings to involve a for loop iterating over them. It's well established software engineering best practice that iterators should not need to allocate extra memory.

@ljharb
Copy link
Member

ljharb commented May 19, 2021

If you’re iterating over all of them, you’ll almost always have to reify them into an array - especially prior to iterator helpers landing - so the memory cost would be smaller as an array than as an iterator + array, no?

@sffc
Copy link
Collaborator

sffc commented May 19, 2021

If you’re iterating over all of them, you’ll almost always have to reify them into an array

Not if you are:

  • Turning them into DOM elements instead of a JS array
  • Performing search/lookup/match operations where you are only going to store a subset of the strings

What are the use cases where you would actually want these in an array?

@ljharb
Copy link
Member

ljharb commented May 19, 2021

When i want to use filter, map, etc (absent iterator helpers). Also in general i prefer dealing with arrays of things in programming rather than having to do imperative loops all over the place.

@sffc
Copy link
Collaborator

sffc commented May 19, 2021

If you really want an array, you can use the spread operator ...

What I don't think we should do is force people to get an array from these functions if they don't need one. It seems like bad API design for new APIs. Python, Java, Rust, C++, and many other languages have moved from arrays to iterators a long time ago.

@FrankYFTang
Copy link
Collaborator Author

FrankYFTang commented May 19, 2021

Currently in Intl API all the formatToParts return array, especially for example- Intl.ListFormat.prototype.formatToParts. If the input is a list of 500 items the return value will be an array of >500 items. so it is NOT the first method Intl API return array, right? How come when we added formatToParts we decided to return array instead of iterator if that will be an issue, especially for Intl.ListFormat.prototype.formatToParts which take an iterator as input...

@sffc
Copy link
Collaborator

sffc commented May 19, 2021

formatToParts is fundamentally different. It's the same as the string returned by .format but with additional annotations, computed on the fly and returned to the client. I see this also in the same category as the regex groups function.

Here, the data is deterministic. Under the hood, we probably have a static array-like structure stored in C++. We want to give a way to access that static C++ array without having to copy the whole thing into JS every time someone calls the function.

@FrankYFTang
Copy link
Collaborator Author

but your "I expect the common use case for these strings to involve a for loop iterating over them. It's well established software engineering best practice that iterators should not need to allocate extra memory." statement also apply to formatToParts, right?

@ljharb
Copy link
Member

ljharb commented May 19, 2021

@sffc the alternative is forcing people to make an array if they don't need an iterator. For a relatively finitely-bounded set of values, I don't think either choice is inherently better or less inconvenient for some.

@sffc
Copy link
Collaborator

sffc commented May 19, 2021

but your "I expect the common use case for these strings to involve a for loop iterating over them. It's well established software engineering best practice that iterators should not need to allocate extra memory." statement also apply to formatToParts, right?

No, my statement doesn't apply to formatToParts, because there is no way to avoid allocating memory. If we didn't give the array to JS, we'd still have to allocate it in C++.

@FrankYFTang
Copy link
Collaborator Author

FrankYFTang commented May 20, 2021

No, my statement doesn't apply to formatToParts, because there is no way to avoid allocating memory. If we didn't give the array to JS, we'd still have to allocate it in C++.

hum? Why is that? In v8, Intl.ListFormat.prototype.formatToParts is calling nextPosition() of icu::FormattedList in C++ to iterate through, so internally it is calling an iterator, If the API was designed to return iterator, it could surely remember the icu::FormattedList internal slot and allocate the memory of one call at a time. Say if the input is an iterator of 1000 item and the formatTo parts has 2003 parts, the next of JS will be called 2003 times and each time it call the nextPosition() and allocate memory for that, right?

@FrankYFTang
Copy link
Collaborator Author

In 2021-06-03 ECMA402 TG2 we spend a lot of time discuss this. No agreement reach. I will put down each persons's opinion later. I will also change the spec in a way to have both approach available and label and back to ECMA402 again later to let people take a look before the July meeting.

@sffc
Copy link
Collaborator

sffc commented Jun 3, 2021

@FrankYFTang
Copy link
Collaborator Author

FrankYFTang commented Jun 4, 2021

What people paste into the chat channel about this issue

Louis-Aimé de Fouquières12:30

Equal to me.

Jeff Walden12:31

Yeah -- JS writers are going to assume an array here, most likely. And JS doesn't have the same high-quality standalone iteration helpers like Python/Rust/etc. have, to make working with iterators easy.

Leo Balter12:31

for the records, I'm ok either way as well

Yusuke Suzuki12:31

I'm ok for Array too.

Richard Gibson12:31

I'm also ok either way, but share Jordan's dispreference for introducing new iterator prototypes

Leo Balter12:34

Not any strong preference, but Arrays seems more intuitive for Developers Experience. In cases like this, I'd consult with JS Educators.
https://twitter.com/laurieontech
Laurie helped with pretty good insights on other proposals, it might be the case here.

@FrankYFTang
Copy link
Collaborator Author

Here is what I will do. I will put in both the specification for Iterator and Array into the proposal and mark them so next time we can take a look at two different version of spec text to vote for.

@FrankYFTang
Copy link
Collaborator Author

I now change it to array.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants