Is a TypedArray a sequence? #868

Closed
guest271314 opened this issue Apr 5, 2020 · 12 comments

@guest271314

Consider the following from File API

To process blob parts given a sequence of BlobPart's parts and BlobPropertyBag options, run the following steps:
[...]
2. For each element in parts:
[...]
2. If element is a BufferSource, get a copy of the bytes held by the buffer source, and append those bytes to bytes.

and this repository description of sequence

2.13.27. Sequence types — sequence
The sequence<T> type is a parameterized type whose values are (possibly zero-length) lists of values of type T.

Sequences are always passed by value. In language bindings where a sequence is represented by an object of some kind, passing a sequence to a platform object will not result in a reference to the sequence being kept by that object. Similarly, any sequence returned from a platform object will be a copy and modifications made to it will not be visible to the platform object.

The literal syntax for lists may also be used to represent sequences, when it is implicitly understood from context that the list is being treated as a sequence. However, there is no way to represent a constant sequence value inside IDL fragments.

Sequences must not be used as the type of an attribute or constant.

Note: This restriction exists so that it is clear to specification writers and API users that sequences are copied rather than having references to them passed around. Instead of a writable attribute of a sequence type, it is suggested that a pair of operations to get and set the sequence is used.

The type name of a sequence type is the concatenation of the type name for T and the string "Sequence".

Any list can be implicitly treated as a sequence<T>, as long as it contains only items that are of type T.

It is not immediately clear if a JavaScript TypedArray is considered a sequence either in File API or in this specification.

When [] is not used in the Blob constructor and a TypedArray is passed directly, the values of the TypedArray are converted to strings. This results in a RangeError when attempting to convert the ArrayBuffer representation of the Blob back to a TypedArray:

var floats = new Float32Array([0.00005549501292989589, 0.00006459458381868899, 0.000058644378441385925, 0.00006201512587722391]);
var file = new Blob(floats); // pass TypedArray
file.arrayBuffer().then(b =>  console.log(new Float32Array(b))).catch(console.error);
// Chromium error message:
// RangeError: byte length of Float32Array should be a multiple of 4
//     at new Float32Array (<anonymous>)
// Nightly error message:
// RangeError: "attempting to construct out-of-bounds TypedArray on ArrayBuffer"
// why?
file.text().then(console.log).catch(console.error);
// Promise {<pending>}
// 0.000055495012929895890.000064594583818688990.0000586443784413859250.00006201512587722391
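
One way to see where the RangeError comes from is to reproduce the byte count without a Blob. This is only a sketch, assuming (as the replies further down explain) that each number in the TypedArray is stringified and the strings are concatenated. The sample values and variable names are illustrative and are not taken from the report above.

```javascript
// Illustrative values whose string forms are short and exact.
const samples = Float32Array.of(0.5, 1.5, 2);

// Model of what the Blob holds when the TypedArray is passed bare:
// each element is stringified and the strings are concatenated.
const joined = Array.from(samples, String).join("");
const encoded = new TextEncoder().encode(joined); // UTF-8 bytes of that text

console.log(joined, encoded.byteLength); // 0.51.52 7

// Copy the bytes into a buffer of exactly that length, then try to view it
// as 32-bit floats. 7 is not a multiple of 4, so the constructor throws.
let viewError = null;
try {
  new Float32Array(Uint8Array.from(encoded).buffer);
} catch (err) {
  viewError = err;
}
console.log(viewError instanceof RangeError); // true
```

With four floats the original data would occupy 16 bytes, whereas the stringified form has whatever length the decimal text happens to have.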

The RangeError can be avoided, and conversion from ArrayBuffer back to TypedArray is possible, when the value passed to the Blob constructor is wrapped in []:

var floats = new Float32Array([0.00005549501292989589, 0.00006459458381868899, 0.000058644378441385925, 0.00006201512587722391]);
var file = new Blob([floats]); // pass TypedArray within []
file.arrayBuffer().then(b =>  console.log(new Float32Array(b))).catch(console.error);
// Promise {<pending>}
// Float32Array(4) [0.00005549501292989589, 0.00006459458381868899, 0.000058644378441385925, 0.00006201512587722391]
// why?
file.text().then(console.log).catch(console.error);
// Promise {<pending>}
// Q�h8�v�8��u8���8

If a TypedArray is already a sequence, it should not be necessary to wrap the TypedArray in an array literal in the Blob constructor to avoid the TypedArray values being converted to a string.

See w3c/FileAPI#147 (comment) and the test cases that follow at w3c/FileAPI#147 (comment).

Is a TypedArray a sequence?

Is this working as intended?

@guest271314
Author

Browsing this repository for what a sequence is: unless missing a critical part of what is being described, per https://tc39.es/ecma262/#sec-%typedarray%.prototype-@@iterator a TypedArray is an iterable, and an iterable is converted to a sequence, per #325 (comment):

Any iterable that authors pass is converted to a sequence<>, with the checking of values done during the conversion.

Therefore it should not be necessary to do

new Blob([TypedArray])

for the TypedArray to not be converted to a string

new Blob(TypedArray)

should suffice for passing an ArrayBufferView that should not be converted to a string, even when the array literal [ArrayBufferView] is omitted at the Blob (and File) constructors?

@bathos
Contributor

bathos commented Apr 5, 2020

The argument in question is optional sequence<BlobPart>, where BlobPart is (BufferSource or Blob or USVString), and where Blob is the interface type whose constructor this is.

You’re correct that a TypedArray can be converted to a sequence. But that’s exactly what you’re seeing. The members of the iterable become strings because they’re originally numbers, and numbers can be coerced to USVString (in BlobPart) but not to BufferSource or Blob.
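
The matching described above can be modelled roughly as follows. This is a simplified sketch, not the Web IDL algorithm itself: `classifyBlobParts` is a hypothetical helper, it ignores details such as how USVString handles lone surrogates, and it treats everything that is not a Blob or a buffer as a string.

```javascript
// Hypothetical helper (not a platform API): a rough model of how each element
// of the iterable is matched against BlobPart = (BufferSource or Blob or USVString).
function classifyBlobParts(iterable) {
  const parts = [];
  for (const element of iterable) {
    if (typeof Blob !== "undefined" && element instanceof Blob) {
      parts.push({ type: "Blob", value: element });
    } else if (element instanceof ArrayBuffer || ArrayBuffer.isView(element)) {
      parts.push({ type: "BufferSource", value: element });
    } else {
      // Anything else falls through to the string member of the union.
      parts.push({ type: "USVString", value: String(element) });
    }
  }
  return parts;
}

const floats32 = Float32Array.of(0.5, 1.5);

// Bare TypedArray: the iterable's members are numbers, so each becomes a string.
console.log(classifyBlobParts(floats32).map((p) => p.type).join(", "));
// USVString, USVString

// Wrapped: the single member is the TypedArray itself, which is a BufferSource.
console.log(classifyBlobParts([floats32]).map((p) => p.type).join(", "));
// BufferSource
```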

@domenic
Member

domenic commented Apr 5, 2020

Closing, since this has been answered.

@domenic closed this as completed Apr 5, 2020
@guest271314
Author

@bathos

This is the part that am not grasping

The members of the iterable become strings because they’re originally numbers, and numbers can be coerced to USVString (in BlobPart) but not to BufferSource or Blob.

Why does new Blob(TypedArray) result in a different output than new Blob([TypedArray])?

What does [TypedArray] do that TypedArray alone does not in that case?

@bathos
Contributor

bathos commented Apr 5, 2020

The Float32Array is a BufferSource.

@guest271314
Author

What happens here:

var MY_JSON_FILE = [[[`{
  "hello": "world"
}`]]];

var blob = new Blob([[MY_JSON_FILE]]);

var fr = new FileReader();

fr.addEventListener("load", e => {
  console.log(e.target.result)
});

fr.readAsText(blob);

outputs

{
  "hello": "world"
}

Where do the extra [ and ] disappear to? Why are the extra [ and ] not also converted to a string?

Am trying to concretely determine when [ and ] are required in order to avoid string conversion.

@bathos
Contributor

bathos commented Apr 5, 2020

Array.prototype.toString invokes Array.prototype.join, so the array's members are coerced to string and joined with comma. As there is only one member at every level, no commas are added, thus [[[[['x']]]]] coerces to the string "x".
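
A minimal illustration of that coercion, using made-up values, shows that the brackets never appear in the output and that commas only appear once a level has more than one member:

```javascript
const nested = [[[[["x"]]]]];

// Each level has a single member, so join(",") never inserts a comma.
console.log(String(nested)); // x

// With two members at the inner level, the comma shows up.
console.log(String([["a", "b"]])); // a,b

// toString() on an array gives the same result as join(",").
console.log(nested.toString() === nested.join(",")); // true
```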


@guest271314
Author

new Blob(TypedArray) is expected to be converted to string while new Blob([TypedArray]) is not expected to be converted to a string? Or, that is, both are converted to strings, yet different forms of strings? Working as intended?

(Should a note be included in File API conveying there is a difference between passing the two options to the constructor, or is that expected to be common knowledge?)

@bathos
Contributor

bathos commented Apr 5, 2020

In neither case is the TypedArray converted to a string. In the former case, the members of the iterable are converted to strings, and in the latter case they are not, because the TypedArray is already a BufferSource.

In other words, the following two statements are effectively the same — both provide a sequence whose members are numbers, and those numbers end up being cast to string:

new Blob(Uint8Array.of(1, 2, 3));
new Blob([ 1, 2, 3 ]);
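
The difference can also be checked synchronously through `Blob.prototype.size`. The sketch below assumes an environment with a global `Blob` (current browsers, or Node.js 18 and later), and uses two-digit values so that the stringified and binary forms have different byte lengths.

```javascript
// Members are numbers in both cases, so the content is the text "102030".
const bare = new Blob(Uint8Array.of(10, 20, 30));
const fromNumbers = new Blob([10, 20, 30]);

// The single member is the Uint8Array itself, so the content is its 3 raw bytes.
const wrapped = new Blob([Uint8Array.of(10, 20, 30)]);

console.log(bare.size, fromNumbers.size, wrapped.size); // 6 6 3
```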

@guest271314
Author

That clarifies the algorithm somewhat. However, that example, and the fundamental difference between passing a TypedArray directly and passing a TypedArray wrapped in an array literal [], are not immediately clear in the File API specification. At some point in the past, if recollect accurately, it was mandatory to pass [] to the File constructor, though not the Blob constructor. There is obviously a different output when

new Blob([Uint8Array.of(1, 2, 3)]).text().then(console.log)

is used compared to

new Blob(Uint8Array.of(1, 2, 3)).text().then(console.log)

However, as asked above, is that difference simply expected to be common knowledge, without a published example and an accompanying description of the two options?

@bathos
Contributor

bathos commented Apr 5, 2020

I couldn’t say what the intentions or history are. In theory, the type of the argument could have been (BlobPart or sequence<BlobPart>) and due to the algorithm for coercing a union type, a lone TypedArray would then end up recognized as a BlobPart rather than a sequence, I believe. That is probably what you were hoping for / would have found more intuitive.
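
In userland, the behaviour of that hypothetical union type can be approximated with a small wrapper. This is a sketch only: `blobFrom` is an invented name, not part of File API, and it assumes a global `Blob` (current browsers, or Node.js 18 and later).

```javascript
// Hypothetical helper: accepts either a single part or a sequence of parts,
// checking for a single part first, as a union type would.
function blobFrom(partOrParts, options) {
  const isSinglePart =
    typeof partOrParts === "string" ||
    partOrParts instanceof ArrayBuffer ||
    ArrayBuffer.isView(partOrParts) ||
    partOrParts instanceof Blob;
  return new Blob(isSinglePart ? [partOrParts] : partOrParts, options);
}

const audio = Float32Array.of(0.5, 1.5);

console.log(blobFrom(audio).size);   // 8, the raw bytes of two 32-bit floats
console.log(blobFrom([audio]).size); // 8, same result when already wrapped
console.log(new Blob(audio).size);   // 6, the text "0.5" + "1.5"
```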

Whether such a change is web-compatible now, I’m not sure; you could open an issue on the File API repo if you think it’s worth investigating. I’d suggest searching there as well in case this has come up previously. It wouldn’t be a Web IDL level issue.

FWIW, just to keep the picture clear, it’s not that an array is mandatory. It’s that the argument’s type is sequence<BlobPart>. Arrays are just the most common way to provide a sequence of values, but this could be any iterable other than string, e.g. a Set or a generator instance.
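
A short sketch of non-array iterables being accepted, again assuming a global `Blob` (current browsers, or Node.js 18 and later). The names and values are illustrative.

```javascript
// A Set of strings: two parts of two bytes each.
const fromSet = new Blob(new Set(["ab", "cd"]));

// A generator instance yielding a string and then a BufferSource.
function* chunks() {
  yield "ab";
  yield Uint8Array.of(1, 2, 3);
}
const fromGenerator = new Blob(chunks());

console.log(fromSet.size, fromGenerator.size); // 4 5

// A string is iterable in JavaScript but is not accepted as a sequence.
let stringError = null;
try {
  new Blob("abcd");
} catch (err) {
  stringError = err;
}
console.log(stringError instanceof TypeError); // true
```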

@guest271314
Author

Am not sure if it is worth investigating from a specification standpoint.

The first time encountered that was probably due to user error, by passing an array of arrays (of Uint8Arrays), which were images retrieved in a Promise.all() then() callback. The second occasion was passing a Float32Array (audio) directly to the Blob() constructor, then saving that as a file with no extension set (where got a string as output). FWIW, the goal is to stream video and audio as Uint8Array, which is then converted to Uint16Array and Float32Array.

Yes, experimented with a Set passed to Blob during the same session as the tests linked above, which tried different values passed to the Blob and File constructors; probably a constructor argument less prone to user error, since each Blob or File will be unique, and it to some extent prevents the array of arrays case. From a "human readable" perspective, simply considered a TypedArray an iterable that could be passed directly to the Blob constructor without the expectation that the output would be a string, which is what encountered when attempting to parse the Float32Array that was saved as a file without an extension.

From a front-end standpoint the inclusion or omission of [] within the Blob constructor can result in considerably different outputs. Will search File API for this topic and similar issues. Prefer to file bugs/issues immediately, since perform a modicum of experiments at the front-end, and the browser and OS could crash at any time, at which point the experimentation and results would be lost.
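
For the stated goal of carrying audio as a Uint8Array and converting it to a Float32Array later, a minimal round trip might look like the following. The variable names and sample values are assumptions for illustration, and the copy step stands in for whatever transport is actually used.

```javascript
const audioSamples = Float32Array.of(0.5, -0.25, 1);

// View the same memory as raw bytes (no copy is made here).
const audioBytes = new Uint8Array(
  audioSamples.buffer,
  audioSamples.byteOffset,
  audioSamples.byteLength
);

// Stand-in for sending and receiving: copy into a fresh buffer, so the
// data starts at offset 0 and is aligned for a Float32Array view.
const received = Uint8Array.from(audioBytes);

const restored = new Float32Array(
  received.buffer,
  0,
  received.byteLength / Float32Array.BYTES_PER_ELEMENT
);

console.log(Array.from(restored).join(", ")); // 0.5, -0.25, 1
```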
