Import with type text, bytes, and URL #9444

kriskowal · 2023-06-19T17:07:43Z

I’m working with a group of TC39 delegates on what we call Module Harmony, an effort to make proposals pertaining to the module system coherent. I am consequently looking for the right venue to propose and establish a precedent for a host integration with modules, specifically to address the portability of code that uses the module system to express a dependency upon plain text, bytes, or references to assets. Concretely, I would like to propose that:

import text from 'text.txt' with { type: 'text' };
import bytes from 'bytes.oct' with { type: 'bytes' };
import imageUrl from 'image.jpg' with { type: 'url' };

Such that:

typeof text === 'string';
bytes instanceof Uint8Array;
typeof imageUrl === 'string'; // edit: was instanceof URL

So that a module can express these kinds of dependency in a way that is portable. Specifically, I aim for a program to be run on the server side and the client side of a web application, both raw and through an optimizing translation (e.g., bundling). With import attributes, ECMA-262 is already sufficiently expressive to allow a host integration to address this problem without additional features, and such an integration would be coherent with future 262 proposals, particularly virtual module sources.
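To make the intended shapes concrete, here is a minimal sketch that emulates the value each proposed type would expose as the default export, using only the standard TextDecoder, Uint8Array, and URL. The helper name emulateDefaultExport and its parameters are hypothetical, and decoding text as UTF-8 is an assumption; nothing here describes how a host would actually implement the feature.

```javascript
// Hypothetical helper, not part of any proposal: given the resolved URL and
// the fetched body of an asset, return the value each proposed `type`
// would expose as the module's default export.
function emulateDefaultExport(type, resolvedUrl, body) {
  switch (type) {
    case 'text':
      // Assumption: text is decoded as UTF-8.
      return new TextDecoder('utf-8').decode(body);
    case 'bytes':
      // Copy so the caller's buffer is not aliased.
      return new Uint8Array(body);
    case 'url':
      // A serialized URL string, not a URL object.
      return new URL(resolvedUrl).href;
    default:
      throw new TypeError(`Unsupported type: ${type}`);
  }
}

const body = new TextEncoder().encode('hello');
const text = emulateDefaultExport('text', 'https://example.com/text.txt', body);
const bytes = emulateDefaultExport('bytes', 'https://example.com/bytes.oct', body);
const imageUrl = emulateDefaultExport('url', 'https://example.com/image.jpg', body);
```

With these values, the three checks listed above (typeof text === 'string', bytes instanceof Uint8Array, typeof imageUrl === 'string') all hold.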

bathos · 2023-06-19T17:50:52Z

Re: the URL import, using a URL instance for this seems contradictory to guidance in WHATWG URL:

A standard that exposes URLs should expose the URL as a string (by serializing an internal URL). A standard should not expose a URL using a URL object. URL objects are meant for URL manipulation.

For a module, not using the mutable URL representation would seem particularly important, I’d think?
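A small sketch of the hazard, assuming only the standard URL class: a module's exports are shared by every importer, so an exported URL object could be changed in place by one consumer and observed by all the others, whereas a serialized string cannot. The variable names are illustrative.

```javascript
// One URL object stands in for a value exported by a module and shared
// between importers; the string is its serialization at export time.
const sharedUrlObject = new URL('https://example.com/image.jpg');
const sharedUrlString = sharedUrlObject.href;

// One consumer "adjusts" the URL object in place...
sharedUrlObject.pathname = '/other.png';

// ...and every other holder of the same object now sees the change,
// while the previously serialized string is unaffected.
console.log(sharedUrlObject.href); // https://example.com/other.png
console.log(sharedUrlString);      // https://example.com/image.jpg
```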

kriskowal · 2023-06-19T17:52:56Z

For a module, not using the mutable URL representation would seem particularly important, I’d think?

A string representation of the URL would entirely satisfy the motivating use cases.

annevk · 2023-06-20T06:48:01Z

Looking at https://fetch.spec.whatwg.org/#body-mixin I wonder if we want arrayBuffer instead, but I suppose that was a mistake on Fetch's part and it should have been bytes returning a view (we could still add that I suppose).

I'm not sure I understand how url works. How is it different from import.meta.resolve('image.jpg')?

Do we need to solve @domenic's #7017 about feature detection at the same time?

There's also #4321 from @Jamesernator and #7706 from @7ombie. These all look like duplicates, but I'm fine with keeping them open until we have some kind of plan. One thing that's raised in the latter that's important here is what to do about MIME types. Would we not check response MIME types for these, similar to Fetch? Or would we try to enforce something?

cc @whatwg/modules

littledan · 2023-06-20T15:24:52Z

How is it different from import.meta.resolve('image.jpg')

Good question. The semantics would actually be the same. One piece of motivation is that this form is more "declarative"-looking and therefore statically analyzable (which should mostly help build tools, given that not enough information is available for a prefetcher to use this). See more information about motivation (for a previous iteration of this idea) at https://github.com/tc39/proposal-asset-references . Also note that some people in TC39 are considering whether we should propose some other syntax for this, besides using import attributes.
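As a rough illustration of "the same semantics", the sketch below resolves a relative specifier against the importing module's URL, which is what resolution amounts to when no import map applies. The function resolveRelative is a hypothetical stand-in for import.meta.resolve (it is not that API), and moduleUrl stands in for import.meta.url.

```javascript
// Hypothetical stand-in for import.meta.resolve, limited to relative
// specifiers and ignoring import maps entirely.
function resolveRelative(specifier, moduleUrl) {
  // Accept only "/", "./", and "../" prefixed specifiers.
  if (!/^\.{0,2}\//.test(specifier)) {
    throw new TypeError(`Not a relative specifier: ${specifier}`);
  }
  return new URL(specifier, moduleUrl).href;
}

const resolved = resolveRelative('./image.jpg', 'https://example.com/app/main.js');
console.log(resolved); // https://example.com/app/image.jpg
```

The difference under discussion is not the resulting value but where the specifier appears: a static import declaration is visible to tools without running the program.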

kriskowal · 2023-06-20T16:46:30Z

Yes, this would provide a statically analyzable alternate route to the same url value, analogous to static vs dynamic import. This is interesting less for the web itself than because it establishes a convention that build tooling would benefit from.

For example, a bundler that takes a whole web application directory tree and generates a new tree would be able to discern the dependency and rewrite the URL.
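A minimal sketch of the static discovery step, assuming a deliberately naive regular expression rather than a real parser. The names ASSET_IMPORT and findAssetImports are hypothetical; an actual bundler would work from the module's syntax tree.

```javascript
// Naive pattern for `import name from 'specifier' with { type: '...' }`.
// It handles only this exact shape and is for illustration only.
const ASSET_IMPORT =
  /import\s+(\w+)\s+from\s+(['"])([^'"]+)\2\s+with\s*\{\s*type:\s*(['"])(text|bytes|url)\4\s*\}/g;

// Return one record per asset import found in the source text.
function findAssetImports(source) {
  return [...source.matchAll(ASSET_IMPORT)].map((match) => ({
    binding: match[1],
    specifier: match[3],
    type: match[5],
  }));
}

const source = `
import text from 'text.txt' with { type: 'text' };
import bytes from 'bytes.oct' with { type: 'bytes' };
import imageUrl from 'image.jpg' with { type: 'url' };
import { helper } from './helper.js';
`;

const assets = findAssetImports(source);
```

Each record names a dependency the tool can copy, fingerprint, or rewrite without executing the program, which is the property the declarative form provides.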

A bundler that takes a whole web application tree and generates a single JavaScript file would have the option of embedding the underlying data as a data URL.
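A sketch of the embedding option, assuming the bundler already has the asset's bytes and knows its MIME type. The helper toDataUrl is hypothetical; it uses the standard btoa function for base64 encoding.

```javascript
// Hypothetical helper: serialize bytes as a base64 data URL.
function toDataUrl(bytes, mimeType) {
  // btoa expects a "binary string" with one character per byte.
  let binary = '';
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// The two bytes of the ASCII text "hi".
const embedded = toDataUrl(new Uint8Array([0x68, 0x69]), 'text/plain');
console.log(embedded); // data:text/plain;base64,aGk=
```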

That’s to say, any static syntax that reveals the url of an asset in a way that implies a dependency that needs to be arranged by a bundler is an improvement on the status quo. This is one of the options we are considering.

As @annevk mentions in chat, this approach has the disadvantage of introducing a code path under the host import hook that bypasses a fetch.

For this reason, the alternative approach is to introduce another import phase, as we do with import source and import defer proposals, except the phase would occur before fetch. This has a different smell: it is not clear that such a module would advance beyond the asset phase. It is clear that it would not compose well with import with type, since the type is irrelevant unless we advance to fetch. We would presumably be obliged to allow the module system to fetch an image (for example) and fail to interpret it as JavaScript.

The implication for Module virtualization is that an asset import would have to bypass the import hook and provide an alternate lane that can be interrupted before fetch (to produce a url) and then again before parse (to produce bytes or text) before possibly proceeding to produce source, at which point it will have done all the work currently subsumed by the host import hook.

[added:]

The implication for Module virtualization if we pursue import with type is simply that these are different module source types that terminate at exporting a default value when they’re evaluated. So, the proposed import hook virtualization would just return a non-JavaScript module source with the appropriate behavior.

Jamesernator · 2023-06-29T06:25:09Z

There's also #4321 from @Jamesernator

The suggestions there had quite a different flavour, given that at the time JSON modules were proposed to be derived based on MIME type, rather than the current approach that uses import attributes (with which the MIME type must agree).

This new style with import attributes is strictly more useful, as one can interpret essentially anything as an array buffer/text regardless of its actual MIME type.

E.g., in my previous suggestion, text would only be successfully imported if it were text/plain, but a lot of content might be served as text/yaml, text/json5, etc.

As such that old issue can be closed in strong favour of this one.

One thing that's raised in the latter that's important here is what to do about MIME types. Would we not check response MIME types for these, similar to Fetch? Or would we try to enforce something?

For urls there's obviously nothing to do as no fetching is involved.

For array buffers, checking MIME types is undesirable, as people might be loading any content for some processing (e.g. images, audio, and application-specific formats are all reasonable things to import as array buffers).

For text, checking the type/essence raises the same issue as for array buffers: any MIME type (not just text/*) might contain text. However, we do need to know about encoding, so the charset parameter should probably be respected.
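A minimal sketch of respecting the charset parameter, assuming the host has the response's Content-Type header and raw bytes at hand. The helper decodeWithCharset and its simplified header parsing are hypothetical; the UTF-8 default is an assumption, and TextDecoder throws a RangeError for labels it does not recognize.

```javascript
// Hypothetical helper: decode bytes using the charset named in a
// Content-Type header, ignoring the MIME type essence itself.
function decodeWithCharset(bytes, contentType) {
  // Simplified parameter parsing; a real implementation would follow
  // the MIME Sniffing Standard's parsing rules.
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || '');
  const label = match ? match[1] : 'utf-8'; // Assumption: UTF-8 by default.
  return new TextDecoder(label).decode(bytes);
}

// "hi" encoded as UTF-16LE, served with a non-text/plain type.
const utf16 = decodeWithCharset(
  new Uint8Array([0x68, 0x00, 0x69, 0x00]),
  'text/yaml; charset=utf-16le',
);

// "hi" encoded as UTF-8, served without any charset parameter.
const utf8 = decodeWithCharset(new Uint8Array([0x68, 0x69]), 'application/json');
```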

Alternatively for text, we could have a separate attribute that indicates what format to decode as (potentially useful if the server doesn't know what charset files are using).

import someText from "./file.txt" with { type: "text", encoding: "utf16" };
import someText from "./oldData.dat" with { type: "text", encoding: "latin2" };

// Would default to utf8 naturally so these would be equivalent
import someText2 from "./file2.ini" with { type: "text", encoding: "utf8" };
import someText2 from "./file2.ini" with { type: "text" };

Jarred-Sumner · 2024-04-23T09:55:09Z

In Bun v1.1.5, we are adding bundler & runtime support for text, json & toml. text is UTF-8 and replaces invalid UTF-8 with U+FFFD. We probably will support BOM later to handle UTF-16. Named imports (excluding default) with type: “text” throw an error at parse time.

oven-sh/bun#10456
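For reference, the replacement behaviour described above matches what the standard TextDecoder does by default. The sketch below shows that behaviour and the stricter fatal mode; it illustrates the Encoding Standard's API and makes no claim about how Bun implements its text imports.

```javascript
// 'a', an invalid UTF-8 byte, then 'b'.
const invalid = new Uint8Array([0x61, 0xff, 0x62]);

// Default (non-fatal) decoding substitutes U+FFFD for the invalid byte.
const lenient = new TextDecoder('utf-8').decode(invalid);
console.log(lenient === 'a\uFFFDb'); // true

// Fatal decoding throws a TypeError instead of substituting.
let strictFailed = false;
try {
  new TextDecoder('utf-8', { fatal: true }).decode(invalid);
} catch (error) {
  strictFailed = error instanceof TypeError;
}
console.log(strictFailed); // true
```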

kriskowal · 2024-10-18T05:59:26Z

Kindly consider the TC39 Stage 1 immutable ArrayBuffer proposal for type: 'bytes'. https://github.com/tc39/proposal-immutable-arraybuffer
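A short sketch of why immutability matters here, using only what the language offers today and none of the proposal's API. A module's default export is shared by every importer, an ordinary Uint8Array can be written through any reference, and Object.freeze cannot prevent that because freezing a non-empty typed array throws. The variable names are illustrative.

```javascript
// Stands in for the default export of a `type: 'bytes'` module.
const exportedBytes = new Uint8Array([1, 2, 3]);

// Two importers receive the same object.
const seenByImporterA = exportedBytes;
const seenByImporterB = exportedBytes;

// A write by one importer is visible to the other.
seenByImporterA[0] = 255;
console.log(seenByImporterB[0]); // 255

// Freezing is not an option for typed arrays that have elements.
let freezeError = null;
try {
  Object.freeze(exportedBytes);
} catch (error) {
  freezeError = error;
}
console.log(freezeError instanceof TypeError); // true
```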

7ombie · 2024-11-12T23:10:37Z

Sorry if this is a dumb question, but why do we care about the MIME type for raw bytes and UTF-8? I thought that was a security concern that stemmed from the fact that browsers parse the result. If we just get the bytes or characters we asked for (like a static fetch), I'm not sure why it needs to be any tighter than that.

All a static analyzer would see is a filepath that ends in some extension (say .png), and it must (at least generally) assume it's a path to a PNG file.

kriskowal · 2024-11-12T23:19:26Z

I find it reasonable to enable or even encourage import with type bytes to also specify the expected MIME type, possibly with another attribute like mimeType, so that the module cannot be deceived into misinterpreting the imported content. This matters especially for dynamic import, for example, import(location, { with: { type: 'bytes', mimeType: 'image/png' } }). I would want the mimeType assertion to be optional, since not all binary data that modules can usefully interpret has a MIME type, except insofar as application/octet-stream is sufficiently abstract to apply to anything.
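A minimal sketch of what such an optional check could look like on the host side, assuming it compares the essence of the response's Content-Type against the requested value. The mimeType attribute is only a suggestion in this thread, and the helpers essenceOf and checkMimeType are hypothetical.

```javascript
// Hypothetical helper: strip parameters and normalize case.
function essenceOf(contentType) {
  return contentType.split(';')[0].trim().toLowerCase();
}

// Hypothetical check: passes when no expectation is given (the assertion
// is optional), throws when the response type does not match.
function checkMimeType(contentType, expected) {
  if (expected === undefined) {
    return true;
  }
  const actual = essenceOf(contentType);
  if (actual !== expected.toLowerCase()) {
    throw new TypeError(`Expected ${expected} but received ${actual}`);
  }
  return true;
}

const matched = checkMimeType('image/png', 'image/png');
const unchecked = checkMimeType('application/octet-stream', undefined);

let mismatchRejected = false;
try {
  checkMimeType('text/html; charset=utf-8', 'image/png');
} catch (error) {
  mismatchRejected = error instanceof TypeError;
}
```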

7ombie · 2024-11-13T00:02:55Z

@kriskowal - That makes perfect sense. There's a benefit in being able to opt into extra checks and balances, but no reason to require them.

Jamesernator · 2024-12-03T01:41:44Z

Something that probably should be done with { type: "url" } is the ability to set a destination so the browser can preload into the right place. i.e.:

import imageUrl from "./image.png" with { type: "url", preloadAs: "image" };

import workerUrl from "./worker.js" with { type: "url", preloadmoduleAs: "worker" };
// Or with source phase imports: https://github.com/tc39/proposal-esm-phase-imports
import source workerSrc from "./worker.js" with { preloadmoduleAs: "worker" };

annevk added the topic: script label Jun 20, 2023

annevk added the addition/proposal and needs implementer interest labels Jun 20, 2023

Jamesernator mentioned this issue Jun 29, 2023

Additional module types #4321

Closed

petamoriken mentioned this issue Jul 31, 2024

Feature request: Add --asset flag to compile denoland/deno#17994

Closed

lucacasonato mentioned this issue Sep 2, 2024

Support import foo from "./bar.txt" with { as: "bytes" } denoland/deno#25354

Open

petamoriken mentioned this issue Nov 26, 2024

Caching and loading non-importable resources denoland/deno#5987

Open

petamoriken mentioned this issue Jan 4, 2025

Add text and binary module types denoland/deno_core#1025

Open
