Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added debug id proposal #20

Merged
merged 4 commits into from
Apr 12, 2023
Merged

Conversation

mitsuhiko
Copy link
Contributor

This is a proposal for #19. The idea is to embed globally unique debug or build IDs into source maps and transpiled JavaScript files. This enables a work flow similar to native object files and their debug companions or WASM files (WebAssembly/tool-conventions#133).

Rendered text

debug ID is not defined.

Additionally, this specification applies only to non-indexed source maps and
currently specifies references only for JavaScript.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its unclear to me what this means? Does it mean that only JS files have the //# debugId comment in them? Because you can have any arbitrary file as one of the "sources".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original spec also leaves open CSS and other formats.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhh absolutely, I always forget that this is not exclusive to JS. 🤔

Comment on lines +85 to +86
are placed on it. It is however recommended to generate deterministic debug IDs
(UUID v3 or v5) so that rebuilding the same artifacts yields stable IDs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today I learned: https://en.wikipedia.org/wiki/Universally_unique_identifier#Versions_3_and_5_(namespace_name-based)

UUIDs have special version and variant tags, so they don’t use the full 128-bits.

But as you said, I would leave that up to the specific toolchain, as long as the identifier is formatted like a UUID and sufficiently unique, it will be fine :-)

Comment on lines +89 to +91
bidirectional mapping between them. The linking of source maps and transpiled
files via HTTP headers is explicitly not desired. A file identified by a Debug
ID must have that Debug ID embedded to ensure the file is self-identifying.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this needs to be mentioned here or anywhere. Tools like webpack have an option to have "hidden" (See https://webpack.js.org/configuration/devtool/#devtool) sourcemaps, which are created but not referenced, to create some sense of "obfuscation". As the Debug ID would just be some random bytes, they do not pose any "obfuscation" risk so to say, and tools should always put them in.


## JavaScript API for Debug ID Resolution

Today `error.stack` in most runtimes only returns the URLs of the files referenced
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error.stack returns an opaque string. With some luck you are able to parse a URL out of it though.

proposals/debug-id.md Outdated Show resolved Hide resolved
Comment on lines +136 to +137
* `System.getDebugIdForUrl(url)` looks up the debug ID for a given script file by
URL that has already been loaded by the browser in the current context.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this might be the most controversial point of this proposal.

While a lot of tools use the System global, I believe no "official" spec is using it?

Also, "URL that has already been loaded by the browser" might need a bit more clarification, for example:

  • It should return the debug-id of scripts "that are current loaded" into the engine.
  • The URL should match the one that was used to load the script, and should also match the ones pretty-printed in Error.stack output, import.meta.url, etc.
  • Specifically, it should have the same query-string and fragment.
  • It should not do any network IO
  • On mismatch, should it return undefined or rather throw a ReferenceError or any other kind of error?
  • On mismatch, should it give an explanation of the reason? Is the file not loaded at all? Do the querystring parameters not match? etc…

Comment on lines +153 to +158
* a JSON file containing a toplevel object with the keys `mapping`, `version`,
`debugId` and `sourcesContent` should be considered to be a self-identifying
source map.
* a UTF-8 encoded text file matching the regular expression
`(?m)^//# debugId=([a-fA-F0-9-]{12,})$` should be considered a transpiled
JavaScript file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 I like this, especially also implicitly forcing JS files to be UTF-8 ;-)

Should we also propose a JSON Schema along with a "$schema" field for source maps, or would that be a bit too much?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly want to see initial feedback but a JSON schema for source maps would be very valuable by itself.

Comment on lines +163 to +165
files from the server. That way a tool such as a browser or a crash reporter could
be pointed to a S3, GCS bucket or an HTTP server that can serve up source maps and
build artifacts keyed by debug id.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should also plan for browsers and local development servers.
Using something like Import Maps could solve this, and point the browser to the local development server to resolve the corresponding source map.

Related to my note above, this could potentially replace the various different ways that bundlers / development servers can offer source maps today. I believe offering a local symbol server would also benefit performance a bit, as the tools wouldn’t have to embed base-64 encoded sourcemaps into the development assets, but sourcemaps would only be needed to be serialized on demand.

proposals/debug-id.md Outdated Show resolved Hide resolved
mitsuhiko and others added 3 commits April 3, 2023 15:37
Co-authored-by: Arpad Borsos <swatinem@swatinem.de>
Co-authored-by: Arpad Borsos <swatinem@swatinem.de>
@robpaveza
Copy link

robpaveza commented Apr 5, 2023

At Microsoft we've been using the SHA-256 hash of the script contents, plus the name of the file of course.

A possible drawback here is if only the original source changes - e.g., with newlines or comments only - then there is a possibility of source maps not matching up. But I can say that in about 18 months of the Edge DevTools team using this mechanism for our own internal debugging, we haven't run into this issue at all. (Bottom line is, if the old and the new code both mapped to the same output contents, and the error existed in the old contents, then you should be able to reason about your code -- because the executing code itself didn't change. This approach is also hardened if you're including the original source inside your source maps).

That having been said, the advantage we liked about this approach was that you don't have to index the same source map if particular files don't change. For example, we're not investing much in the "Welcome" panel in DevTools, so even though we build it every day multiple times as part of the CI, it's not changing, so we don't need to index every CI run's output of this file. It also allows us to cache the source maps reasonably reliably (Edge has been caching source maps uniquely against the filename+hash tuple since Edge 99), again having showing really good results.

From a technical perspective, V8 provides an API as part of the CallSite API which is provided in the Error.prepareStackTrace callback to obtain the script hash -- getScriptHash() which will return this hash if the script isn't being treated as opaque.

@mitsuhiko
Copy link
Contributor Author

@robpaveza we are definitely quite interested in supporting that at the very least in addition, as this is already something that can be supported on browsers today. For what it's worth that requires using Error.prepareStackTrace which is quite tricky to use for a tool like us (as we have no guarantee that nobody else is patching that) and if the browser does not support the script hash, it's hard to fall back to.

In theory the fallback for no browser support on debug IDs is just making an XHR request to the file and parsing the debugId comment with a regex.

@littledan
Copy link
Member

This is a very good starting point for continued discussion, so I'm landing this, and will land other analogous proposal documents without expecting them to reflect the consensus of everyone involved.

@littledan littledan merged commit 31f6372 into tc39:master Apr 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants