Added debug id proposal #20
@@ -0,0 +1,237 @@
# Source Map Debug ID Proposal

This document presents a proposal to add globally unique build or debug IDs to
source maps and transpiled JavaScript files, making build artifacts
self-identifying.

## Background

Source maps play a crucial role in debugging minified JavaScript files by
providing a mapping between the minified code and the original source code.
However, the current source map specification lacks important properties such as
self-describing and self-identifying capabilities for both the JavaScript
artifact (the transpiled JavaScript file) as well as the source map. This
results in a subpar user experience and numerous practical problems. To address
these issues, we propose an extension to the source map format: the addition of
globally unique Debug IDs.

## Objective and Benefits

The primary objective of this proposal is to enhance the source map format by
introducing globally unique Debug IDs, enabling better identification and
organization of minified JavaScript files and their corresponding source maps.
This improvement will streamline the debugging process and reduce the likelihood
of errors arising from misidentification or misassociation of files.

Debug IDs (also sometimes called Build IDs) are already used in the native language
ecosystem and supported by native container formats such as PE, ELF, Mach-O or
WASM.

The proposed solution offers the following benefits:

1. Improved File Identification: The introduction of globally unique Debug IDs
   will make it easier to identify and associate minified JavaScript files with
   their corresponding source maps.

2. Self-Identifying Files: This specification changes source maps and minified
   JavaScript files so that they become self-identifying, eliminating the need
   for external information to work with the files.

3. Streamlined Debugging Process: The implementation of Debug IDs will simplify
   and streamline the debugging process by reducing the likelihood of errors
   resulting from misidentification or misassociation of files.

4. Standardization: The adoption of this proposal as a web standard will
   encourage a consistent and unified approach to handling source maps and
   minified JavaScript files across the industry.

5. Guaranteed Bidirectionality: Today, source maps provide no reliable way to
   resolve back to the transpiled file they were generated from. In practice,
   however, tools often need this because they parse the transpiled artifact to
   resolve scope information.

6. Symbol Server Support: With Debug IDs and source maps with embedded sources,
   it becomes possible to look up source maps and transpiled files on symbol
   servers.

## Scope

This proposal sets some specific limitations on source maps to simplify the
processing in the wider ecosystem. Debug IDs are at present only specified for
source maps with embedded sources, or for cases where sources are categorically
not available. The lookup for original sources from a source map identified by a
Debug ID is not defined.

Additionally, this specification applies only to non-indexed source maps and
currently specifies references only for JavaScript.

## Terms

In the context of this document:

- **Source Map:** Refers to a non-indexed, standard source map.
- **Transpiled File:** Refers to a transpiled (potentially minified) JavaScript file.
- **Debug ID:** Refers to a UUID as described in this document.

## Debug IDs

Debug IDs are globally unique identifiers for build artifacts. They are
specified to be UUIDs. In the context of this proposal, they are represented in
hexadecimal characters. When comparing Debug IDs, they must be normalized. This
means that `85314830-023f-4cf1-a267-535f4e37bb17` and
`85314830023F4CF1A267535F4E37BB17` are equivalent, but the former representation
is the canonical format.

The way a Debug ID is generated is specific to the toolchain and no requirements
are placed on it. It is, however, recommended to generate deterministic Debug IDs
(UUID v3 or v5) so that rebuilding the same artifacts yields stable IDs.

Comment on lines +85 to +86:
Today I learned: https://en.wikipedia.org/wiki/Universally_unique_identifier#Versions_3_and_5_(namespace_name-based) UUIDs have special version and variant tags, so they don’t use the full 128 bits. But as you said, I would leave that up to the specific toolchain, as long as the identifier is formatted like a UUID and sufficiently unique, it will be fine :-)

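The normalization rule and the recommendation for deterministic IDs can be sketched in Python as follows. This is only an illustration: the namespace URL, the choice of SHA-256 over the artifact contents, and all function names are assumptions of this sketch, not part of the proposal.

```python
import hashlib
import uuid

# Hypothetical namespace for this sketch; a real toolchain would pick its own.
DEBUG_ID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/debug-ids")


def normalize_debug_id(value: str) -> str:
    """Return the canonical (lowercase, dashed) form of a Debug ID."""
    return str(uuid.UUID(value))


def debug_ids_equal(a: str, b: str) -> bool:
    """Compare two Debug IDs after normalization."""
    return uuid.UUID(a) == uuid.UUID(b)


def deterministic_debug_id(artifact_bytes: bytes) -> str:
    """Derive a stable UUID v5 from the artifact contents (assumed input)."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    return str(uuid.uuid5(DEBUG_ID_NAMESPACE, digest))
```

Rebuilding the same bytes yields the same ID under this scheme, while any change to the artifact produces a different one.
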
Debug IDs are embedded in both source maps and transpiled files, allowing a
bidirectional mapping between them. The linking of source maps and transpiled
files via HTTP headers is explicitly not desired. A file identified by a Debug
ID must have that Debug ID embedded to ensure the file is self-identifying.

Comment on lines +89 to +91:
Not sure if this needs to be mentioned here or anywhere. Tools like webpack have an option to have

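Because both files carry the same ID, a tool can check whether a transpiled file and a source map belong together without any external information. A minimal sketch, assuming the `//# debugId=` comment and the `debugId` property described in the following sections; the function names are hypothetical, and malformed IDs simply raise `ValueError` here:

```python
import json
import re
import uuid

_DEBUG_ID_COMMENT = re.compile(r"^//# debugId=(\S+)[ \t]*$", re.MULTILINE)


def debug_id_of_js(source: str):
    """Return the Debug ID from the last debugId comment, or None."""
    matches = _DEBUG_ID_COMMENT.findall(source)
    return uuid.UUID(matches[-1]) if matches else None


def debug_id_of_source_map(text: str):
    """Return the Debug ID from the top-level debugId property, or None."""
    value = json.loads(text).get("debugId")
    return uuid.UUID(value) if isinstance(value, str) else None


def belongs_together(js_source: str, source_map_text: str) -> bool:
    """True if both files embed the same (normalized) Debug ID."""
    js_id = debug_id_of_js(js_source)
    return js_id is not None and js_id == debug_id_of_source_map(source_map_text)
```
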
### Debug IDs in Source Maps

We propose adding a `debugId` property to the source map at the top level of
the source map object. This property should be a string value representing
the Debug ID in hexadecimal characters, preferably in the canonical UUID
format:

```json
{
  "version": 3,
  "file": "app.min.js",
  "debugId": "85314830-023f-4cf1-a267-535f4e37bb17",
  "sources": [...],
  "sourcesContent": [...],
  "mappings": "..."
}
```

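A build tool could add the property roughly as sketched below. The function name is hypothetical, and rejecting maps that contain a `sections` key (the marker of an indexed source map) is an assumption of this sketch based on the scope stated earlier.

```python
import json
import uuid


def add_debug_id_to_source_map(source_map_text: str, debug_id: str) -> str:
    """Return the source map JSON with a canonical top-level debugId."""
    source_map = json.loads(source_map_text)
    # The proposal only covers non-indexed source maps.
    if not isinstance(source_map, dict) or "sections" in source_map:
        raise ValueError("expected a non-indexed source map object")
    # Store the ID in the canonical dashed, lowercase UUID format.
    source_map["debugId"] = str(uuid.UUID(debug_id))
    return json.dumps(source_map)
```
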
### Debug IDs in JavaScript Artifacts

Transpiled JavaScript files containing a Debug ID must embed the ID near the end
of the source, ideally on the last line, in the format `//# debugId=<DEBUG_ID>`:

```javascript
//# debugId=85314830-023f-4cf1-a267-535f4e37bb17
```

If the special `//# sourceMappingURL=` comment already exists in the file, it is
recommended to place the `debugId` comment in the line above to maintain
compatibility with existing tools. Because the last line already has meaning in
the existing specification for the `sourceMappingURL` comment, tools are
required to examine the last 5 lines to discover the Debug ID.

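The placement rule can be sketched as a small post-processing step. This is an illustration under assumptions: the function name is hypothetical, the `sourceMappingURL` comment is assumed to sit on the last line when present, and a file that already has a `debugId` comment is not handled.

```python
import uuid

_SOURCE_MAPPING_PREFIX = "//# sourceMappingURL="


def embed_debug_id(js_source: str, debug_id: str) -> str:
    """Append a debugId comment, keeping sourceMappingURL on the last line."""
    comment = "//# debugId=" + str(uuid.UUID(debug_id))
    lines = js_source.rstrip("\n").split("\n")
    if lines[-1].startswith(_SOURCE_MAPPING_PREFIX):
        # Insert the debugId comment on the line above sourceMappingURL.
        lines.insert(len(lines) - 1, comment)
    else:
        lines.append(comment)
    return "\n".join(lines) + "\n"
```
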
## JavaScript API for Debug ID Resolution

Today, `error.stack` in most runtimes only returns the URLs of the files referenced
by the stack trace. For Debug IDs to be useful, a solution would need to be added
to enable mapping of JavaScript file URLs to Debug IDs.

The strawman proposal is to add the Debug ID in two locations:

* `import.meta.debugId`: a new property that should return the Debug ID as a UUID
  of the current module if it has one

* `System.getDebugIdForUrl(url)`: looks up the Debug ID for a given script file by
  URL that has already been loaded by the browser in the current context.

Comment on lines +143 to +144:
I believe this might be the most controversial point of this proposal. While a lot of tools use the Also, "URL that has already been loaded by the browser" might need a bit more clarification, for example:

## Appendix A: Self-Description of Files

Unfortunately, neither transpiled JavaScript files nor source maps can be easily
identified without employing heuristics. Unlike formats like ELF binaries, they
lack a distinctive header for identification purposes. When batch processing
files, the ability to differentiate between various files is invaluable, but
this capability is not fully realized in the context of source maps or
transpiled JavaScript files. Although solving this issue is beyond the scope of
this document, addressing it would significantly aid in distinguishing different
files without relying on intricate heuristics.

Nevertheless, we recommend that tools utilize the following heuristics to
determine self-identifying JavaScript files and source maps:

* a JSON file containing a top-level object with the keys `mappings`, `version`,
  `debugId` and `sourcesContent` should be considered to be a self-identifying
  source map.
* a UTF-8 encoded text file matching the regular expression
  `(?m)^//# debugId=([a-fA-F0-9-]{12,})$` should be considered a transpiled
  JavaScript file.

Comment on lines +160 to +165:
👍🏻 I like this, especially also implicitly forcing JS files to be UTF-8 ;-) Should we also propose a JSON Schema along with a

Comment:
Mostly want to see initial feedback but a JSON schema for source maps would be very valuable by itself.

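The two heuristics above could be applied to raw file contents roughly as follows. This is a sketch only: the function name and return labels are hypothetical, and it assumes the source map key is `mappings`, as in the JSON example earlier in this document.

```python
import json
import re

_DEBUG_ID_LINE = re.compile(r"(?m)^//# debugId=([a-fA-F0-9-]{12,})$")
_SOURCE_MAP_KEYS = {"mappings", "version", "debugId", "sourcesContent"}


def classify(data: bytes) -> str:
    """Classify file contents as 'source_map', 'transpiled_javascript' or 'unknown'."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    try:
        parsed = json.loads(text)
    except ValueError:
        parsed = None
    # Heuristic 1: a JSON object carrying all four keys is a source map.
    if isinstance(parsed, dict) and _SOURCE_MAP_KEYS.issubset(parsed):
        return "source_map"
    # Heuristic 2: a UTF-8 text file with a debugId comment line is transpiled JS.
    if _DEBUG_ID_LINE.search(text):
        return "transpiled_javascript"
    return "unknown"
```
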
## Appendix B: Symbol Server Support

With Debug IDs it becomes possible to resolve source maps and minified JavaScript
files from a server. That way a tool such as a browser or a crash reporter could
be pointed to an S3 or GCS bucket, or an HTTP server, that serves source maps and
build artifacts keyed by Debug ID.

Comment on lines +170 to +172:
I believe we should also plan for browsers and local development servers. Related to my note above, this could potentially replace the various different ways that bundlers / development servers can offer source maps today. I believe offering a local symbol server would also benefit performance a bit, as the tools wouldn’t have to embed base-64 encoded sourcemaps into the development assets, but sourcemaps would only need to be serialized on demand.

The structure itself is inspired by [debuginfod](https://sourceware.org/elfutils/Debuginfod.html):

* transpiled JavaScript artifact: `<DebugIdFirstTwo>/<DebugIdRest>/js`
* source map: `<DebugIdFirstTwo>/<DebugIdRest>/sourcemap`

with the following variables:

* `DebugIdFirstTwo`: the first two characters in lowercase of the hexadecimal Debug ID
* `DebugIdRest`: the remaining characters in lowercase of the hexadecimal Debug ID without dashes

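As an illustration of the layout, the lookup keys for a given Debug ID might be computed like this. The function name and the dictionary shape are assumptions of this sketch; how the keys are joined to a bucket or base URL is left to the tool.

```python
import uuid


def symbol_server_paths(debug_id: str) -> dict:
    """Return the relative lookup paths for the artifact and its source map."""
    hex_id = uuid.UUID(debug_id).hex  # 32 lowercase hex characters, no dashes
    prefix = f"{hex_id[:2]}/{hex_id[2:]}"
    return {"js": f"{prefix}/js", "sourcemap": f"{prefix}/sourcemap"}
```

For the Debug ID `85314830-023f-4cf1-a267-535f4e37bb17`, this yields `85/314830023f4cf1a267535f4e37bb17/js` and `85/314830023f4cf1a267535f4e37bb17/sourcemap`.
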
## Appendix C: Emulating Debug IDs

In the absence of browser support for loading Debug IDs, a transpiler can inject
some code to maintain a global dictionary of loaded JavaScript files, which allows
experimentation with this concept:

```javascript
(function() {
  try {
    throw new Error();
  } catch (err) {
    // Record the URL from the first stack frame that ends in `:line:column`.
    let match;
    if ((match = err.stack.match(/(?:\bat |@)(.*?):\d+:\d+$/m)) !== null) {
      let ids = (globalThis.__DEBUG_IDS__ = globalThis.__DEBUG_IDS__ || {});
      ids[match[1]] = "<DEBUG_ID>";
    }
  }
})();
```

```javascript
function getDebugIdForUrl(url) {
  // Read via globalThis so a missing dictionary does not throw a ReferenceError.
  const ids = globalThis.__DEBUG_IDS__;
  return (ids && ids[url]) || undefined;
}
```

## Appendix D: Parsing Debug IDs

The following Python code shows how Debug IDs are to be extracted from
transpiled JavaScript and source map files:

```python
import json
import re
import uuid


_debug_id_re = re.compile(r'^//# debugId=(.*)')


def normalize_debug_id(value):
    # Accepts dashed or undashed hexadecimal input, in either case.
    try:
        return uuid.UUID(value)
    except (ValueError, TypeError, AttributeError):
        return None


def debug_id_from_transpiled_javascript(source):
    # Only the last 5 lines are examined, starting from the end.
    for line in reversed(source.splitlines()[-5:]):
        match = _debug_id_re.match(line)
        if match is not None:
            debug_id = normalize_debug_id(match.group(1).strip())
            if debug_id is not None:
                return debug_id
    return None


def debug_id_from_source_map(source):
    source_map = json.loads(source)
    if "debugId" in source_map:
        return normalize_debug_id(source_map["debugId"])
    return None
```

Comment:
It's unclear to me what this means? Does it mean that only JS files have the `//# debugId` comment in them? Because you can have any arbitrary file as one of the `"sources"`.

Comment:
The original spec also leaves open CSS and other formats.

Comment:
Ohhh absolutely, I always forget that this is not exclusive to JS. 🤔