This document presents a proposal to add globally unique build or debug IDs to source maps and generated code, making build artifacts self-identifying and facilitating bidirectional references between Source Maps and generated code.
Source maps proposal at stage 2 of the process, see Our process document
Luca Forstner
Source maps play a crucial role in debugging by providing a mapping between generated code and the original source code. However, the current source map specification lacks important properties: neither the generated code nor the source map is self-describing or self-identifying. This results in a subpar user experience and numerous practical problems, most prominently the difficulty of associating Source Maps with the corresponding generated code. To address these issues, we propose an extension to the source map format: the addition of globally unique Debug IDs.
The primary objective of this proposal is to enhance the source map format by introducing globally unique Debug IDs, enabling better identification and organization of generated code and their corresponding source maps. This improvement will streamline the debugging process and reduce the likelihood of errors arising from misidentification or misassociation of files.
Debug IDs (also sometimes called Build IDs) are already used in the native language ecosystem and supported by native container formats such as PE, ELF, MachO or WASM.
The proposed solution offers the following benefits:
- Improved File Identification: The introduction of globally unique Debug IDs will make it easier to identify and associate generated code with its corresponding source map.
- Self-Identifying Files: This specification changes both source maps and generated code so that they become self-identifying, eliminating the need for external information to reference them.
- Streamlined Debugging Process: The implementation of Debug IDs will simplify and streamline the debugging process by reducing the likelihood of errors resulting from misidentification or misassociation of files.
- Standardization: The adoption of this proposal as a web standard will encourage a consistent and unified approach to handling source maps and generated code across the industry.
- Guaranteed bidirectionality: Today, source maps provide no reliable way to resolve back to the generated file they belong to. In practice, tools often need this because they parse the generated artifact to resolve scope information.
- Symbol server support: With Debug IDs and source maps with embedded sources, it becomes possible to look up artifacts from symbol servers.
This proposal sets some specific limitations on source maps to simplify processing in the wider ecosystem. Debug IDs are at present only specified for source maps with embedded sources or where sources are categorically not available. The lookup of original sources from a source map identified by a Debug ID is not defined.
Additionally, this specification applies only to non-indexed source maps and currently specifies references only for JavaScript.
In the context of this document:
- Source Map: Refers to a non-indexed, standard source map.
- Generated Code: Refers to code generated by a compiler, for example a JavaScript minifier.
- Debug ID: Refers to a UUID as described in this document.
Debug IDs are globally unique identifiers for build artifacts.
They are specified to be UUIDs in the format of 85314830-023f-4cf1-a267-535f4e37bb17.
The format is intentionally chosen to be strict to ensure consistency and simplicity in generating and consuming tooling.
Debug IDs are embedded in both source maps and transformed files, allowing a bidirectional mapping between them. The linking of source maps and transformed files via HTTP headers is explicitly not desired. A file identified by a Debug ID must have that Debug ID embedded to ensure the file is self-identifying.
The way a Debug ID is generated is specific to the toolchain and the only proposed requirement is that Debug IDs are 128-bit values. We propose this requirement to ensure consistency and promote simplicity across the ecosystem.
Since Debug IDs are embedded in build artifacts, it is recommended that tools generate deterministic Debug IDs (e.g. UUIDv3, UUIDv5) whenever possible, so that the produced artifacts are stable across builds. Specification-wise, Debug IDs do not need to be deterministic. Determinism is not enforced so that tools can employ non-deterministic fallback mechanisms in case of colliding Debug IDs between two different generated artifacts.
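One way a tool might derive a deterministic Debug ID is to hash the generated code into a name-based (version 5 style) UUID. The following is a minimal sketch under stated assumptions, not a required algorithm: the function name deterministicDebugId and the choice of namespace are hypothetical, and it relies on the Node.js crypto module.

```javascript
const crypto = require("node:crypto");

// Assumption: any fixed namespace UUID works for this sketch. This one is the
// RFC 4122 DNS namespace; a real tool would pick and document its own.
const HYPOTHETICAL_NAMESPACE = "6ba7b810-9dad-11d1-80b4-00c04fd430c8";

// Derives a UUIDv5-style Debug ID from the generated code, so the same input
// always produces the same ID.
function deterministicDebugId(generatedCode, namespace = HYPOTHETICAL_NAMESPACE) {
  const namespaceBytes = Buffer.from(namespace.replace(/-/g, ""), "hex");
  const digest = crypto
    .createHash("sha1")
    .update(namespaceBytes)
    .update(generatedCode, "utf8")
    .digest();

  // Keep the first 128 bits, then stamp the version and variant fields.
  const bytes = Buffer.from(digest.subarray(0, 16));
  bytes[6] = (bytes[6] & 0x0f) | 0x50; // version 5
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // RFC 4122 variant

  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}
```

A tool that detects a collision between two different artifacts could fall back to a random UUID instead, which is the kind of non-deterministic fallback the text leaves room for.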
We propose adding a debugId property to the source map at the top level of the source map object.
This property must be a string value representing the Debug ID in hexadecimal characters, using the canonical UUID format:
{
"version": 3,
"file": "app.min.js",
"debugId": "85314830-023f-4cf1-a267-535f4e37bb17",
"sources": [...],
"sourcesContent": [...],
"mappings": "..."
}
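A consumer could read the property along the following lines. This is a sketch, not part of the proposal: the function name readSourceMapDebugId is hypothetical, and normalizing the result to lowercase is an assumption made here for easier comparison.

```javascript
// Canonical UUID shape: 8-4-4-4-12 hexadecimal characters.
const CANONICAL_UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns the Debug ID of a source map given as JSON text, or null if the
// top-level debugId property is missing or not a canonical UUID string.
// Throws a SyntaxError if the input is not valid JSON.
function readSourceMapDebugId(sourceMapJson) {
  const map = JSON.parse(sourceMapJson);
  if (map === null || typeof map !== "object" || Array.isArray(map)) {
    return null;
  }
  const debugId = map.debugId;
  if (typeof debugId !== "string" || !CANONICAL_UUID.test(debugId)) {
    return null;
  }
  // Assumption: lowercase normalization, so IDs compare with ===.
  return debugId.toLowerCase();
}
```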
Generated JavaScript files containing a Debug ID must embed the ID near the end of the source, ideally on the last line, in the format //# debugId=<DEBUG_ID> using the canonical UUID format:
//# debugId=85314830-023f-4cf1-a267-535f4e37bb17
If the special //# sourceMappingURL= comment already exists in the file, it is recommended to place the debugId comment in the line above to maintain compatibility with existing tools.
Because the last line already has meaning in the existing specification for the sourceMappingURL comment, tools are required to examine the last 5 lines to discover the Debug ID.
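The lookup over the last 5 lines can be sketched as follows. The function name findDebugIdComment is hypothetical, and two details are assumptions of this sketch rather than rules from the proposal: a trailing newline is not counted as a line of its own, and the bottom-most matching comment wins.

```javascript
// Matches a complete debugId comment line and captures the UUID.
const DEBUG_ID_COMMENT_LINE =
  /^\/\/# debugId=([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$/;

// Returns the Debug ID found in the last `maxLines` lines of the generated
// code, or null if none of those lines is a debugId comment.
function findDebugIdComment(source, maxLines = 5) {
  const lines = source.split(/\r\n|\r|\n/);
  // Assumption: the empty string after a final newline is not a line.
  if (lines.length > 0 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  const tail = lines.slice(-maxLines);
  // Scan from the bottom so the comment closest to the end is returned.
  for (let i = tail.length - 1; i >= 0; i--) {
    const match = DEBUG_ID_COMMENT_LINE.exec(tail[i]);
    if (match) {
      return match[1];
    }
  }
  return null;
}
```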
Note on the end of file: for all intents and purposes having the Debug ID at the top of the file would be preferable. However this has the disadvantage that a tool could not add a Debug ID to a file without having to adjust all the tokens in the source map by the offset that this line adds. Having it at the end of the file means it's after all tokens which would allow a separate tool to add Debug IDs to generated files and source maps.
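To illustrate why the end of the file is convenient, here is a sketch of a separate post-processing step that adds a Debug ID to both artifacts without touching the mappings. The function name injectDebugId is hypothetical, and the sketch assumes "\n" line endings and an already parsed source map object.

```javascript
// Adds a debugId comment to the generated code and a debugId property to the
// source map. The comment goes directly above a trailing sourceMappingURL
// comment if one exists in the last 5 lines, otherwise at the end of the file.
// No existing line moves up or down relative to the tokens, so the mappings
// stay valid as they are.
function injectDebugId(generatedCode, sourceMap, debugId) {
  const comment = `//# debugId=${debugId}`;
  const lines = generatedCode.split("\n");

  let sourceMappingUrlIndex = -1;
  const firstCandidate = Math.max(0, lines.length - 5);
  for (let i = lines.length - 1; i >= firstCandidate; i--) {
    if (lines[i].startsWith("//# sourceMappingURL=")) {
      sourceMappingUrlIndex = i;
      break;
    }
  }

  if (sourceMappingUrlIndex >= 0) {
    lines.splice(sourceMappingUrlIndex, 0, comment);
  } else if (lines[lines.length - 1] === "") {
    // Keep the file's trailing newline after the new comment.
    lines.splice(lines.length - 1, 0, comment);
  } else {
    lines.push(comment);
  }

  return { code: lines.join("\n"), map: { ...sourceMap, debugId } };
}
```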
Today error.stack in most engines only returns the URLs of the files referenced by the stack trace.
For Debug IDs to be useful, a solution would need to be added to enable mapping of JavaScript file URLs to Debug IDs.
The strawman proposal is to add the Debug ID in two locations:
- import.meta.debugId: a new property that should return the debug ID as a string of the current module if it has one, in the canonical UUID format.
- A Function property in the global scope getDebugIdForUrl(url) that looks up the debug ID for a given script file by URL that has already been loaded by the browser in the current context.
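Until engines offer such an API, the mapping from stack frame URLs to Debug IDs can only be approximated in userland. The sketch below is purely illustrative: the registry, registerDebugId, lookupDebugIdForUrl (a stand-in for the proposed getDebugIdForUrl) and debugIdsForStack are all hypothetical names, and the stack parsing is a rough heuristic because stack formats differ between engines.

```javascript
// Hypothetical registry filled in by each script as it loads.
const debugIdRegistry = new Map();

function registerDebugId(url, debugId) {
  debugIdRegistry.set(url, debugId);
}

// Userland stand-in for the proposed getDebugIdForUrl(url).
function lookupDebugIdForUrl(url) {
  return debugIdRegistry.has(url) ? debugIdRegistry.get(url) : null;
}

// Rough heuristic: a URL at the end of a frame line, optionally followed by
// ":line" or ":line:column" and a closing parenthesis.
const FRAME_URL = /((?:https?|file):\/\/[^\s()]+?)(?::\d+){0,2}\)?\s*$/;

// Maps every script URL found in a stack trace string to its Debug ID,
// or to null when the URL is not in the registry.
function debugIdsForStack(stack) {
  const result = new Map();
  for (const line of stack.split("\n")) {
    const match = FRAME_URL.exec(line);
    if (match) {
      result.set(match[1], lookupDebugIdForUrl(match[1]));
    }
  }
  return result;
}
```

A crash reporter could call debugIdsForStack(error.stack) and send the resulting pairs along with the report.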
Unfortunately, neither generated JavaScript files nor source maps can be easily identified without employing heuristics. Unlike formats like ELF binaries, they lack a distinctive header for identification purposes. When batch processing files, the ability to differentiate between various files is invaluable, but this capability is not fully realized in the context of source maps or generated JavaScript files. Although solving this issue is beyond the scope of this document, addressing it would significantly aid in distinguishing different files without relying on intricate heuristics.
Nevertheless, we recommend that tools utilize the following heuristics to determine self-identifying JavaScript files and source maps:
- A JSON file containing a top-level object with the keys mappings, version, debugId and sourcesContent should be considered to be a self-identifying source map.
- A UTF-8 encoded text file matching the regular expression (?m)^//# debugId=([a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12})$ should be considered a generated JavaScript file.
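The two heuristics above can be sketched as a small classifier for batch processing. The function name classifyArtifact and the shape of its return value are hypothetical; the key check uses mappings, as in the example source map earlier in this document, and the inline (?m) flag is expressed with JavaScript's m flag.

```javascript
// JavaScript form of the regular expression above (multiline mode).
const GENERATED_CODE_DEBUG_ID =
  /^\/\/# debugId=([a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12})$/m;

const SOURCE_MAP_KEYS = ["mappings", "version", "debugId", "sourcesContent"];

// Classifies the text content of a file as a self-identifying source map,
// a generated JavaScript file, or unknown.
function classifyArtifact(text) {
  try {
    const parsed = JSON.parse(text);
    if (
      parsed !== null &&
      typeof parsed === "object" &&
      !Array.isArray(parsed) &&
      SOURCE_MAP_KEYS.every((key) => key in parsed)
    ) {
      return { kind: "source-map", debugId: parsed.debugId };
    }
  } catch {
    // Not JSON; fall through to the generated code heuristic.
  }
  const match = GENERATED_CODE_DEBUG_ID.exec(text);
  if (match) {
    return { kind: "generated-code", debugId: match[1] };
  }
  return { kind: "unknown", debugId: null };
}
```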
With Debug IDs it becomes possible to resolve source maps and generated code from a server. That way a tool such as a browser or a crash reporter could be pointed to an S3 bucket, a GCS bucket or an HTTP server that serves source maps and build artifacts keyed by Debug ID.
The structure itself is inspired by debuginfod:
- generated code artifact:
<DebugIdFirstTwo>/<DebugIdRest>/source.js
- source map:
<DebugIdFirstTwo>/<DebugIdRest>/sourcemap.json
with the following variables:
- DebugIdFirstTwo: the first two characters in lowercase of the hexadecimal Debug ID
- DebugIdRest: the remaining characters in lowercase of the hexadecimal Debug ID without dashes
Note that debuginfod usually does not use extensions on the path lookup syntax so the more natural filenames would just be source and sourcemap.
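As a sketch of the lookup structure described above, the following helper derives both paths from a Debug ID. The function name symbolServerPaths is hypothetical; the file names source.js and sourcemap.json are the ones listed in the structure above.

```javascript
// Builds the lookup paths for a Debug ID given in canonical UUID format.
// Throws a TypeError if the input is not a canonical UUID.
function symbolServerPaths(debugId) {
  if (!/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i.test(debugId)) {
    throw new TypeError(`Not a canonical UUID: ${debugId}`);
  }
  // Lowercase hexadecimal without dashes: 32 characters.
  const hex = debugId.toLowerCase().replace(/-/g, "");
  const directory = `${hex.slice(0, 2)}/${hex.slice(2)}`;
  return {
    generatedCode: `${directory}/source.js`,
    sourceMap: `${directory}/sourcemap.json`,
  };
}
```

For the Debug ID 85314830-023f-4cf1-a267-535f4e37bb17, the source map path is 85/314830023f4cf1a267535f4e37bb17/sourcemap.json. A consumer would append that path to the base URL of whichever bucket or server it was pointed to.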
For this proposal, we include a repository for "polyfilling" Debug IDs. It includes an implementation of plugins for various popular build-tooling as well as an implementation for a runtime API to access Debug IDs.
Note: While polyfilling is possible and is in wide production use already [1], we have found a plethora of issues:
- Complexity in setup and compatibility
- Polyfills usually require nasty workarounds for build-tool quirks
- Build-tools often don't allow for modifying source-maps
- Injecting Debug IDs into transitive dependencies is error prone and in some cases ruins the entire polyfilling process
- The polyfills inflate bundle-size more than necessary
- Chicken-and-egg situations with Subresource Integrity
The following Source Map Generators have implemented Debug IDs as proposed:
- Rollup (output.sourcemapDebugIds option)
- Oxc (debug_id API)
- Expo (Injected by default)
- Rolldown (output.sourcemapDebugIds option)
The following Source Map Consumers/Debuggers have implemented Debug IDs:
- Sentry.io (Docs)
- How should the //# debugId=... comment be parsed by consuming tools and JavaScript engines?
- How does the //# debugId=... comment interact with the //# sourceMappingURL=... comment?
Footnotes
1. Sentry.io is using the polyfills to enable its users to inject Debug IDs into generated code and Source Maps and is processing multiple hundreds of millions of artifacts with Debug IDs a month. Debug IDs in this very limited form have anecdotally worked out really well.