-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Source Maps v4 Proposal #538
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,245 @@ | ||||||||||||||
# Source Maps v3.1 Revision | ||||||||||||||
|
||||||||||||||
Authors: [Rob Paveza](https://github.com/robpaveza) | ||||||||||||||
|
||||||||||||||
## Status of this Document | ||||||||||||||
This document is a starting point for engaging the community in developing collaborative solutions. Source maps are not generally managed or standardized within the W3C or a similar governing body beyond what is supported within the browser developer tools. This document is not intended to block something in that direction, but rather to engage the community with options and discussion about the nature of developer tools. | ||||||||||||||
* This document status: **Active** | ||||||||||||||
* Expected venue: Ad-hoc discussions, to be expanded if there is interest in the community | ||||||||||||||
* **Current version: this document** | ||||||||||||||
|
||||||||||||||
### Revising the standard | ||||||||||||||
|
||||||||||||||
We want to recognize the previous standard and the contributions the former authors have made as part of this implementation. We are taking this avenue of publishing in the Microsoft Edge Explainers repository because the former standard has not been updated in 8 years, and the Mozilla source maps discussion list is not even available in archives. | ||||||||||||||
|
||||||||||||||
## Introduction | ||||||||||||||
|
||||||||||||||
The [previous version of the Source Maps specification](https://sourcemaps.info/spec.html) is reasonably thorough in supporting live debugging. However, one shortcoming that the specification lacks is that it isn't obvious how to decode the names of functions on the call stack. Several implementations (including Node.js and a library [stacktrace-gps](https://github.com/stacktracejs/stacktrace-gps)) use name guessing, and [pasta-sourcemaps](https://github.com/bloomberg/pasta-sourcemaps) uses precompilation and a custom extension to reconstruct a stack trace. | ||||||||||||||
|
||||||||||||||
## Goals | ||||||||||||||
|
||||||||||||||
* Improve post-hoc debugging capabilities | ||||||||||||||
* Preserve compatibility with existing tooling | ||||||||||||||
|
||||||||||||||
## Non-Goals | ||||||||||||||
|
||||||||||||||
Nothing specific at this time. | ||||||||||||||
|
||||||||||||||
## Use Cases | ||||||||||||||
|
||||||||||||||
*The following was taken from [an issue opened on the TypeScript repo](https://github.com/microsoft/TypeScript/issues/46695):* | ||||||||||||||
|
||||||||||||||
Suppose I receive the following information from my application: | ||||||||||||||
|
||||||||||||||
``` | ||||||||||||||
TypeError: Cannot read properties of undefined (reading 'x') | ||||||||||||||
at re.draw (bundle.js:1:16372) | ||||||||||||||
at re.drawLayer (bundle.js:1:14170) | ||||||||||||||
at be.draw (bundle.js:1:74592) | ||||||||||||||
at Te.render (bundle.js:1:114230) | ||||||||||||||
at Te.reflowCanvas (bundle.js:1:113897) | ||||||||||||||
at HTMLDocument.anonymous (bundle.js:1:135849) | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
The source map will resolve the first stack frame to `draw(src, sprite, x, y, frame)`, and will correctly point out that the failure here is that the undefined value is actually `frame`. However, *that is not the name of the function*. The function's name is `draw`, which is a member of the `Render` class. | ||||||||||||||
|
||||||||||||||
If I simply apply the value of the `names` array to the function, decoding the stack trace is not particularly useful. It would look something like this: | ||||||||||||||
|
||||||||||||||
``` | ||||||||||||||
TypeError: Cannot read properties of undefined (reading 'x') | ||||||||||||||
at frame (file1.ts:302:7) | ||||||||||||||
at draw (file1.ts:144:5) | ||||||||||||||
at Render (file2.ts:95:5) | ||||||||||||||
at drawArray (file3.ts:178:5) | ||||||||||||||
at render (file3.ts:155:5) | ||||||||||||||
at game (file4.ts:39:5) | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
Using the source map to navigate the source code by hand, I can reconstruct the original call stack: | ||||||||||||||
|
||||||||||||||
``` | ||||||||||||||
TypeError: Cannot read properties of undefined (reading 'x') | ||||||||||||||
at Render#draw (file1.ts:302:7) | ||||||||||||||
at Render#drawLayer (file1.ts:144:5) | ||||||||||||||
at GameObject#draw (file2.ts:95:5) | ||||||||||||||
at Game#render (file3.ts:178:5) | ||||||||||||||
at Game#reflowCanvas (file3.ts:155:5) | ||||||||||||||
at [anonymous function passed to addEventListener] (file4.ts:39:5) | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), this field is stored in a stateful way). | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is what is meant by "stateful", but this wasn't immediately obvious to me when first reading.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's not just that it's VLQ-encoded, as VLQs can be stateless. It's that many of the mappings in V3 are relative to the previous line entry. The comma- and semicolon-separation also are required. The new proposed There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||||||
|
||||||||||||||
## Proposed Solution | ||||||||||||||
|
||||||||||||||
We want to adopt the general approach taken by the `pasta-sourcemaps` ("Pretty (and) Accurate Stack Trace Analysis") library. However, instead of adding to the `names` field, we will add two fields. | ||||||||||||||
|
||||||||||||||
* Add two additional field to the source map: `scopes` and `scopeNames` | ||||||||||||||
* The `scopes` field should be a list of ranges and pointers into `scopeNames` | ||||||||||||||
* The ranges in the field should always point to **Original Source** locations. | ||||||||||||||
Comment on lines
+76
to
+78
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Untangling these from There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, I was thinking about this, too. Since the original change, I pivoted that there's only one |
||||||||||||||
|
||||||||||||||
### Practical example | ||||||||||||||
|
||||||||||||||
Suppose we have the following TypeScript source: | ||||||||||||||
|
||||||||||||||
```ts | ||||||||||||||
// example.ts | ||||||||||||||
1 function foo(bar: number): never { | ||||||||||||||
2 throw new Error('Intentional.'); | ||||||||||||||
3 } | ||||||||||||||
4 foo(); | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
Direct TypeScript compilation would emit something fairly similar: | ||||||||||||||
|
||||||||||||||
```js | ||||||||||||||
// example.js | ||||||||||||||
1 function foo(bar) { | ||||||||||||||
2 throw new Error('Intentional.'); | ||||||||||||||
3 } | ||||||||||||||
4 foo(); | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
But running it through a minifier would probably transform it to an IIFE lambda: | ||||||||||||||
|
||||||||||||||
```js | ||||||||||||||
// example.min.js | ||||||||||||||
1 (()=>{throw new Error('Intentional.')})(); | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
Examining the call stack of this looks something like: | ||||||||||||||
|
||||||||||||||
``` | ||||||||||||||
Error: Intentional. | ||||||||||||||
at <anonymous>:1:13 | ||||||||||||||
at <anonymous>:1:40 | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
We can get to the exact *locations* with source maps today, but assuming that the `mapping`'s 5th field ("the zero-based index into the `names` list associated with the segment) is encoded as it typically is, which is meant for live debugging, the decoded stack would actually look like this: | ||||||||||||||
|
||||||||||||||
``` | ||||||||||||||
Error: Intentional | ||||||||||||||
at throw (example.js:2:5) | ||||||||||||||
at foo (example.js:4:1) | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
What we want to get to from here would be, at minimum: | ||||||||||||||
|
||||||||||||||
``` | ||||||||||||||
Error: Intentional. | ||||||||||||||
at foo (example.js:2:5) | ||||||||||||||
at Global code (example.js:4:1) | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
### Improving the contents of the map | ||||||||||||||
|
||||||||||||||
Let us re-examine the original code in this example: | ||||||||||||||
|
||||||||||||||
```ts | ||||||||||||||
1 function foo(bar: number): never { | ||||||||||||||
2 throw new Error('Intentional.'); | ||||||||||||||
3 } | ||||||||||||||
4 foo(); | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
The scopes list should be as follows: | ||||||||||||||
|
||||||||||||||
1. Line 1 Col 1 -> Line 1 Col 34 (captures `function foo(bar: number): never ` including the trailing space): `Global code` | ||||||||||||||
1. Line 1 Col 35 -> Line 3 Col 1 (captures the entire body of `foo`): `foo` | ||||||||||||||
1. Line 3 Col 2 -> Line 4 Col 7 (captures the newline + invocation of `foo`): `Global code` | ||||||||||||||
Comment on lines
+146
to
+148
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Contrasting with pasta-sourcemaps: Pasta uses "nested" scopes. Referring to ^ as "listed" scopes for this thread, equivalent would be:
The spec doesn't seem to dwell on this, but some observations:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure I understand what "nested" scopes are here, but regarding the second comment: the approach to bisecting the
Doesn't that meet your requirements? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Cross-post from webpack/webpack#14996 (comment):
"Nested" scopes have one "entry" per scope in the source code. Nested will tend to result in fewer mappings, and a data structure that more semantically corresponds to the original source There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ahhh, I understand! Actually, the way I implemented it in TypeScript was the "nested" version, and while I think I was meaning for this to indicate the "logical" view of the scopes list, but you're right -- I did put this into the expected list of these in the spec. I'll get that fixed. |
||||||||||||||
|
||||||||||||||
By de-duplicating the names, that yields the expected `scopeNames` value of `['Global code', 'foo']`. | ||||||||||||||
|
||||||||||||||
Finally, `scopes` will need to be an array of strings, and this array should have the same number of entries as the `sources` field. Each string will be a comma-delimited list of Base64-encoded VLQs with the following set of entries (please note, all offsets are 0-based, in concurrence with the previous Source Maps spec): | ||||||||||||||
|
||||||||||||||
1. A value representing starting line, relative to the previous starting line value. If this is the first value, it is an absolute value. | ||||||||||||||
2. A value representing the absolute starting column number. | ||||||||||||||
3. A value representing the ending line, **relative to the starting line** value. | ||||||||||||||
4. A value representing the ending starting column number. | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Think this is a typo, and should be one of:
Or, the approach currently taken by pasta-sourcemaps:
pasta-sourcemaps encoding spec Some code samples and how these affect the ending column number values (for each we list the values for each of the above strategies on Sample Code 1 (Prettier default) class PackageJson {
parseFooBar() {
// foo
}
static reasonablyLongFunctionName() {
// bar
}
}
Sample Code 2 class PackageJson
{
parseFooBar()
{
// foo
}
static reasonablyLongFunctionName()
{
// bar
}
}
With Base64 VLQ mappings, smaller is better. Numbers [-15, 15] encode as a single Base64 character.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There may be similar considerations for the other fields too, I just opted to go a bit deeper on this one in particular There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Relative ending column may be preferable here to absolute. Now that I have the TypeScript compiler change, I'm trying to get it to compile itself to see how the performance characteristics change, and we can use iteration here to check which produces smaller values. My supposition, however, is that most often, the absolute values of ending column will generally be < the absolute values of the starting column, because that's usually how we write JS: function doSomething() { // <-- starting column
// ...
} // <-- ending column That's less true for transforming already-minified files, e.g., react-dom.production.min.js. I'll do some experiments with the TypeScript compiler and report back before we call this "completed" |
||||||||||||||
5. An index into the `scopeNames` array. | ||||||||||||||
|
||||||||||||||
This enhancement chooses to use a combination of relative and absolute numbers because of the nature of the values in question. For source line offsets, using relative encoding makes sense, because the numbers are always expected to be increasing. For column numbers, however, it is highly likely that, because these values represent locations in the *source* file, offsets from starting- to ending-location would be negative, and further, that offsets from an outer-scoped function to an inner-scoped function would be negative. Given that one might expect most source lines to be less than 120 characters in length, there does not appear to be a major need to encode the column as relative. | ||||||||||||||
|
||||||||||||||
Given the above list, we should then encode the above values: | ||||||||||||||
|
||||||||||||||
```json | ||||||||||||||
{ | ||||||||||||||
"sources": ["example.js"], | ||||||||||||||
"scopeNames": ["Global code", "foo"], | ||||||||||||||
"scopes": [ | ||||||||||||||
"[0,0,0,33,0],[0,34,2,0,1],[2,1,1,6,0]" | ||||||||||||||
] | ||||||||||||||
} | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
Or, when encoded with VLQ, as: | ||||||||||||||
|
||||||||||||||
```json | ||||||||||||||
{ | ||||||||||||||
"sources": ["example.js"], | ||||||||||||||
"scopeNames": ["Global code", "foo"], | ||||||||||||||
"scopes": [ | ||||||||||||||
"AAAiCA,AkCEAC,ECCMA" | ||||||||||||||
] | ||||||||||||||
} | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
### Specific naming recommendations | ||||||||||||||
|
||||||||||||||
The following recommendations are **recommendations only** and not part of the "standard". They are, rather, a best attempt to identify how common examples can produce high-fidelity diagnostics experiences. | ||||||||||||||
|
||||||||||||||
* **Class names included**: Class member functions should indicate that they are members of classes. Although a common JavaScript convention has been to use `#` as a shorthand for `Class.prototype.member`, the inclusion of `#` as valid syntax for `private` field shorthand suggests that this should not be used. This spec recommends that the expansion `ClassName.functionName` should be used for instance functions, and `static ClassName.functionName` for static functions. | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Contrasting this with pasta-sourcemaps (from here): method: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think in the first pass of the TS compiler, I rendered these as: method: I'm open to changing though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think not including a |
||||||||||||||
* **Anonymous callback name inference**: Callback functions are often anonymous, but the name of the function being invoked is typically interesting. In the following example, an interesting function name might be something along the lines of `anonymous function passed to map`: | ||||||||||||||
|
||||||||||||||
```ts | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Style Nit: Indenting the code blocks with 4 spaces will bring them in line with the |
||||||||||||||
return myArray.map((item, index) => { | ||||||||||||||
/// ... | ||||||||||||||
}); | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
* **Computed names expressions**: Computed names also present an interesting problem of data that are not available at build time. As best as possible, the original source text should be simply preserved: | ||||||||||||||
|
||||||||||||||
```ts | ||||||||||||||
class Example { | ||||||||||||||
static [Symbol.species]() { return Object; } | ||||||||||||||
['to' + 'String']() { | ||||||||||||||
return 'Hello, world.'; | ||||||||||||||
} | ||||||||||||||
} | ||||||||||||||
``` | ||||||||||||||
|
||||||||||||||
* In the above example, we would want to see `static Example.[Symbol.species]` and `Example.['to' + 'String']` in the `scopeNames` field of the source map. | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Contrasting with pasta-sourcemaps (from here): Looks like it yields
|
||||||||||||||
|
||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There's a few related discussions that are in-depth enough to ensure readers of this explainer are aware of. |
||||||||||||||
## Privacy and Security Considerations | ||||||||||||||
|
||||||||||||||
### Privacy | ||||||||||||||
|
||||||||||||||
There should be no privacy considerations with this change. | ||||||||||||||
|
||||||||||||||
### Security | ||||||||||||||
|
||||||||||||||
There should be no security concerns with this change. | ||||||||||||||
|
||||||||||||||
## Alternative Solutions | ||||||||||||||
|
||||||||||||||
The original change proposed with the issue was to enhance the `mappings` array: | ||||||||||||||
|
||||||||||||||
> Substantively: Only the `mappings` field would be altered, and would be altered by adding the 6th field. The complete section is included below: | ||||||||||||||
> | ||||||||||||||
> > The “mappings” data is broken down as follows: | ||||||||||||||
> > | ||||||||||||||
> > - each group representing a line in the generated file is separated by a ”;” | ||||||||||||||
> > - each segment is separated by a “,” | ||||||||||||||
> > - each segment is made up of 1,4<s> or 5</s>**, 5, or 6** variable length fields. | ||||||||||||||
> > | ||||||||||||||
> > The fields in each segment are: | ||||||||||||||
> > | ||||||||||||||
> > 1. The zero-based starting column of the line in the generated code that the segment represents. If this is the first field of the first segment, or the first segment following a new generated line (“;”), then this field holds the whole base 64 VLQ. Otherwise, this field contains a base 64 VLQ that is relative to the previous occurrence of this field. Note that this is different than the fields below because the previous value is reset after every generated line. | ||||||||||||||
> > 2. If present, an zero-based index into the “sources” list. This field is a base 64 VLQ relative to the previous occurrence of this field, unless this is the first occurrence of this field, in which case the whole value is represented. | ||||||||||||||
> > 3. If present, the zero-based starting line in the original source represented. This field is a base 64 VLQ relative to the previous occurrence of this field, unless this is the first occurrence of this field, in which case the whole value is represented. Always present if there is a source field. | ||||||||||||||
> > 4. If present, the zero-based starting column of the line in the source represented. This field is a base 64 VLQ relative to the previous occurrence of this field, unless this is the first occurrence of this field, in which case the whole value is represented. Always present if there is a source field. | ||||||||||||||
> > 5. If present, the zero-based index into the “names” list associated with this segment. This field is a base 64 VLQ relative to the previous occurrence of this field, unless this is the first occurrence of this field, in which case the whole value is represented. | ||||||||||||||
> > 6. **If present, the zero-based index into the "names" list associated with the call stack of this segment. This field is a base 64 VLQ relative to the previous occurrence of this field, unless this is the first occurrence of this field, in which case the whole value is represented.** | ||||||||||||||
> | ||||||||||||||
> In addition, the `version` field of the spec should be bumped to 4. | ||||||||||||||
|
||||||||||||||
However, this is likely to require modifications across a tool chain. Supposing we had a tool chain that required, e.g., `typescript`, `webpack`, and `terser`, all of these tools would need to be modified to be aware of the 6th quantity in the VLQ of each mapping, *even if it was just to pass them through* (which is likely the case for this particular use case). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to jump back and forth between these three stacks a few times to fully grock what was shifting (both good and bad) between them, so trying to add a touch more detail explaining the differences.