-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Source Maps v4 Proposal #538
Conversation
This public enhancement proposal updates the existing Source Maps spec to improve stack trace decoding capabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really excited by these developments! I like the clear examples in the explainer, they make it very easy to ramp up on the problem space.
I've added various inline comments drawing comparisons with the existing pasta design. Regarding the naming recommendations, pasta was inspired mostly by representations seen in places in the devtools. I don't believe there to be one "right" answer though! Looking in the parser fixtures folder you can see other choices that were made, including cases where functions were nested (including deeply) in objects.
Pasta is currently a little lacking in the experimentation/playground/REPL department, but we could iterate on that. It could be a useful starting point for iterating on different designs as it contains most of the parsing/business-logic needed already.
at [anonymous function passed to addEventListener] (file4.ts:39:5) | ||
``` | ||
|
||
But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), this field is stored in a stateful way). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is what is meant by "stateful", but this wasn't immediately obvious to me when first reading.
But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), this field is stored in a stateful way). | |
But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), this field is encoded as [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity)). | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not just that it's VLQ-encoded, as VLQs can be stateless. It's that many of the mappings in V3 are relative to the previous line entry. The comma- and semicolon-separation also are required. The new proposed scopes
VLQs are a bit less stateful, but are still stateful from VLQ to VLQ entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), this field is stored in a stateful way). | |
But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), the position offsets of each mapping are relative to the one before). |
DevTools/SourceMaps/explainer.md
Outdated
1. A value representing starting line, relative to the previous starting line value. If this is the first value, it is an absolute value. | ||
2. A value representing the absolute starting column number. | ||
3. A value representing the ending line, **relative to the starting line** value. | ||
4. A value representing the ending starting column number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think this is a typo, and should be one of:
A value representing the ending column number, **relative to the starting column number** value.
A value representing the absolute ending column number.
Or, the approach currently taken by pasta-sourcemaps:
A value representing the ending column, relative to the previous mapping's ending column number
pasta-sourcemaps encoding spec
Some code samples and how these affect the ending column number values (for each we list the values for each of the above strategies on PackageJson
, parseFooBar
and reasonablyLongFunctionName
, respectively):
Sample Code 1 (Prettier default)
class PackageJson {
parseFooBar() {
// foo
}
static reasonablyLongFunctionName() {
// bar
}
}
-18, -14, -36
0, 2, 2
0, 2, 0
Sample Code 2
class PackageJson
{
parseFooBar()
{
// foo
}
static reasonablyLongFunctionName()
{
// bar
}
}
0, 0, 0
0, 2, 2
0, 2, 0
With Base64 VLQ mappings, smaller is better. Numbers [-15, 15] encode as a single Base64 character.
- Suffers from large starting/ending column numbers for Prettier-style code, bounded only by developer apetite for long function names.
- Performs reasonable, will be multiples of "tab-width", deep nesting will dip outside [-15, 15]
- Closing braces will tend to jump by positive/negative "tab-width", usually 2, 4 or 8, so typically always falls in [-15, 15] range
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be similar considerations for the other fields too, I just opted to go a bit deeper on this one in particular
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relative ending column may be preferable here to absolute. Now that I have the TypeScript compiler change, I'm trying to get it to compile itself to see how the performance characteristics change, and we can use iteration here to check which produces smaller values. My supposition, however, is that most often, the absolute values of ending column will generally be < the absolute values of the starting column, because that's usually how we write JS:
function doSomething() { // <-- starting column
// ...
} // <-- ending column
That's less true for transforming already-minified files, e.g., react-dom.production.min.js.
I'll do some experiments with the TypeScript compiler and report back before we call this "completed"
|
||
The following recommendations are **recommendations only** and not part of the "standard". They are, rather, a best attempt to identify how common examples can produce high-fidelity diagnostics experiences. | ||
|
||
* **Class names included**: Class member functions should indicate that they are members of classes. Although a common JavaScript convention has been to use `#` as a shorthand for `Class.prototype.member`, the inclusion of `#` as valid syntax for `private` field shorthand suggests that this should not be used. This spec recommends that the expansion `ClassName.functionName` should be used for instance functions, and `static ClassName.functionName` for static functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Contrasting this with pasta-sourcemaps (from here):
method: ClassName.prototype.methodName
getter: ClassName.prototype.get getterName
(similar for setter)
static method: ClassName.staticMethodName
static getter: ClassName.get staticGetterName
(similar for setter)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in the first pass of the TS compiler, I rendered these as:
method: ClassName.methodName
getter: ClassName.get getterName
static method: static ClassName.staticMethodName
static getter: Not sure that I handled this specific one, but it would have been rendered static ClassName.get getterName
I'm open to changing though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not including a prototype
in the instance methods will lead to confusion with static methods, because people call the static methods as Foo.bar()
matching the ClassName.methodName
pattern.
} | ||
``` | ||
|
||
* In the above example, we would want to see `static Example.[Symbol.species]` and `Example.['to' + 'String']` in the `scopeNames` field of the source map. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Contrasting with pasta-sourcemaps (from here):
Looks like it yields
<computed: identifier>
<computed: stringLiteral>
<computed: numericLiteral>
<computed>
(else this)
1. Line 1 Col 1 -> Line 1 Col 34 (captures `function foo(bar: number): never ` including the trailing space): `Global code` | ||
1. Line 1 Col 35 -> Line 3 Col 1 (captures the entire body of `foo`): `foo` | ||
1. Line 3 Col 2 -> Line 4 Col 7 (captures the newline + invocation of `foo`): `Global code` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Contrasting with pasta-sourcemaps:
Pasta uses "nested" scopes. Referring to ^ as "listed" scopes for this thread, equivalent would be:
- Line 1 Col 1 -> Line 4 Col 7 (captures until end of invocation of
foo
):<top-level>
- Line 1 Col 35 -> Line 3 Col 1 (captures the entire body of
foo
):foo
The spec doesn't seem to dwell on this, but some observations:
- "nested" is smaller
- Can't recursively bisect the "nested"
scopes
to find position- VLQ doesn't permit this anyway, must be linearly unpacked
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand what "nested" scopes are here, but regarding the second comment: the approach to bisecting the scopes
:
- Parse all of them
- Sort by start line, then start column
- Now you can binary search, you should look for the last entry that starts before and ends after the mapping position
Doesn't that meet your requirements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cross-post from webpack/webpack#14996 (comment):
// *nested* // *linear* | "<top-level>" | "<top-level>" | | | | "causeError" vs | "causeError" | | | | | "<top-level>"
"Nested" scopes have one "entry" per scope in the source code.
"Linear" scopes have an entry each time you transition from one scope to another, left to right, top to bottom.
Nested will tend to result in fewer mappings, and a data structure that more semantically corresponds to the original source
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh, I understand! Actually, the way I implemented it in TypeScript was the "nested" version, and while I think I was meaning for this to indicate the "logical" view of the scopes list, but you're right -- I did put this into the expected list of these in the spec. I'll get that fixed.
* Add two additional field to the source map: `scopes` and `scopeNames` | ||
* The `scopes` field should be a list of ranges and pointers into `scopeNames` | ||
* The ranges in the field should always point to **Original Source** locations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Untangling these from names
is probably cleaner. Might be be worth speccing here the expectation of the sourcemap "apply" operation needed to combine sourcemaps? (considering that a sourcemap combine stage may be combining N earlier maps with 1 new map in the case of a transformation like inlining).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I was thinking about this, too. Since the original change, I pivoted that there's only one scopes
field -- not an array of scopes
fields -- but the bundler will need to process these either way. I'm working on that with webpack/webpack#14996.
Source Maps v4 standard proposal is an addendum to source maps to improve post-hoc debugging and analysis of call stacks. This proposal is being put forward in [https://github.com/MicrosoftEdge/MSEdgeExplainers/pull/538](Microsoft Edge) and is designed to enable an analysis of stack traces to reconstruct an unminified stack trace. To do so, two additional fields are added to the source maps output: - `scopeNames` - `scopes` The latter field behaves similarly to the existing `mappings` field. The former list are names that should be displayed for a given function once it is determined to be responsible. The format of the `scopes` field is a Base64 VLQ with 6 fields, which are, in order: - Source file index - Source line (relative offset) - Source character (relative offset) - End line (relative offset, should always be positive) - End character (relative offset) - Index into `scopeNames` (This first commit does not implement the VLQ encoding; rather, all number values are directly passed in and are neither relative nor are they encoded as VLQs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've proposed a couple changes to help reviewers. One to elaborate a bit on the differences between the stacks since all of them didn't jump out to me immediately, and another to link to the related conversations since there's good discussions happening in other repos.
at game (file4.ts:39:5) | ||
``` | ||
|
||
Using the source map to navigate the source code by hand, I can reconstruct the original call stack: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the source map to navigate the source code by hand, I can reconstruct the original call stack: | |
The source line information here is improved and allows me to more easily find the source files and lines, but the function names are modified and now incorrect. Using the source map to navigate the source code by hand, I can reconstruct the original call stack including the class full names and anonymous functions: | |
I had to jump back and forth between these three stacks a few times to fully grock what was shifting (both good and bad) between them, so trying to add a touch more detail explaining the differences.
``` | ||
|
||
* In the above example, we would want to see `static Example.[Symbol.species]` and `Example.['to' + 'String']` in the `scopeNames` field of the source map. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
## Related Discussions | |
This proposal has a handful of discussions across related GitHub repos: | |
- [TypeScript Issue](https://github.com/microsoft/TypeScript/issues/46695) | |
- [TypeScript Prototype PR](https://github.com/microsoft/TypeScript/pull/47152) | |
- [WebPack Proposal](https://github.com/webpack/webpack/issues/14996) |
There's a few related discussions that are in-depth enough to ensure readers of this explainer are aware of.
2. A value representing starting line, relative to the previous starting line value. If this is the first value, it is an absolute value. | ||
3. A value representing the absolute starting column number. | ||
4. A value representing the ending line, **relative to the starting line** value. | ||
5. A value representing the ending starting column number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming this is also an absolute column number.
5. A value representing the ending starting column number. | |
5. A value representing the absolute ending column number. |
3. A value representing the absolute starting column number. | ||
4. A value representing the ending line, **relative to the starting line** value. | ||
5. A value representing the ending starting column number. | ||
6. An index into the `scopeNames` array. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this relative to the previous scopeNames
index?
{ | ||
"sources": ["example.js"], | ||
"scopeNames": ["Global code", "foo"], | ||
"scopes": "AAAAiCA,AAkCEAC,AECCMA" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If scopeName
index is relative to the previous one, this should decrement:
"scopes": "AAAAiCA,AAkCEAC,AECCMA" | |
"scopes": "AAAAiCA,AAkCEAC,AECCMD" | |
|
||
The following recommendations are **recommendations only** and not part of the "standard". They are, rather, a best attempt to identify how common examples can produce high-fidelity diagnostics experiences. | ||
|
||
* **Class names included**: Class member functions should indicate that they are members of classes. Although a common JavaScript convention has been to use `#` as a shorthand for `Class.prototype.member`, the inclusion of `#` as valid syntax for `private` field shorthand suggests that this should not be used. This spec recommends that the expansion `ClassName.functionName` should be used for instance functions, and `static ClassName.functionName` for static functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not including a prototype
in the instance methods will lead to confusion with static methods, because people call the static methods as Foo.bar()
matching the ClassName.methodName
pattern.
Finally, `scopes` will be a comma-delimited list of Base64-encoded VLQs with the following set of entries (please note, all offsets are 0-based, in concurrence with the previous Source Maps spec): | ||
|
||
1. A value representing the current source index, relative to the previous source index value. If this is the first value, it is an absolute value. | ||
2. A value representing starting line, relative to the previous starting line value. If this is the first value, it is an absolute value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the source line/column values necessary for something? Are you just trying to translate from the unminified frame's line/column into a scope name? Or is there something more?
I think you're trying to create start and end regions of the source file. You then take an unminified line/coumn and lookup the region that contains it. You use that to translate into a scope name.
If so, this can be made much smaller by index ranges of the mappings
field. Eg, running your TypeScript input through babel generates
{
"version": 3,
"sources": [ "input.js" ],
"names": [
"foo",
"bar",
"Error"
],
"mappings": "AAAA,QAASA,CAAAA,GAAT,CAAaC,GAAb,CAAiC,CAC7B,KAAM,IAAIC,CAAAA,KAAJ,CAAU,cAAV,CAAN,CACH,CACDF,GAAG",
"sourcesContent": [
"function foo(bar: number): never {\n throw new Error('Intentional.');\n}\nfoo();"
]
}
Decoding the mappings gives us:
[
[
// This represents `Global Code` scope
[ 0, 0, 0, 0 ], // `function`
[ 8, 0, 0, 9, 0 ], // ` `
[ 9, 0, 0, 9, 0 ], // `foo`
[ 12, 0, 0, 0 ], // `(`
[ 13, 0, 0, 13, 1 ], // `bar`
[ 16, 0, 0, 0 ], // `)`
// This represents `foo` scope
[ 17, 0, 0, 33 ], // `{`
[ 18, 0, 1, 4 ], // `throw`
[ 23, 0, 1, 10 ], // ` new`
[ 27, 0, 1, 14, 2 ], // ` `
[ 28, 0, 1, 14, 2 ], // `Error`
[ 33, 0, 1, 10 ], // `(`
[ 34, 0, 1, 20 ], // `'Intentional.'`
[ 48, 0, 1, 10 ], // `)`
[ 49, 0, 1, 4 ], // `;`
[ 50, 0, 2, 1 ], // `}`
// We're back to `Global Code` scope
[ 51, 0, 3, 0, 0 ], // `foo`
[ 54, 0, 3, 3 ], // `();`
]
]
We have 3 ranges: Indexes 0
through 5
(inclusive) represent the Global Code
scope of function foo(bar)
, 6
through 15
represent foo
scope of {throw new Error('Intentional.');}
, and 16
till the end represent Global Code
scope of foo();
.
(This example is unfortunately just a single line of output. When there are multiple lines, the indexes should still be considered contiguous. Eg, 5 mappings on output line 1 means the first mapping of line 2 is at index 5
.)
This can easily be represented with just 6 numbers:
// Decoded format is using absolute indices, when encoding into VLQ make it relatives.
{
scopes: [
// First item is `mappings` index, second is `scopeNames` index
// mapping 0 till the next mapping point at scopeName 0
[0, 0],
// mapping 6 till the next mapping point at scopeName 1
[6, 1],
// mapping 16 and all later mappings point at scopeName 0
[16, 0],
],
scopeNames: ['Global Code', 'foo'],
}
Encoding this into VLQ (using relative to previous values) gives us:
{
scopes: "AA,MC,UD",
scopeNames: ['Global Code', 'foo'],
}
This has the benefit of not requiring a binary search to find the correct region, since it can be extracted while decoding the mappings
field.
* **Class names included**: Class member functions should indicate that they are members of classes. Although a common JavaScript convention has been to use `#` as a shorthand for `Class.prototype.member`, the inclusion of `#` as valid syntax for `private` field shorthand suggests that this should not be used. This spec recommends that the expansion `ClassName.functionName` should be used for instance functions, and `static ClassName.functionName` for static functions. | ||
* **Anonymous callback name inference**: Callback functions are often anonymous, but the name of the function being invoked is typically interesting. In the following example, an interesting function name might be something along the lines of `anonymous function passed to myArray.map`: | ||
|
||
```ts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style Nit: Indenting the code blocks with 4 spaces will bring them in line with the <li>
's padding.
@robpaveza has suggested that we close this PR and focus our collected efforts on tc39/ecma426#16 where more of a critical mass of relevant community members are engaging and where discussion on the details of the proposals is continuing. |
This public enhancement proposal updates the existing Source Maps spec to improve stack trace decoding capabilities.
I would like to keep this PR open for a period of time. I believe that we will be interested in implementing this in Edge and Chromium DevTools, but I would like to keep the discussion open on our PR for public comment here (as well as in the TypeScript repo).