Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source Maps v4 Proposal #538

Closed
wants to merge 2 commits into from
Closed

Conversation

robpaveza
Copy link
Contributor

This public enhancement proposal updates the existing Source Maps spec to improve stack trace decoding capabilities.

I would like to keep this PR open for a period of time. I believe that we will be interested in implementing this in Edge and Chromium DevTools, but I would like to keep the discussion open on our PR for public comment here (as well as in the TypeScript repo).

This public enhancement proposal updates the existing Source Maps spec
to improve stack trace decoding capabilities.
Copy link

@tchetwin tchetwin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really excited by these developments! I like the clear examples in the explainer, they make it very easy to ramp up on the problem space.

I've added various inline comments drawing comparisons with the existing pasta design. Regarding the naming recommendations, pasta was inspired mostly by representations seen in places in the devtools. I don't believe there to be one "right" answer though! Looking in the parser fixtures folder you can see other choices that were made, including cases where functions were nested (including deeply) in objects.

Pasta is currently a little lacking in the experimentation/playground/REPL department, but we could iterate on that. It could be a useful starting point for iterating on different designs as it contains most of the parsing/business-logic needed already.

at [anonymous function passed to addEventListener] (file4.ts:39:5)
```

But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), this field is stored in a stateful way).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is what is meant by "stateful", but this wasn't immediately obvious to me when first reading.

Suggested change
But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), this field is stored in a stateful way).
But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), this field is encoded as [VLQ](https://en.wikipedia.org/wiki/Variable-length_quantity)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just that it's VLQ-encoded, as VLQs can be stateless. It's that many of the mappings in V3 are relative to the previous line entry. The comma- and semicolon-separation also are required. The new proposed scopes VLQs are a bit less stateful, but are still stateful from VLQ to VLQ entry.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), this field is stored in a stateful way).
But doing this required that I dump the mappings and the source files from the source map and manually inspect the source files. In general, this requires that I use a library to parse the `mappings` field from Source Maps (because [as of Source Maps v3](https://sourcemaps.info/spec.html), the position offsets of each mapping are relative to the one before).

1. A value representing starting line, relative to the previous starting line value. If this is the first value, it is an absolute value.
2. A value representing the absolute starting column number.
3. A value representing the ending line, **relative to the starting line** value.
4. A value representing the ending starting column number.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this is a typo, and should be one of:

  1. A value representing the ending column number, **relative to the starting column number** value.
  2. A value representing the absolute ending column number.

Or, the approach currently taken by pasta-sourcemaps:

  1. A value representing the ending column, relative to the previous mapping's ending column number

pasta-sourcemaps encoding spec

Some code samples and how these affect the ending column number values (for each we list the values for each of the above strategies on PackageJson, parseFooBar and reasonablyLongFunctionName, respectively):

Sample Code 1 (Prettier default)

class PackageJson {
  parseFooBar() {
    // foo
  }

  static reasonablyLongFunctionName() {
    // bar
  }
}
  1. -18, -14, -36
  2. 0, 2, 2
  3. 0, 2, 0

Sample Code 2

class PackageJson
{
  parseFooBar()
  {
    // foo
  }

  static reasonablyLongFunctionName()
  {
    // bar
  }
}
  1. 0, 0, 0
  2. 0, 2, 2
  3. 0, 2, 0

With Base64 VLQ mappings, smaller is better. Numbers [-15, 15] encode as a single Base64 character.

  1. Suffers from large starting/ending column numbers for Prettier-style code, bounded only by developer apetite for long function names.
  2. Performs reasonable, will be multiples of "tab-width", deep nesting will dip outside [-15, 15]
  3. Closing braces will tend to jump by positive/negative "tab-width", usually 2, 4 or 8, so typically always falls in [-15, 15] range

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be similar considerations for the other fields too, I just opted to go a bit deeper on this one in particular

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relative ending column may be preferable here to absolute. Now that I have the TypeScript compiler change, I'm trying to get it to compile itself to see how the performance characteristics change, and we can use iteration here to check which produces smaller values. My supposition, however, is that most often, the absolute values of ending column will generally be < the absolute values of the starting column, because that's usually how we write JS:

function doSomething() { // <-- starting column
 // ...
} // <-- ending column

That's less true for transforming already-minified files, e.g., react-dom.production.min.js.

I'll do some experiments with the TypeScript compiler and report back before we call this "completed"


The following recommendations are **recommendations only** and not part of the "standard". They are, rather, a best attempt to identify how common examples can produce high-fidelity diagnostics experiences.

* **Class names included**: Class member functions should indicate that they are members of classes. Although a common JavaScript convention has been to use `#` as a shorthand for `Class.prototype.member`, the inclusion of `#` as valid syntax for `private` field shorthand suggests that this should not be used. This spec recommends that the expansion `ClassName.functionName` should be used for instance functions, and `static ClassName.functionName` for static functions.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contrasting this with pasta-sourcemaps (from here):

method: ClassName.prototype.methodName
getter: ClassName.prototype.get getterName (similar for setter)
static method: ClassName.staticMethodName
static getter: ClassName.get staticGetterName (similar for setter)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in the first pass of the TS compiler, I rendered these as:

method: ClassName.methodName
getter: ClassName.get getterName
static method: static ClassName.staticMethodName
static getter: Not sure that I handled this specific one, but it would have been rendered static ClassName.get getterName

I'm open to changing though.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not including a prototype in the instance methods will lead to confusion with static methods, because people call the static methods as Foo.bar() matching the ClassName.methodName pattern.

}
```

* In the above example, we would want to see `static Example.[Symbol.species]` and `Example.['to' + 'String']` in the `scopeNames` field of the source map.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contrasting with pasta-sourcemaps (from here):

Looks like it yields

  • <computed: identifier>
  • <computed: stringLiteral>
  • <computed: numericLiteral>
  • <computed> (else this)

Comment on lines +146 to +148
1. Line 1 Col 1 -> Line 1 Col 34 (captures `function foo(bar: number): never ` including the trailing space): `Global code`
1. Line 1 Col 35 -> Line 3 Col 1 (captures the entire body of `foo`): `foo`
1. Line 3 Col 2 -> Line 4 Col 7 (captures the newline + invocation of `foo`): `Global code`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contrasting with pasta-sourcemaps:

Pasta uses "nested" scopes. Referring to ^ as "listed" scopes for this thread, equivalent would be:

  1. Line 1 Col 1 -> Line 4 Col 7 (captures until end of invocation of foo): <top-level>
  2. Line 1 Col 35 -> Line 3 Col 1 (captures the entire body of foo): foo

The spec doesn't seem to dwell on this, but some observations:

  • "nested" is smaller
  • Can't recursively bisect the "nested" scopes to find position
    • VLQ doesn't permit this anyway, must be linearly unpacked

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what "nested" scopes are here, but regarding the second comment: the approach to bisecting the scopes:

  • Parse all of them
  • Sort by start line, then start column
  • Now you can binary search, you should look for the last entry that starts before and ends after the mapping position

Doesn't that meet your requirements?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross-post from webpack/webpack#14996 (comment):

// *nested*                            // *linear*
| "<top-level>"                        | "<top-level>"
|                                      |
|  |  "causeError"         vs            |  "causeError"
|  |                                     |
|                                      | "<top-level>"

"Nested" scopes have one "entry" per scope in the source code.
"Linear" scopes have an entry each time you transition from one scope to another, left to right, top to bottom.

Nested will tend to result in fewer mappings, and a data structure that more semantically corresponds to the original source

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh, I understand! Actually, the way I implemented it in TypeScript was the "nested" version, and while I think I was meaning for this to indicate the "logical" view of the scopes list, but you're right -- I did put this into the expected list of these in the spec. I'll get that fixed.

Comment on lines +76 to +78
* Add two additional field to the source map: `scopes` and `scopeNames`
* The `scopes` field should be a list of ranges and pointers into `scopeNames`
* The ranges in the field should always point to **Original Source** locations.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Untangling these from names is probably cleaner. Might be be worth speccing here the expectation of the sourcemap "apply" operation needed to combine sourcemaps? (considering that a sourcemap combine stage may be combining N earlier maps with 1 new map in the case of a transformation like inlining).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I was thinking about this, too. Since the original change, I pivoted that there's only one scopes field -- not an array of scopes fields -- but the bundler will need to process these either way. I'm working on that with webpack/webpack#14996.

robpaveza referenced this pull request in robpaveza/TypeScript Dec 13, 2021
Source Maps v4 standard proposal is an addendum to source maps to
improve post-hoc debugging and analysis of call stacks. This proposal is
being put forward in
[https://github.com/MicrosoftEdge/MSEdgeExplainers/pull/538](Microsoft Edge)
and is designed to enable an analysis of stack traces to reconstruct an
unminified stack trace.

To do so, two additional fields are added to the source maps output:
 - `scopeNames`
 - `scopes`

The latter field behaves similarly to the existing `mappings` field. The
former list are names that should be displayed for a given function once
it is determined to be responsible. The format of the `scopes` field is
a Base64 VLQ with 6 fields, which are, in order:
 - Source file index
 - Source line (relative offset)
 - Source character (relative offset)
 - End line (relative offset, should always be positive)
 - End character (relative offset)
 - Index into `scopeNames`

(This first commit does not implement the VLQ encoding; rather, all
number values are directly passed in and are neither relative nor are
they encoded as VLQs).
@robpaveza robpaveza changed the title First pass at v3.1 enhancement of source maps. Source Maps v4 Proposal Dec 15, 2021
Copy link
Contributor

@aluhrs13 aluhrs13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've proposed a couple changes to help reviewers. One to elaborate a bit on the differences between the stacks since all of them didn't jump out to me immediately, and another to link to the related conversations since there's good discussions happening in other repos.

at game (file4.ts:39:5)
```

Using the source map to navigate the source code by hand, I can reconstruct the original call stack:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Using the source map to navigate the source code by hand, I can reconstruct the original call stack:
The source line information here is improved and allows me to more easily find the source files and lines, but the function names are modified and now incorrect. Using the source map to navigate the source code by hand, I can reconstruct the original call stack including the class full names and anonymous functions:

I had to jump back and forth between these three stacks a few times to fully grock what was shifting (both good and bad) between them, so trying to add a touch more detail explaining the differences.

```

* In the above example, we would want to see `static Example.[Symbol.species]` and `Example.['to' + 'String']` in the `scopeNames` field of the source map.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
## Related Discussions
This proposal has a handful of discussions across related GitHub repos:
- [TypeScript Issue](https://github.com/microsoft/TypeScript/issues/46695)
- [TypeScript Prototype PR](https://github.com/microsoft/TypeScript/pull/47152)
- [WebPack Proposal](https://github.com/webpack/webpack/issues/14996)

There's a few related discussions that are in-depth enough to ensure readers of this explainer are aware of.

2. A value representing starting line, relative to the previous starting line value. If this is the first value, it is an absolute value.
3. A value representing the absolute starting column number.
4. A value representing the ending line, **relative to the starting line** value.
5. A value representing the ending starting column number.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming this is also an absolute column number.

Suggested change
5. A value representing the ending starting column number.
5. A value representing the absolute ending column number.

3. A value representing the absolute starting column number.
4. A value representing the ending line, **relative to the starting line** value.
5. A value representing the ending starting column number.
6. An index into the `scopeNames` array.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this relative to the previous scopeNames index?

{
"sources": ["example.js"],
"scopeNames": ["Global code", "foo"],
"scopes": "AAAAiCA,AAkCEAC,AECCMA"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If scopeName index is relative to the previous one, this should decrement:

Suggested change
"scopes": "AAAAiCA,AAkCEAC,AECCMA"
"scopes": "AAAAiCA,AAkCEAC,AECCMD"


The following recommendations are **recommendations only** and not part of the "standard". They are, rather, a best attempt to identify how common examples can produce high-fidelity diagnostics experiences.

* **Class names included**: Class member functions should indicate that they are members of classes. Although a common JavaScript convention has been to use `#` as a shorthand for `Class.prototype.member`, the inclusion of `#` as valid syntax for `private` field shorthand suggests that this should not be used. This spec recommends that the expansion `ClassName.functionName` should be used for instance functions, and `static ClassName.functionName` for static functions.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not including a prototype in the instance methods will lead to confusion with static methods, because people call the static methods as Foo.bar() matching the ClassName.methodName pattern.

Finally, `scopes` will be a comma-delimited list of Base64-encoded VLQs with the following set of entries (please note, all offsets are 0-based, in concurrence with the previous Source Maps spec):

1. A value representing the current source index, relative to the previous source index value. If this is the first value, it is an absolute value.
2. A value representing starting line, relative to the previous starting line value. If this is the first value, it is an absolute value.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the source line/column values necessary for something? Are you just trying to translate from the unminified frame's line/column into a scope name? Or is there something more?

I think you're trying to create start and end regions of the source file. You then take an unminified line/coumn and lookup the region that contains it. You use that to translate into a scope name.

If so, this can be made much smaller by index ranges of the mappings field. Eg, running your TypeScript input through babel generates

{
  "version": 3,
  "sources": [ "input.js" ],
  "names": [
    "foo",
    "bar",
    "Error"
  ],
  "mappings": "AAAA,QAASA,CAAAA,GAAT,CAAaC,GAAb,CAAiC,CAC7B,KAAM,IAAIC,CAAAA,KAAJ,CAAU,cAAV,CAAN,CACH,CACDF,GAAG",
  "sourcesContent": [
    "function foo(bar: number): never {\n    throw new Error('Intentional.');\n}\nfoo();"
  ]
}

Decoding the mappings gives us:

[
  [
    // This represents `Global Code` scope
    [ 0, 0, 0, 0 ],       // `function`
    [ 8, 0, 0, 9, 0 ],    // ` `
    [ 9, 0, 0, 9, 0 ],    // `foo`
    [ 12, 0, 0, 0 ],      // `(`
    [ 13, 0, 0, 13, 1 ],  // `bar`
    [ 16, 0, 0, 0 ],      // `)`

    // This represents `foo` scope
    [ 17, 0, 0, 33 ],     // `{`
    [ 18, 0, 1, 4 ],      // `throw`
    [ 23, 0, 1, 10 ],     // ` new`
    [ 27, 0, 1, 14, 2 ],  // ` `
    [ 28, 0, 1, 14, 2 ],  // `Error`
    [ 33, 0, 1, 10 ],     // `(`
    [ 34, 0, 1, 20 ],     // `'Intentional.'`
    [ 48, 0, 1, 10 ],     // `)`
    [ 49, 0, 1, 4 ],      // `;`
    [ 50, 0, 2, 1 ],      // `}`

    // We're back to `Global Code` scope
    [ 51, 0, 3, 0, 0 ],   // `foo`
    [ 54, 0, 3, 3 ],      // `();`
  ]
]

We have 3 ranges: Indexes 0 through 5 (inclusive) represent the Global Code scope of function foo(bar), 6 through 15 represent foo scope of {throw new Error('Intentional.');}, and 16 till the end represent Global Code scope of foo();.

(This example is unfortunately just a single line of output. When there are multiple lines, the indexes should still be considered contiguous. Eg, 5 mappings on output line 1 means the first mapping of line 2 is at index 5.)

This can easily be represented with just 6 numbers:

// Decoded format is using absolute indices, when encoding into VLQ make it relatives.
{
  scopes: [
    // First item is `mappings` index, second is `scopeNames` index
    // mapping 0 till the next mapping point at scopeName 0
    [0, 0],
    // mapping 6 till the next mapping point at scopeName 1
    [6, 1],
    // mapping 16 and all later mappings point at scopeName 0
    [16, 0],
  ],
  scopeNames: ['Global Code', 'foo'],
}

Encoding this into VLQ (using relative to previous values) gives us:

{
  scopes: "AA,MC,UD",
  scopeNames: ['Global Code', 'foo'],
}

This has the benefit of not requiring a binary search to find the correct region, since it can be extracted while decoding the mappings field.

* **Class names included**: Class member functions should indicate that they are members of classes. Although a common JavaScript convention has been to use `#` as a shorthand for `Class.prototype.member`, the inclusion of `#` as valid syntax for `private` field shorthand suggests that this should not be used. This spec recommends that the expansion `ClassName.functionName` should be used for instance functions, and `static ClassName.functionName` for static functions.
* **Anonymous callback name inference**: Callback functions are often anonymous, but the name of the function being invoked is typically interesting. In the following example, an interesting function name might be something along the lines of `anonymous function passed to myArray.map`:

```ts

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style Nit: Indenting the code blocks with 4 spaces will bring them in line with the <li>'s padding.

@travisleithead
Copy link
Contributor

@robpaveza has suggested that we close this PR and focus our collected efforts on tc39/ecma426#16 where more of a critical mass of relevant community members are engaging and where discussion on the details of the proposals is continuing.

@bmathwig bmathwig deleted the user/ropaveza/sourcemaps-v3.1 branch July 7, 2023 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants