Skip to content

Latest commit

 

History

History
129 lines (68 loc) · 8.49 KB

10-06.md

File metadata and controls

129 lines (68 loc) · 8.49 KB

October 6, 2020 Incubator Call Notes (JSON.parse)

Attendees:

  • Shu-yu Guo (SYG)
  • Richard Gibson (RGN)
  • Jordan Harband (JHD)
  • Iain Ireland (IID)
  • Caroline Cullen (CUN)

Recap

(Presents https://docs.google.com/presentation/d/1MGJhUvrWl4dE4otjUm8jXDrhaZLh9g7dnasnfK-VyZg/edit?usp=sharing)

Keys

SYG: Clarifying question. Is the key like breadcrumbs? But is it unique?

RGN: Yes, it's like breadcrumbs, and yes it is unique. There’s an open question about whether to include the leading empty string key, but that doesn’t affect uniqueness.

Exposing source text

RGN: The most controversial so far is which values we expose source text for. Is it limited to primitives or do we expand it all the way out? RGN: If we're parsing { "foo":"bar", "n": 42}, when the reviver gets called, it'll first get the "foo":"bar" pair, then the "n":42 pair, then the entire object. The question is do we expose the source text for the entire object. There's concern from Michael Ficarra that we don't want users to "smuggle" info via the source text like whitespace. I'm not sure how much of a concern this actually is. Want to understand the viewpoints around this before deciding how to proceed.

IID: Are there use cases where you want the source text for a non-primitive anyway? I'm struggling with that a bit.

RGN: Yeah, the biggest argument I can think of in favor for it is we're just providing a kind of consistency.

JHD: I use duplicate keys in JSON for comments, so with this proposal how would I access the duplicate key? I certainly won't block this proposal on this weird use case, but it's one I have.

RGN: Interesting. I like the thought of that, but providing the whole string for the entire object literal doesn't seem to be that useful for this use case?

JHD: No, it's not ergonomic but it makes it possible. An alternative is to provide the entries instead of the source text.

RGN: With primitives, the source text is guaranteed to correspond with the in-memory structure. If we're doing it for the values that wrap the primitives, that no longer holds—primitives have been replaced, but the source text is static and reflects their original source.

SYG: I'm leaning primitives only because I don't see any use cases for the source text for the wrapper objects.

IID: Another thing is the first thing you do when you get a non-primitive source text is you have to parse it again. And… why are we here? I also softly lean towards primitives only.

SYG: IID's point is a good one. JSON parsing has extra rules you'd have to know, like duplicate keys are resolved in favor of the last one.

RGN: Sounds like a resolution actually.

Exposing positions

RGN: Exposing positions may help the pseudo-comment use case, and also lets you get at the raw key before normalization into a regular string. My inclination, in line with the previous topic I'm leaning against since these use cases are pretty weak. I'm calling for reasons to change that position.

IID: I don't have any.

RGN: Anyone think they should be included in the first pass of this proposal?

IID: I don't think they should be included without compelling use.

JHD: What would happen with lone surrogates? Is that a situation where you might care?

RGN: This is a situation you have now. (Demonstrates). Lone surrogates is the only plausible use case we can come up with, and we don't need positional information to handle it.

SYG: The usual use case for positional information is printing a caret for e.g. parse errors. That use case seems overkill here, right?

RGN: Yeah.

SYG: If anything on error you probably want to print the keys breadcrumbs, not print a caret into some 2MB input string.

Exposing keys

RGN: Now we segue into the keys. I think this is useful. Can someone give me reasons to change this stance?

IID: From internal Moz discussions, the more power you're giving people to parse things that are not JSON, the more it fragments. Having specific control makes it more likely that people will be writing something particularly weird and non-interoperable JSON. Whether that's a problem is unclear, we don't feel particularly strongly about this but not wanting people to get too fancy would be a reason to hold back on this for now.

JHD: One thing that's nice about this with this approach keys.join is that it mirrors the JQ syntax. It'd be really intuitive to people. If you had keys information, you might be able to use the keys as a tag. It gives you more power.

RGN: The use case I think about is that if I have a { "type": "bigint", "value": 42 } that's repeated somewhere else inside another object like { "tags": { "type": "bigint", "value": 42 } }, I want to upgrade the former but preserve the latter. Keys lets me do this. This is not core to the proposal but is another papercut that could be addressed by it.

IID: Having given the Mozilla feedback to this, this is still probably fine.

SYG: This is a bunch of extra memory allocations for the keys array that the frontend is unlikely to be able to optimize out. Are there sites that JSON.parse with a reviver function that parse giant JSON blobs that might suddenly break due to OOMs because of the extra spike in memory?

IID: I'm inclined to keep it out given the implementation concerns.

SYG: I'm actually fine with Stage 3 with this if we go into it knowing this is an implementation concern we need to figure out.

IID: I think the data gathering should be done before Stage 3.

RGN: I'm fine with leaving it out.

Serialization

RGN: Serialization seems important to many people.

JHD: I don't know how to do it but I'm one of the people who think it's very important to do, e.g. being able to produce Twitter tweet JSON containing an id that overflows Number precision.

SYG: This is part of the proposal that proposes to extend JSON.stringify as well as JSON.parse?

RGN: Roundtripping seems so linked to being able to parse it that committee consensus was that serialization should be part of this proposal.

IID: The use case seems value but it's unclear it needs to be part of the same proposal. Particularly since it doesn't seem like we have a plan on how to accomplish it. The only way I can see wanting to put stringify into this proposal is if it required us to do something in parse that we're already doing. We should solve the stringify issue but at a different proposal.

JHD: Fragmentation makes feature detection harder.

IID: Feature detection is a good point.

SYG: Lack of vision a common criticism of TC39. I'd rather we don't overincrementalize and lose cohesion of design even at per-proposal level. Why is the solution so unclear here anyhow?

RGN: There's nothing we can return in the replacer that would let us return digits that isn't treated in an existing path. One idea is if an object is returned with a distinguished symbol on it, then it can be subject to a new code path that doesn't exist now. That works but feels clumsy.

JHD: An alternative to wrapping the thing in an object is there's currently a toJSON method. We can add a separate Symbol-keyed method to supplant toJSON

RGN: Instead of a well-known global symbol, the replacer function could be provided with something unique-per-invocation. Within this group, is there a preference between the two patterns?

JHD: Unique to the invocation means less likely people will use it in unrelated use cases. Doesn't even have to be unique per stringify call, just not having it on the global would prevent misuse.

RGN: Is implementer concern for per-stringify unique symbol?

SYG: Doesn't seem like it to me, it's constant.

IID: Are there concerns where the wrong symbol gets used and it's confusing?

RGN: You'd have to be doing something really weird. If you cached it… but I can't come up a reason for why you would cache it.

IID: I'm imagining that somebody's doing something fancy and they've built up the object they want to return but they're caching that object instead of constructing new ones. Just wondering about the capacity for deeply confusing bugs.

RGN: The reason for doing it the unique per-stringify way is what if somebody gave you an object with the global symbol on it already? Avoiding that seems like a good idea.

IID: I don't mind Jordan's idea of having a global symbol that's just not globally accessible.

JHD: Comes down to do we care about the same symbol working across multiple stringify calls?

RGN: If there's no implementation concern the unique per stringify invocation is probably the right approach. If we get to stage 3 and if there's concern it can always be changed.