Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

serialize top-level undefined as empty string? #1661

Open
warner opened this issue Sep 1, 2020 · 3 comments
Open

serialize top-level undefined as empty string? #1661

warner opened this issue Sep 1, 2020 · 3 comments
Assignees
Labels
enhancement New feature or request marshal package: marshal

Comments

@warner
Copy link
Member

warner commented Sep 1, 2020

What is the Problem Being Solved?

Given how common it is for our methods to resolve their results to undefined, it might be a useful performance improvement to have marshal represent a top-level undefined as simply an empty string. This would reduce the size of the serialized messages and the transcripts that hold them.

This is unambiguous (empty strings are not legal JSON), and would only be interpreted at the top level.

It looks like JSON.stringify([undefined]) is equivalent to JSON.stringify([null]), so perhaps it's not even possible to create a situation where undefined exists (and can survive a marshal roundtrip) anywhere except at the top level.

Description of the Design

marshal.serialize would start by comparing its argument against undefined, and if equal, it would return ''. Else, it would do the normal JSON.stringify with the replacers as usual.

marshal.unserialize would start by comparing its argument against an empty string. If equal, it would return undefined. Else it would do the normal JSON.parse.

We should probably hold off doing this until we have a clear idea of what it would save, with some benchmarks.

We might hold off doing this long enough for it to be obsoleted by a more-efficient binary marshalling scheme.

cc @erights @FUDCo

@warner warner added enhancement New feature or request marshal package: marshal labels Sep 1, 2020
@erights
Copy link
Member

erights commented Sep 1, 2020

Once we want to shrink the size of the marshaled form, we should do something very different than the current naive JSON serialization. There are small steps we can take there that would have a far bigger effect than this one. I would start by an analog of the v8 "hidden classes" optimization --- where we recognize the occurrence of a particular set of field names, give it an internal "hidden class" name the first time we see it, and then encode all matching records by referring to the class and listing the field values in an array. I suspect this plus a few other minor tweaks would get us close to binary serialization sizes.

I agree that we should get some benchmarks in place before we start fiddling with such experiments. Big unknown variable for all these uncompressed size improvements: How do they affect how well a generic fast compression compresses them?

@FUDCo
Copy link
Contributor

FUDCo commented Sep 1, 2020

Empty strings are legal in JSON. Am I missing something?

@warner
Copy link
Member Author

warner commented Sep 1, 2020

Empty strings are legal in JSON. Am I missing something?

I meant an empty string cannot be the output of JSON serialization, and is not accepted by a JSON parser:

$ node
Welcome to Node.js v14.8.0.
Type ".help" for more information.
> JSON.parse('')
Uncaught SyntaxError: Unexpected end of JSON input
>

so we could use it as a special case of the serialized form (the swingset "capdata" .body) without ambiguity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request marshal package: marshal
Projects
None yet
Development

No branches or pull requests

3 participants