feat(marshal): encode capData in 1 level of JSON #1804

dckc · 2023-10-04T00:49:07Z

refs: #1558 , Agoric/agoric-sdk#7999

Description

Encode capData to 1 level of JSON, much like #1558, but using lastIndexOf rather than a regex, and with fast-check testing.

Motivation: senders pay by the byte, so every redundant escape character has a cost. See:

wallet action, inter protocol data is wrapped in 2 JSON.stringify layers Agoric/agoric-sdk#7999

Security Considerations

Careful review for confusion vulnerabilities is in order.

Scaling Considerations

Double backslashes from nested JSON escaping cost storage space.
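A rough sketch of that cost, using a made-up offer body (the record contents and the `byteLength` helper are illustrative assumptions, not taken from this PR):

```javascript
// Hypothetical CapData record in the smallcaps style ("#"-prefixed JSON body).
const capData = {
  body: '#{"method":"executeOffer","offer":{"id":"bid-1","give":{"Bid":"+1000000"}}}',
  slots: ['board0123'],
};

// The body is already JSON text, so stringifying the record escapes every
// double quote inside it.
const nested = JSON.stringify(capData);

// A second stringify layer (e.g. embedding the record in another JSON
// message) escapes the escapes, which is where the double backslashes appear.
const wrapped = JSON.stringify(nested);

// One level: splice the body text (minus the "#") directly into the envelope.
const flat = `{"$body":${capData.body.slice(1)},"slots":${JSON.stringify(capData.slots)}}`;

const byteLength = s => new TextEncoder().encode(s).length;
console.log({
  flat: byteLength(flat),
  nested: byteLength(nested),
  wrapped: byteLength(wrapped),
});
```

For an ASCII body, each double quote costs one extra byte at the first layer and two more at the second, so the overhead grows with the number of quoted keys and strings.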

Documentation Considerations

This flatter format is easier to read, and so easier to document in some senses, though there's a mixing of levels that's somewhat subtle.

Testing Considerations

This has unit tests for specific examples plus fast-check tests. Whether I stated the property exactly right is worth careful review.
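For reviewers who want to poke at the property, here is a dependency-free approximation of a round-trip check. It is not the fast-check test from this PR: `toFlat`, `fromFlat`, the generator, and the fragment list are all hypothetical stand-ins, and the property stated (valid JSON out, exact body and slots back) is only my reading of what is intended.

```javascript
// Hypothetical encoder/decoder pair standing in for the functions under review.
const toFlat = ({ body, slots }) =>
  `{"$body":${body.slice(1)},"slots":${JSON.stringify(slots)}}`;
const fromFlat = json => {
  const at = json.lastIndexOf(',"slots":[');
  return {
    body: `#${json.slice('{"$body":'.length, at)}`,
    slots: JSON.parse(json.slice(at + ',"slots":'.length, -1)),
  };
};

// Deterministic pseudo-random integers in [0, n), so a failure can be replayed.
let seed = 42;
const rand = n => {
  seed = (seed * 1664525 + 1013904223) % 2 ** 32;
  return Math.floor((seed / 2 ** 32) * n);
};

// Fragments chosen to provoke delimiter confusion.
const pieces = ['"', '\\', ',"slots":[', ']}', '#', '{"$body":', 'a', ' ', '\n'];
const randString = () =>
  Array.from({ length: rand(6) }, () => pieces[rand(pieces.length)]).join('');

const randCapData = () => ({
  body: `#${JSON.stringify({ [randString()]: [randString(), rand(100)] })}`,
  slots: Array.from({ length: rand(4) }, () => randString()),
});

const checkRoundTrip = runs => {
  for (let i = 0; i < runs; i += 1) {
    const capData = randCapData();
    const flat = toFlat(capData);
    JSON.parse(flat); // property 1: the flat form is valid JSON
    const back = fromFlat(flat);
    // property 2: decoding restores the exact body text and slots
    if (
      back.body !== capData.body ||
      JSON.stringify(back.slots) !== JSON.stringify(capData.slots)
    ) {
      throw Error(`round trip failed for ${flat}`);
    }
  }
  return runs;
};

console.log(checkRoundTrip(1000));
```

The generator only produces string slots and well-formed smallcaps-style bodies, so it says nothing about legacy bodies or malformed input.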

Upgrade Considerations

This is a DRAFT, pending:

figure out the whole upgrade story

cc @erights @gibson042

mhofman

I would so much rather we properly split encoding from serialization for marshal, as discussed in #1478.

Also I would really prefer if we could find a way to partially parse JSON instead of relying on undocumented serialization constraints (body first, no spaces, etc.). I remember having a discussion with @gibson042 about what API we would need from JS to allow this.

mhofman · 2023-10-04T01:04:25Z

packages/marshal/src/capDataJSON.js

+  assert(Array.isArray(slots));
+  const slotj = JSON.stringify(slots);
+  slotj.indexOf(':[') < 0 || Fail`expected simple slots`;
+  const body1 = body.replace(/^#/, '');


Why not check body[0] === '#' and do body.slice(1)? I think that's a lot more efficient.

This seems to assume that the argument is a CapData record whose body is a "#"-prefixed JSON serialization of SmallCaps-encoded data, which would need a lot more explanation than appears here (and a better name).

body.replace(/^#/, '') handles both smallCaps and qclass, no? (I haven't tested it, though).

Why is .slice(1) significantly more efficient?

body.replace(/^#/, '') handles both smallCaps and qclass, no? (I haven't tested it, though).

I guess that depends upon what this function is expected to return. Regardless of the answer to that, though, CapData like { body: `{"@qclass":"bigint","digits":"0"}`, slots: [] } and { body: `#"+0"`, slots: [] } represent exactly the same data (0n) but would have distinct String('{"$body":{"@qclass":"bigint","digits":"0"},"slots":[]}') and String('{"$body":"+0","slots":[]}') return values (respectively) from the current implementation—which seems like a problem because there's no remaining signal differentiating smallcaps from the legacy encoding.
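To make the lost signal concrete, here is a small sketch. `flatten` is a hypothetical stand-in for the function under review, and the third record assumes the legacy encoding represents the plain string "+0" as the JSON text `"+0"`:

```javascript
// Hypothetical stand-in for the flattening step under review: strip an
// optional "#" and splice the body into a one-level JSON envelope.
const flatten = ({ body, slots }) =>
  `{"$body":${body.replace(/^#/, '')},"slots":${JSON.stringify(slots)}}`;

// Two encodings of the same data (0n), as in the comment above.
const legacyBigint = { body: '{"@qclass":"bigint","digits":"0"}', slots: [] };
const smallcapsBigint = { body: '#"+0"', slots: [] };

console.log(flatten(legacyBigint));
// {"$body":{"@qclass":"bigint","digits":"0"},"slots":[]}
console.log(flatten(smallcapsBigint));
// {"$body":"+0","slots":[]}

// Assumption: a legacy body for the *string* "+0" is the JSON text "+0".
// It flattens to exactly the same text as the smallcaps bigint, so a decoder
// cannot tell from the flat form alone whether to restore the "#".
const legacyString = { body: '"+0"', slots: [] };
console.log(flatten(legacyString) === flatten(smallcapsBigint)); // true
```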

Why is .slice(1) significantly more efficient?

The answer is implementation-specific, but basically comes down to being zero-copy.

$ esbench --eshost-option '-h V8,*XS*' \
    'const unprefixed="a".repeat(1000), prefixed = "#" + unprefixed' '{
      "unprefixed.replace": `result = unprefixed.replace(/^#/, "")`,
      "unprefixed.slice": `result = unprefixed.startsWith("#") ? unprefixed.slice(1) : unprefixed`,
      "prefixed.replace": `result = prefixed.replace(/^#/, "")`,
      "prefixed.slice": `result = prefixed.startsWith("#") ? prefixed.slice(1) : prefixed`,
    }'
#### Moddable XS
unprefixed.replace: 0.06 ops/ms
unprefixed.slice: 3.56 ops/ms
prefixed.replace: 0.07 ops/ms
prefixed.slice: 0.95 ops/ms

#### V8
unprefixed.replace: 29.41 ops/ms
unprefixed.slice: 100.00 ops/ms
prefixed.replace: 19.61 ops/ms
prefixed.slice: 62.50 ops/ms

...

Why is .slice(1) significantly more efficient?

The answer is implementation-specific, but basically comes down to being zero-copy.

I guess I trained my regex intuitions in perl where such things are optimized out the wazoo.

Thanks for the esbench details.
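The check-then-slice approach suggested in this thread might look like the following. The helper name is hypothetical, and rejecting unprefixed bodies (rather than passing them through) is one possible answer to the legacy-encoding question, not something this PR has settled:

```javascript
// Hypothetical helper: require the smallcaps "#" marker and drop it with
// slice instead of a regex replace.
const stripSmallcapsPrefix = body => {
  if (typeof body !== 'string' || body[0] !== '#') {
    throw TypeError('expected a "#"-prefixed smallcaps body');
  }
  return body.slice(1);
};

console.log(stripSmallcapsPrefix('#"+0"')); // "+0" (with its quotes)

try {
  // A legacy (qclass) body has no "#", so it is refused rather than
  // silently flattened into something indistinguishable from smallcaps.
  stripSmallcapsPrefix('{"@qclass":"bigint","digits":"0"}');
} catch (err) {
  console.log(err.message);
}
```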

mhofman · 2023-10-04T01:06:07Z

packages/marshal/src/capDataJSON.js

+  slotj.indexOf(':[') < 0 || Fail`expected simple slots`;
+  const body1 = body.replace(/^#/, '');
+  assertJSON(body1);
+  const json = `{"$body":${body1},"slots":${slotj}}`;


In #1478 (comment) I suggest body#

mhofman · 2023-10-04T01:07:06Z

packages/marshal/src/capDataJSON.js

+
+export const JSONToCapData = json => {
+  assert.typeof(json, 'string');
+  json.startsWith('{"$body":') || Fail`expected $body`;


I guess this only works when this body is first in the serialized JSON, not second?
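A sketch of why key order matters here. `jsonToCapData` below is a hypothetical reconstruction from the two lines visible in the diff plus the `lastIndexOf` approach named in the description; the real function may differ:

```javascript
const BODY_MARK = '{"$body":';
const SLOTS_KEY = ',"slots":';

// Hypothetical decoder: find the body and slots by text position alone,
// without parsing the body.
const jsonToCapData = json => {
  if (typeof json !== 'string') throw TypeError('expected a string');
  if (!json.startsWith(BODY_MARK)) throw Error('expected $body first');
  const at = json.lastIndexOf(`${SLOTS_KEY}[`);
  if (at < 0 || !json.endsWith(']}')) throw Error('expected slots last');
  return {
    body: `#${json.slice(BODY_MARK.length, at)}`,
    slots: JSON.parse(json.slice(at + SLOTS_KEY.length, -1)),
  };
};

console.log(jsonToCapData('{"$body":"+0","slots":["a"]}'));

// Both of these are equivalent JSON to a standard parser, but the
// text-position approach rejects them.
for (const variant of ['{"slots":["a"],"$body":"+0"}', '{ "$body": "+0", "slots": ["a"] }']) {
  try {
    jsonToCapData(variant);
  } catch (err) {
    console.log(`${variant} -> ${err.message}`);
  }
}
```

So the format is only decodable when the producer emits exactly this key order with no insignificant whitespace, which is a constraint JSON itself does not impose.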

gibson042

This is just far too brittle for comfort, and doesn't feel like the right way to solve a "too much escaping" problem (assuming that is in fact what motivates it).

I would so much rather we properly split encoding from serialization for marshal, as discussed in #1478.

I agree.

Also I would really prefer if we could find a way to partially parse JSON instead of relying on undocumented serialization constraints (body first, no spaces, etc.). I remember having a discussion with @gibson042 about what API we would need from JS to allow this.

Yeah, but I don't know if we wrote it down (https://github.com/Agoric/agoric-private/issues/31#issuecomment-1494853056 is related but definitely distinct, as is Go-style hybrid decoding). At any rate, it's not difficult, although it would require going beyond the standard library.

gibson042 · 2023-10-04T03:57:05Z

packages/marshal/src/capDataJSON.js

+  assert(Array.isArray(slots));
+  const slotj = JSON.stringify(slots);
+  slotj.indexOf(':[') < 0 || Fail`expected simple slots`;
+  const body1 = body.replace(/^#/, '');


This seems to assume that the argument is a CapData record whose body is a "#"-prefixed JSON serialization of SmallCaps-encoded data, which would need a lot more explanation than appears here (and a better name).

dckc added 2 commits October 3, 2023 19:04

feat(marshal): encode capData in 1 level of JSON

bc1b5fb

test: include example offer from agoric #7999

0667b1d

dckc mentioned this pull request Oct 4, 2023

wallet action, inter protocol data is wrapped in 2 JSON.stringify layers Agoric/agoric-sdk#7999

Open

mhofman reviewed Oct 4, 2023

gibson042 reviewed Oct 4, 2023

gibson042 mentioned this pull request Apr 1, 2024

Create a cross-implementation microbenchmarking tool #2197

Open
