encoding/json: Decoder internally buffers full input #11046
Comments
I don't believe this is true. Specifically, I don't believe it allocates any memory for fields being discarded. If you think it does, please explain why you think that or point to the allocation. Thanks.
I wrote a small test that writes a JSON file of Data and then (as a new process) reads it into SmallData and prints runtime.MemStats.TotalAlloc: http://play.golang.org/p/5CB3FUL86m I ran it with --make=10 to --make=1e8, stepping by powers of ten. [resulting plot not preserved]
Pprof says 99% of bytes are allocated here.
The memory here is for holding the JSON input as read in from the file, not for decoding unused fields.
The title says "Decoder internally buffers full input" but it might be better phrased as "Decoder buffers an entire value at a time". We introduced Decoder.Token last cycle, so it is technically possible for the user to use that and avoid buffering a whole value at once, though admittedly that would take a fair amount of code. It would also be possible for the decoder to stop decoding a whole value at once and instead read from the stream into the target structure incrementally, but that would be a big refactoring of the decoder. Is that what this bug requires, or is there a simpler option I'm overlooking?
The new Decoder.Token should let people build incremental parsers customized to a particular use case. We cannot change the default behavior: right now, if encoding/json consumes a very large but ultimately malformed JSON value, nothing is written to the destination. Incremental decoding would change those semantics by writing to the destination before realizing the value was malformed. It might be possible to add a different opt-in mode to the Decoder, but certainly not at this point in the Go 1.7 cycle.
The corresponding change for an Encoder is #7872.
Related to #14140, which mentions some possible API changes.
Hi all, we kicked off a discussion for a possible "encoding/json/v2" package that addresses the spirit of this proposal. |
When using the JSON package, if I encode a struct like [struct definition not preserved]
and then decode it into [smaller struct definition not preserved]
it will still allocate memory for the list of names, even though it just gets thrown away. This becomes an annoyance when I have several multigigabyte JSON files like this. It would be neat if the JSON parser could identify which fields it cares about, or somehow be told which fields to ignore, and chuck them.