Add read-values and write-values #53
read-values dispatches to the ReadValues protocol. It returns an iterator via an ObjectReader derived from the supplied mapper. The returned iterator is reified in a manner similar to Eduction, so it supports both reduction and sequence construction.

write-values relies on two protocols: WriteValues for the output destination (similar to WriteValue), and WriteAll for the type being written, which can be an array or an Iterable. It writes the array or Iterable to the destination via a SequenceWriter. Importantly, write-values disables automatic flushing on serialization to get good performance.
This adds support for reading and writing large sequences without materializing them in memory.
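As a rough illustration of the "reified in a manner similar to Eduction" idea, here is a minimal sketch in plain Clojure with no Jackson dependency. The name `iterator->reducible` is hypothetical and this is not necessarily how the PR implements it; it only shows how a `java.util.Iterator` can be wrapped so that both `reduce` and `seq` work on it without first building a collection.

```clojure
(import '(java.util Iterator))

(defn iterator->reducible
  "Hypothetical helper: wraps a java.util.Iterator so that it can be
  reduced directly or turned into a seq. The result is single-pass,
  because it shares the one underlying iterator."
  [^Iterator it]
  (when it
    (reify
      ;; Iterable lets `seq`, `doseq`, `map`, etc. walk the values.
      Iterable
      (iterator [_] it)
      ;; IReduceInit lets `reduce`, `into` and `transduce` consume the
      ;; iterator directly, without an intermediate seq.
      clojure.lang.IReduceInit
      (reduce [_ f init]
        (loop [acc init]
          (if (.hasNext it)
            (let [acc (f acc (.next it))]
              (if (reduced? acc)
                @acc            ; honour early termination
                (recur acc)))
            acc)))
      ;; Marker interface: treat the wrapper as a sequential collection.
      clojure.lang.Sequential)))

;; Usage: reduce over the values without materializing them first.
(def values (iterator->reducible (.iterator (java.util.ArrayList. [1 2 3 4 5]))))
(reduce + 0 values) ;; => 15
```

Because the wrapper hands out the same underlying iterator every time, it can only be consumed once; a second `reduce` over `values` would see no elements.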
@ikitommi do you know why the workflow failed when setting up the environment? The error is from
  [^Iterator iterator]
  (when iterator
    (reify
      Iterable
One thing I'm unsure of is implementing Iterable here. Is it a good idea to return an object which is both an Iterable and an Iterator?
Is the I recently implemented following example, where I used
Is it possible to use streaming in such cases? If not, maybe preventing the creation of vectors for big arrays is a separate issue.
@Deraen, thanks for giving it a look
I think these solutions are fundamentally different. I don't think lazy streaming could be generalized beyond 3, but partial laziness like your solution is an avenue to explore. These are separate issues, use cases and requirements, in my estimation.
Yeah, that makes sense. I'll try to look a bit more into case 2, to see whether anything can still be shared with this case. Before introducing a new API here, I want to understand if we could cover both cases with similar functions. Maybe I'll need to read the JsonNode impl, or profile memory use with readTree.
It is also possible to use stream reading to read values from an array inside an object: https://github.com/metosin/jsonista/compare/stream-testing One just needs to navigate the parser to the array start token first. I guess lazy-seq is doing some caching, so the example is not optimal, but I didn't quickly find a better way to call .readValueAs until the END_ARRAY token is found. I don't think we need to provide functions to move the parser, but maybe something to make it easier to efficiently read array values once the parser is in the correct position?
wrap-values is currently private, and that would be useful if a user wants to call e.g. What's the difference with wrap-values and
@Deraen not exposing a seq api over