Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Utf8JsonReader should read from JsonNode #106047

Open
ygoe opened this issue Aug 6, 2024 · 5 comments
Open

[API Proposal]: Utf8JsonReader should read from JsonNode #106047

ygoe opened this issue Aug 6, 2024 · 5 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Text.Json
Milestone

Comments

@ygoe
Copy link

ygoe commented Aug 6, 2024

Background and motivation

I have deserialisation code that reads data from JSON text, usually files on disk. It's using the Utf8JsonReader to read the source. It already exists and is somewhat complex so I don't want to duplicate the whole thing to also read from JsonNode. Another API that receives JSON-formatted messages from a network connection gives me the payload part as JsonNode because it has already read the whole message and needed to process parts of it.

So what I'm doing now is this:

string dataJson = message.PayloadData.ToJsonString();
byte[] dataJsonBytes = Encoding.UTF8.GetBytes(dataJson);
var reader = new Utf8JsonReader(dataJsonBytes);

This converts me the already parsed payload from JsonNode to serialised text, copies it over a few times for the encoding and finally I have something that Utf8JsonReader can read from.

Since Utf8JsonReader is a ref struct, I can't simply derive from it and change the behaviour. It needs to be the exact thing that can read from JsonNode as well internally.

API Proposal

using System.Text.Json.Nodes;

namespace System.Text.Json;

public ref struct Utf8JsonReader
{
    public Utf8JsonReader(JsonNode source);
}

API Usage

Save the text work and read tokens from the nodes directly. It should be easy since the tokens are already known. The nodes can simply be iterated and returned in the reader API.

var reader = new Utf8JsonReader(message.PayloadData);

Alternative Designs

Extra processing and memory allocation through the intermediate serialisation step. This is the current workaround.

Risks

I don't see any. It's a new API that has never existed before and nobody needs to use it if they don't need to.

@ygoe ygoe added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Aug 6, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Aug 6, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

@am11
Copy link
Member

am11 commented Aug 6, 2024

So what I'm doing now is this:

It can be improved with something like:

using System.Buffers;
using System.Text.Json;
using System.Text.Json.Nodes;

JsonNode node = GetNode();
Utf8JsonReader reader = node.AsUtf8JsonReader();

Use(reader);

public static class JsonNodeExtensions
{
    public static Utf8JsonReader AsUtf8JsonReader(this JsonNode payloadData)
    {
        ArrayBufferWriter<byte> bufferWriter = new();

        using Utf8JsonWriter writer = new(bufferWriter);
        payloadData.WriteTo(writer);

        return new(bufferWriter.WrittenMemory.Span);
    }
}

@eiriktsarpalis
Copy link
Member

So what I'm doing now is this:

string dataJson = message.PayloadData.ToJsonString();
byte[] dataJsonBytes = Encoding.UTF8.GetBytes(dataJson);
var reader = new Utf8JsonReader(dataJsonBytes);

This converts me the already parsed payload from JsonNode to serialised text, copies it over a few times for the encoding and finally I have something that Utf8JsonReader can read from.

This is a known restriction emanating from Utf8JsonReader being a ref struct. In fact, STJ itself is forced to perform this inefficient conversion in many places in its own implementation.

I don't believe extending the Utf8JsonReader type itself to support all possible sources of JSON (UTF-16, JsonNode, JsonDocument, etc.) is a sustainable solution: it isn't pay-for-play and even the name of the type itself suggests that it's restricted to UTF-8 JSON decoding responsibilities. Longer-term, we need a reader abstraction, e.g. an IJsonReader interface akin to the abstract JsonReader class in Json.NET.

Note that adding such an interface is dependent on the language adding support for ref struct interface implementations, but even then this would require duplicating most of the serialization API surface to support polymorphic readers. For starters, JsonConverter<T> would need to add a new method:

public partial class JsonConverter<T>
{
    public virtual T? Read<TReader>(ref TReader reader, Type type, JsonSerializerOptions options) where TReader : allow ref struct, IJsonReader => throw new NotImplementedException();
}

which converter authors would then need to implement. Needless to say, this has cost and complexity comparable to implementing a serializer from scratch.

cc @stephentoub @davidfowl

@eiriktsarpalis eiriktsarpalis removed the untriaged New issue has not been triaged by the area owner label Aug 7, 2024
@eiriktsarpalis eiriktsarpalis added this to the Future milestone Aug 7, 2024
@stephentoub
Copy link
Member

Note that adding such an interface is dependent on the language dotnet/csharplang#7608

For completeness, this exists now in C# 13 and .NET 9.

@ygoe
Copy link
Author

ygoe commented Aug 7, 2024

It can be improved with something like:

@am11 FYI, a writer.Flush() call is still needed after the WriteTo call, or the written bytes will be incomplete. I know this type of problem already. ;-) But thanks for the suggestion, it looks better than mine and I can live with it for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Text.Json
Projects
None yet
Development

No branches or pull requests

4 participants