Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Support Custom Data on JsonSerializerOptions #71718

Open
stevejgordon opened this issue Jul 6, 2022 · 23 comments
Open

[API Proposal]: Support Custom Data on JsonSerializerOptions #71718

stevejgordon opened this issue Jul 6, 2022 · 23 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Text.Json
Milestone

Comments

@stevejgordon
Copy link
Contributor

stevejgordon commented Jul 6, 2022

Background and motivation

While working on a client library for Elasticsearch, making heavy use of System.Text.Json, a common issue I have run into is the need to access some additional state inside custom converters. The Elasticsearch client has a settings class ElasticsearchClientSettings which includes state used to infer values for certain types at runtime.

Currently, this forces an instance of each converter which requires access to the settings to be created in advance and added to the JsonSerializerOptions (example). These converters can then accept an ElasticsearchClientSettings instance via their constructor. Some of these instances may never be needed if the types they (de)serialize are not used by consumers of our library. Additionally, we have some converters which we code generate which makes creating such instances and adding them to the JsonSerializerOptions more complicated.

I have come up with a rather nasty workaround where I register a converter purely to hold state, which can then be retrieved from the options to gain access to the settings.

internal sealed class ExtraSerializationData : JsonConverter<ExtraSerializationData>
{
    public ExtraSerializationData(IElasticsearchClientSettings settings) => Settings = settings;

    public IElasticsearchClientSettings Settings { get; }

    public override ExtraSerializationData? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) => throw new NotImplementedException();
    public override void Write(Utf8JsonWriter writer, ExtraSerializationData value, JsonSerializerOptions options) => throw new NotImplementedException();
}

This can then be accessed in the Write method of a custom converter:

public override void Write(Utf8JsonWriter writer, SortOptions value, JsonSerializerOptions options)
{
    writer.WriteStartObject();

    if (value.AdditionalPropertyName is IUrlParameter urlParameter)
    {
        var extraData = options.GetConverter(typeof(ExtraSerializationData)) as ExtraSerializationData;
        var propertyName = urlParameter.GetString(extraData.Settings);
        writer.WritePropertyName(propertyName);
    }
    else
    {
        throw new JsonException();
    }
    
    JsonSerializer.Serialize(writer, value.Variant, value.Variant.GetType(), options);
    writer.WriteEndObject();
}

API Proposal

I would like to propose adding support to attach extra custom data (a property bag) to JsonSerializerOptions, to be accessible inside the Read and Write methods of custom converters derived from JsonConveter<T> by accessing the JsonSerializerOptions.

public sealed class JsonSerializerOptions
{
    public IReadOnlyDictionary<string, object> CustomData { get; set; } = new Dictionary<string, object>(0); // some cached default empty value
}

This adds a property to act as a property bag for user-provided data. I'm proposing this be exposed as IReadOnlyDictionary to avoid new items being added/modified after the options become immutable. It could be IDictionary to support scenarios where some converters need to add data for subsequent use, although that sounds risky. I imagine this should be initialised with a cached empty dictionary in cases where the user does not explicitly set this.

API Usage

Define options with custom data:

var options = new JsonSerializerOptions
{
    CustomData = new Dictionary<string, object> {{ "settings", new Settings() }} 
}

Access within a custom converter:

public override void Write(Utf8JsonWriter writer, SortOptions value, JsonSerializerOptions options)
{
    writer.WriteStartObject();

    if (value.AdditionalPropertyName is IUrlParameter urlParameter && options.CustomData.TryGetValue("settings", out var settings))
    {
        var propertyName = urlParameter.GetString(settings);
        writer.WritePropertyName(propertyName);
    }
    else
    {
        throw new JsonException();
    }
    
    JsonSerializer.Serialize(writer, value.Variant, value.Variant.GetType(), options);
    writer.WriteEndObject();
}

Alternative Designs

If the JsonSerializerOptions were unsealed, that could also support my scenario. We could define a subclass of JsonSerializerOptions with extra properties. Inside our converters that require that data, they could try casting to the derived type and access any necessary extra properties.

- public sealed class JsonSerializerOptions
+ public class JsonSerializerOptions
{
}

Risks

This could potentially be misused to hold mutable types which could cause thread-safety issues. Documentation could be added to guide the intended usage.

@stevejgordon stevejgordon added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jul 6, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 6, 2022
@ghost
Copy link

ghost commented Jul 6, 2022

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

While working on a client library for Elasticsearch, making heavy use of System.Text.Json, a common issue I have run into is the need to access some additional state inside custom converters. The Elasticsearch client has a settings class ElasticsearchClientSettings which includes state used to infer values for certain types at runtime.

Currently, this forces an instance of each converter which requires access to the settings to be created in advance and added to the JsonSerializerOptions (example). These converters can then accept an ElasticsearchClientSettings instance via their constructor. Some of these instances may never be needed if the types they (de)serialize are not used by consumers of our library. Additionally, we have some converters which we code generate which makes creating such instances and adding them to the JsonSerializerOptions more complicated.

I have come up with a rather nasty workaround where I register a converter purely to hold state, which can then be retrieved from the options to gain access to the settings.

internal sealed class ExtraSerializationData : JsonConverter<ExtraSerializationData>
{
	public ExtraSerializationData(IElasticsearchClientSettings settings) => Settings = settings;

	public IElasticsearchClientSettings Settings { get; }

	public override ExtraSerializationData? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) => throw new NotImplementedException();
	public override void Write(Utf8JsonWriter writer, ExtraSerializationData value, JsonSerializerOptions options) => throw new NotImplementedException();
}

This can then be accessed in the Write method of a custom converter:

public override void Write(Utf8JsonWriter writer, SortOptions value, JsonSerializerOptions options)
{
    writer.WriteStartObject();

    if (value.AdditionalPropertyName is IUrlParameter urlParameter)
    {
        var extraData = options.GetConverter(typeof(ExtraSerializationData)) as ExtraSerializationData;
        var propertyName = urlParameter.GetString(extraData.Settings);
        writer.WritePropertyName(propertyName);
    }
    else
    {
        throw new JsonException();
    }
    
    JsonSerializer.Serialize(writer, value.Variant, value.Variant.GetType(), options);
    writer.WriteEndObject();
}

API Proposal

I would like to propose adding support to attach extra custom data (a property bag) to JsonSerializerOptions, to be accessible inside the Read and Write methods of custom converters derived from JsonConveter<T> by accessing the JsonSerializerOptions.

public sealed class JsonSerializerOptions
{
    public IReadOnlyDictionary<string, object> CustomData { get; set; } = new Dictionary<string, object>(0); // some cached default empty value
}

This adds a property to act as a property bag for user-provided data. I'm proposing this be exposed as IReadOnlyDictionary to avoid new items being added/modified after the options become immutable. It could be IDictionary to support scenarios where some converters need to add data for subsequent use, although that sounds risky. I imagine this should be initialised with a cached empty dictionary in cases where the user does not explicitly set this.

API Usage

Define options with custom data:

var options = new JsonSerializerOptions
{
    CustomData = new Dictionary<string, object> {{ "settings", new Settings() }} 
}

Access within a custom converter:

public override void Write(Utf8JsonWriter writer, SortOptions value, JsonSerializerOptions options)
{
    writer.WriteStartObject();

    if (value.AdditionalPropertyName is IUrlParameter urlParameter && options.CustomData.TryGetValue("settings", out var settings))
    {
        var propertyName = urlParameter.GetString(settings);
        writer.WritePropertyName(propertyName);
    }
    else
    {
        throw new JsonException();
    }
    
    JsonSerializer.Serialize(writer, value.Variant, value.Variant.GetType(), options);
    writer.WriteEndObject();
}

Alternative Designs

Alternative Design

If the JsonSerializerOptions were unsealed, that could also support my scenario. We could define a subclass of JsonSerializerOptions with extra properties. Inside our converters that require that data, they could try casting to the derived type and access any necessary extra properties.

Risks

This could potentially be misused to hold mutable types which could cause thread-safety issues. Documentation could be added to guide the intended usage.

Author: stevejgordon
Assignees: -
Labels:

api-suggestion, area-System.Text.Json

Milestone: -

@eiriktsarpalis
Copy link
Member

It's a reasonable requirement, but I don't think JsonSerializerOptions is the right type to attach state scoped to a particular serialization operation. As I think you're already pointing out, JsonSerializerOptions is meant as a singleton configuration/caching context, as such attaching state there would make it susceptible to race conditions (or encourage the anti-pattern of creating a new options instance per serialization operation).

Related to #63795, we've been thinking about exposing async converter primitives which would also involve exposing the internal WriteState/ReadState types. It should be possible to let users attach custom state to these types a la StreamingContext.

@eiriktsarpalis eiriktsarpalis removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2022
@eiriktsarpalis eiriktsarpalis added this to the 8.0.0 milestone Jul 6, 2022
@stevejgordon
Copy link
Contributor Author

Thanks for the feedback @eiriktsarpalis. To be clear, in our use case, our ElasticsearchClientSettings are also immutable and required across many converters for many serialization operations for the lifetime of a client instance (which is reused for the life of consumer apps). We create and use a single instance of JsonSerializerOptions.

There's already potential for misuse if developers create new instances of the options per serialization operation. I'm not sure this promotes that much more for the common scenario of attaching additional context needed for many/all serialization in converters.

I'll have to dig into the proposal for async converters to understand if custom state there would meet our need. Would we replace the existing converters with async ones? Would those apply to all scenarios, even synchronous serializer methods?

@eiriktsarpalis
Copy link
Member

eiriktsarpalis commented Jul 7, 2022

Would we replace the existing converters with async ones?

It would most likely involve exposing a couple of virtual methods in JsonConverter<T> that are currently marked internal. We haven't produced any prototypes yet, so that plan might be subject to change.

Would those apply to all scenarios, even synchronous serializer methods?

Yes, the converter methods that pass serialization state are used in both synchronous and asynchronous serialization. It just happens that async converters are implemented by passing their state machines via the serialization state objects.

@davidfowl
Copy link
Member

It should be possible to let users attach custom state to these types a la StreamingContext.

Do you have an example of what this would look like?

@eiriktsarpalis
Copy link
Member

eiriktsarpalis commented Jul 7, 2022

Do you have an example of what this would look like?

Feature needs prototyping before we can propose something concrete, but roughly speaking it might look as follows:

var writer = new Utf8JsonWriter();
var state = new WriteState { UserState = new MyState() };
string json = JsonSerializer.Serialize(writer, new Foo(), ref state);

public class MyConverter : JsonResumableConverter<Foo>
{
    public bool TryWrite(Utf8JsonWriter writer, Foo value,  JsonSerializerOptions options, ref WriteState state)
    {
         var myState = (MyState)state.UserState;
         /* serialization logic using state */
    }
}

where JsonResumableConverter<T> would be roughly equivalent to the existing internal class of the same name.

@davidfowl
Copy link
Member

What happens when you don't control the callsite responsible for calling Serialize/Deserailize (like when using a framework)? Attaching this to options seems like the logical place since it:

  • Already flows everywhere
  • Is user defined options state for these calls.

@eiriktsarpalis
Copy link
Member

That might be an option, although given the nature of JsonSerializerOptions I would guess a factory (i.e. Func<object>) might be more appropriate. It would still be less flexible than an argument being passed to serialization methods directly though.

@eiriktsarpalis
Copy link
Member

@stevejgordon in the meantime, have you considered using a ConditionalWeakTable to dynamically attach data to options instances?

@gregsdennis
Copy link
Contributor

I've often wanted the inheritance model, using JsonSerializerOptions as a base class.

@stevejgordon
Copy link
Contributor Author

stevejgordon commented Jul 8, 2022

@eiriktsarpalis I hadn't considered that, only because I was completely unaware of its existence! :-) Looks like it could work for our requirement based on a small sample I put together this morning. Is this essentially how you'd expect a solution to look in such a case?

using System.Runtime.CompilerServices;
using System.Text.Json;
using System.Text.Json.Serialization;
using Test;

var clientOne = new Client(new ClientSettings { Value = 1 });
var clientTwo = new Client(new ClientSettings { Value = 2 });

var data = new MyTypeToSerialize { Name = "Test" };

var resultOne = clientOne.Serializer.Serialize(data);
var resultTwo = clientTwo.Serializer.Serialize(data);

Console.WriteLine(resultOne);
Console.WriteLine(resultTwo);
Console.ReadKey();

namespace Test
{
    public class Client
    {
        internal static ConditionalWeakTable<JsonSerializerOptions, ClientSettings> SettingsTable { get; } = new();

        public Client(ClientSettings settings)
        {
            Settings = settings;
            Serializer = new MySerializer();
            SettingsTable.Add(Serializer.Options, Settings);
        }

        public ClientSettings Settings { get; }
        public MySerializer Serializer { get; }
    }

    public class ClientSettings
    {
        public int Value { get; init; }
    }

    public class MySerializer
    {
        public MySerializer() => Options = new JsonSerializerOptions();

        public JsonSerializerOptions Options { get; }

        public string Serialize<T>(T value) => JsonSerializer.Serialize(value, Options);
    }

    [JsonConverter(typeof(MyTypeToSerializeConverter))]
    public class MyTypeToSerialize
    {
        public string? Name { get; set; }
    }

    internal sealed class MyTypeToSerializeConverter : JsonConverter<MyTypeToSerialize>
    {
        public override MyTypeToSerialize? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) => 
            throw new NotImplementedException();

        public override void Write(Utf8JsonWriter writer, MyTypeToSerialize value, JsonSerializerOptions options)
        {
            writer.WriteStartObject();
            writer.WritePropertyName("name");
            writer.WriteStringValue(value.Name ?? string.Empty);

            if (Client.SettingsTable.TryGetValue(options, out var settings))
            {
                writer.WritePropertyName("settingsValue");
                writer.WriteNumberValue(settings.Value);
            }

            writer.WriteEndObject();
        }
    }
}

Output:

{"name":"Test","settingsValue":1}
{"name":"Test","settingsValue":2}

@eiriktsarpalis
Copy link
Member

That's it more or less. I would probably also hide some of the implementation details behind extension methods:

    public static class Client
    {
        private static ConditionalWeakTable<JsonSerializerOptions, ClientSettings> SettingsTable { get; } = new();

        public static ClientSettings? GetClientSettings(this JsonSerializerOptions options)
             => SettingsTable.TryGetValue(options, out var settings) ? settings : null;

        public static void SetClientSettings(this JsonSerializerOptions options, ClientSettings value) => SettingsTable.Add(options, value);
    }

@stevejgordon
Copy link
Contributor Author

stevejgordon commented Jul 8, 2022

Cool, thanks for the suggestion @eiriktsarpalis. Our client can't be static though. I'll likely switch to something based on ConditionalWeakTable because it's nicer than the hack I've done so far. I still think this proposal or one of the alternatives on the table would be useful as ConditionalWeakTable is certainly not an obvious workaround. This is a good candidate for my next blog post!

@eiriktsarpalis
Copy link
Member

Related to #59892 (comment). While the callback interfaces do not handle state, we might want to consider extending the callbacks in the contract model so that any user-defined state can be handled appropriately.

@krwq
Copy link
Member

krwq commented Sep 28, 2022

Closing this in favor of #63795

@krwq krwq closed this as completed Sep 28, 2022
@stevejgordon
Copy link
Contributor Author

@eiriktsarpalis Finally completed the blog post which describes the path to the final solution for my use case.

@thomaslevesque
Copy link
Member

Related: #34773

@thomaslevesque
Copy link
Member

thomaslevesque commented Oct 19, 2022

Closing this in favor of #63795

I don't think #63795 addresses the requirements described in this issue, as explained in this comment.

It's a reasonable requirement, but I don't think JsonSerializerOptions is the right type to attach state scoped to a particular serialization operation.

Maybe not for a particular serialization operation, but what about "all serialization operations using this JsonSerializationOptions instance"? That's a legitimate use case, and currently the only ways to do it are cumbersome, to say the least (see @stevejgordon's blog post, or mine). The ConditionalWeakTable approach is a bit better, but still arguably a hack...

@eiriktsarpalis eiriktsarpalis modified the milestones: 8.0.0, Future Oct 19, 2022
@eiriktsarpalis
Copy link
Member

@thomaslevesque thanks, I hadn't read your article before. It seems like the primary use case preventing you from injecting state directly to converters is JsonConverterAttribute annotations? Is there anything else I'm missing?

I would guess that unsealing JsonSerializerOptions would not address your use case, so presumably something like the following would work?

public class JsonSerializerOptions
{
    public object? UserData { get; set; } // As with other options properties, gets locked once used
}

This should work fine assuming the app author owns all custom converters, but we might want to consider a design where multiple converter designs interoperate:

public class JsonSerializerOptions
{
    public IDictionary<string, object?> UserData { get; } // As with other options properties, gets locked once used
}

@eiriktsarpalis
Copy link
Member

@stevejgordon you might want to consider using a JsonConverterFactory in your examples to avoid the overhead of ConditionalWeakTable lookup on every serialization operation:

internal sealed class WildcardQueryConverter : JsonConverterFactory
{
     public override bool CanConvert(Type type) => type == typeof(WildcardQuery);
     public override JsonConverter? CreateConverter(Type type, JsonSerializerOptions options)
     {
           options.TryGetClientSettings(out var settings);
           // same implementation as the original converter accepting `settings` parametrically
           // the new instance is scoped to the current `JsonSerializerOptions` so it is safe to hardcode.
           return new WildcardQueryConverterImplementation(settings);
     }
}

@stevejgordon
Copy link
Contributor Author

@eiriktsarpalis That's a good shout, thanks. I will review how easy that is to achieve with our code-gen.

@thomaslevesque
Copy link
Member

Thanks for your reply @eiriktsarpalis!

It seems like the primary use case preventing you from injecting state directly to converters is JsonConverterAttribute annotations?

In this case, yes. I can't just add the converter to the JsonSerializerOptions, because it would then apply to all compatible types; I only want to apply it to some properties. And in that scenario, I don't control the converter's instantiation, so I can't inject state or services into it.

I would guess that unsealing JsonSerializerOptions would not address your use case

No, because I don't control the instantiation of the JsonSerializerOptions (at least in ASP.NET Core).

so presumably something like the following would work?

public class JsonSerializerOptions
{
    public object? UserData { get; set; } // As with other options properties, gets locked once used
}

It's not perfect, but it would work. Your next suggestion with a property bag is probably better.

@stratdev3
Copy link

stratdev3 commented Nov 13, 2023

Interesting thread as it represents the long waiting option before i can fully remove the Newtonsoft package.

My use case : i want to modify some information depending context. Example : hide emails for non essentials partners.

Newtonsoft

Newtonsoft support the OnSerializing attribute with a context passed into the SerializeObject settings.

sample code (copy/paste to roslynpad)

#r "nuget:Newtonsoft.Json/13.0.3"

using System.Runtime.Serialization;
using Newtonsoft.Json;

var user = new User() {
    usr_mail = "login@domain.tld",
    usr_nickname = "nickname"
};

var context = "GDPR"; // pass serialization context
var settings = new JsonSerializerSettings {
    Context = new StreamingContext(StreamingContextStates.Other, context)
};
JsonConvert.SerializeObject(user, settings).Dump();
// {"usr_mail":null,"usr_nickname":"nickname"}


public class User {

    public string usr_mail { get; set; }
    public string usr_nickname { get; set; }

    [OnSerializing]
    internal void OnSerializedMethod(StreamingContext context) {
        // retrieve serialization context and do stuff
        if (context.Context is string ctx && ctx == "GDPR") {
            usr_mail = null;
        }
    }

}

System.Text.Json

The current best effort is in NET7 : i can hide same field but not depending the context as i does not exist.

sample code (copy/paste to roslynpad)

#r "nuget: System.Text.Json, 7.0.3"

using System.Runtime.Serialization;
using System.Text.Json.Serialization;
using System.Text.Json;

var user = new User() {
    usr_mail = "login@domain.tld",
    usr_nickname = "nickname"
};

JsonSerializer.Serialize(user).Dump();
// {"usr_mail":null,"usr_nickname":"nickname"}


public class User : IJsonOnSerializing {

    public string usr_mail { get; set; }
    public string usr_nickname { get; set; }

    void IJsonOnSerializing.OnSerializing() {
        usr_mail = null;
    }

}

If some of you have a workaround, i will appreciate 😀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Text.Json
Projects
None yet
Development

No branches or pull requests

7 participants