Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Allow independent control of auto Union vs Typesafe Enum conversion in UnionOrTypeSafeEnumConverterFactory #71

Closed
bartelink opened this issue Jan 28, 2022 · 9 comments · Fixed by #73

Comments

@bartelink
Copy link
Collaborator

Via dotnet/runtime#55744 (comment)

for autoTypeSafeEnum I would need to split/duplicate the UnionOrTypeSafeEnumConverterFactory as well though - is this what you suggested?

I meant to supply 2 flags to the factory as ctor args which individually control whether to
a) allow automatic Type Safe Enum conversion
b) allow automatic Union conversion

In general, I'd assume that if you like convention based things, you want to start with both behaviors in place.

However, you might want to be able to disable individual behaviors.

@bartelink
Copy link
Collaborator Author

bartelink commented Jan 28, 2022

For Serdes.Create, the autoUnion=true currently turns on both.

It may make sense to consider a potentially breaking API change in 2.3.1 if it makes things clearer given usage is low and the time is right; perhaps:
a) autoTypeSafeEnum = true could trigger the DU -> string conversion where possible but reject unions
b) autoUnion = true could trigger conversion to an object with a discriminator but reject DUs without any bodies, resulting in an exception
c) autoUnionOrTypeSafeEnum = true could do the current behavior (string if possible)

However some combinations don't make sense and/or are mutually exclusive... :( Maybe this needs to be a [<Flags>] enum? I'm hoping the API can be kept boring for the 90% of cases.

🤔 Maybe autoUnion = true can be the fallback mode as it is, but support disabling the individual options via autoTypeSafeEnums=false and autoUnions=false (and complain if you ask for autoUnion and then disable both, or don't request autoUnion but attempt to disable either)

@bartelink bartelink changed the title Feature: Allow indepdent control of auto Union vs Typesafe Enum conversion in UnionOrTypeSafeEnumConverterFactory Feature: Allow independent control of auto Union vs Typesafe Enum conversion in UnionOrTypeSafeEnumConverterFactory Jan 28, 2022
@deyanp
Copy link

deyanp commented Feb 1, 2022

@bartelink , I managed to optimize the code for Simple DUs (with cases without fields, TypeSafeEnums as you call them) to run as fast as fast as for Enums, however I ripped the code off your codebase and simplified it a bit so that I can understand it myself.

The benchmark results now are:

serialization:
image

deserialization
image

The simplified code is:

module EnumVsDUJsonTests.JsonConverters2

open System
open System.Collections.Generic
open System.Linq.Expressions
open System.Text.Json.Serialization
open System.Text.Json
open Microsoft.FSharp.Reflection

[<AutoOpen>]
module private Prelude =
    /// Provides a thread-safe memoization wrapper for supplied function
    let memoize : ('T -> 'S) -> 'T -> 'S =
        fun f ->
            let cache = System.Collections.Concurrent.ConcurrentDictionary<'T, 'S>()
            fun t -> cache.GetOrAdd(t, f)

type IDUWithExplicitTags =
    abstract member Tag : int with get 

module DU =
    let isUnion : Type -> bool =
        memoize (fun t -> FSharpType.IsUnion(t, true))

    let getCases : Type -> UnionCaseInfo[] =
        memoize (fun t -> FSharpType.GetUnionCases(t, true))

    let hasOnlyCasesWithoutFields : Type -> bool =
        memoize (fun t ->
            t
            |> getCases
            |> Seq.forall (fun case -> case.GetFields().Length = 0))
        
    let isTypeSafeEnum (typ : Type) =
        isUnion typ
        && hasOnlyCasesWithoutFields typ        
        
    let getCaseNamesByTag : Type -> IDictionary<int, string> =
            memoize (fun t -> 
                FSharpType.GetUnionCases(t) 
                |> Seq.map(fun x -> (x.Tag, x.Name)) 
                |> dict)

    let getValuesByCaseName<'T> : Type -> IDictionary<string, 'T> =
        memoize (fun t ->
            FSharpType.GetUnionCases(t) 
            |> Seq.map(fun x -> (x.Name, FSharpValue.MakeUnion(x, Array.empty) :?> 'T)) 
            |> dict)
        
    let getValuesByTag<'T> : Type -> IDictionary<int, 'T> =
            memoize (fun t -> 
                FSharpType.GetUnionCases(t) 
                |> Seq.map(fun x -> (x.Tag, FSharpValue.MakeUnion(x, Array.empty) :?> 'T)) 
                |> dict)
            
    let getValuesByExplicitTag<'T> : Type -> IDictionary<int, 'T> =
            memoize (fun t -> 
                FSharpType.GetUnionCases(t) 
                |> Seq.map(fun x -> FSharpValue.MakeUnion(x, Array.empty) :?> 'T)
                |> Seq.map (fun v  -> (box v :?> IDUWithExplicitTags).Tag, v)
                |> dict)               

    let toString duObj =
        let t = duObj.GetType()
        let caseNamesByTag = getCaseNamesByTag t
        caseNamesByTag.[duObj.GetHashCode()]  // Warning: Assumption is that Tag = GetHashCode()!

    let tryParse<'T> fieldName (stringValue:string) : Result<'T, string> =
        let t = typeof<'T>
        let valuesByCaseName = getValuesByCaseName t
        let found,value = valuesByCaseName.TryGetValue(stringValue)
        if found then Ok value else Error $"Field {fieldName} value {stringValue} could not be parsed into type {t.Name}"
                                
    let parse<'T> fieldName (stringValue: string) : 'T =
        let value = tryParse<'T> fieldName stringValue
        match value with 
        | Ok value -> value
        | Error e -> failwith e
        
/// Maps strings to/from Union cases; refuses to convert for values not in the Union
type TypeSafeEnumConverter<'T>() =
    inherit Serialization.JsonConverter<'T>()
    
    let caseNamesByTag = DU.getCaseNamesByTag typeof<'T>
    let valuesByCaseName = DU.getValuesByCaseName typeof<'T>

//    override _.CanConvert(t : Type) =
//        t = typedefof<'T> // && TypeSafeEnum.isTypeSafeEnum t

    override _.Write(writer, value, _options) =
        let str = caseNamesByTag.[(box value).GetHashCode()]
        writer.WriteStringValue str

    override _.Read(reader, _t, _options) =
        if reader.TokenType <> JsonTokenType.String then
            sprintf "Unexpected token when reading TypeSafeEnum: %O" reader.TokenType |> JsonException |> raise
        let caseName = reader.GetString()
        valuesByCaseName.[caseName]

type internal ConverterActivator = delegate of unit -> JsonConverter

type TypeSafeEnumConverterFactory() =
    inherit JsonConverterFactory()

    override _.CanConvert(t : Type) =
        DU.isTypeSafeEnum t

    override _.CreateConverter(typ, _options) =
        let openConverterType = typedefof<TypeSafeEnumConverter<_>>
        let constructor = openConverterType.MakeGenericType(typ).GetConstructors() |> Array.head
        let newExpression = Expression.New(constructor)
        let lambda = Expression.Lambda(typeof<ConverterActivator>, newExpression)

        let activator = lambda.Compile() :?> ConverterActivator
        activator.Invoke()

I did a number of changes and focused entirely on Simple DUs only (with cases without fields) so I removed some of the additional code, so not sure if it makes sense to find out how FsCodec can be changed ...

P.S: Did the same for Simple DUs <> Bson (MongoDB) and Simple DUs <> string, and only string -> Simple DU is 2 times slower than Enum, all the rest are as fast or even faster :)

@deyanp
Copy link

deyanp commented Feb 1, 2022

Here is also the benchmark code:

open System.Text.Json
open System.Text.Json.Serialization
open EnumVsDUJsonTests
// open FsCodec.SystemTextJson
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running
open BenchmarkDotNet.Jobs

type AccountingEntryTypeEnum = 
| Debit = 0
| Credit = 1

type AccountingEntryEnum1 = {
    Id : string
    Type : AccountingEntryTypeEnum
}

type AccountingEntryEnum5 = {
    Id : string
    Type : AccountingEntryTypeEnum
    Type2 : AccountingEntryTypeEnum
    Type3 : AccountingEntryTypeEnum
    Type4 : AccountingEntryTypeEnum
    Type5 : AccountingEntryTypeEnum
}

type AccountingEntryEnum10 = {
    Id : string
    Type : AccountingEntryTypeEnum
    Type2 : AccountingEntryTypeEnum
    Type3 : AccountingEntryTypeEnum
    Type4 : AccountingEntryTypeEnum
    Type5 : AccountingEntryTypeEnum
    Type6 : AccountingEntryTypeEnum
    Type7 : AccountingEntryTypeEnum
    Type8 : AccountingEntryTypeEnum
    Type9 : AccountingEntryTypeEnum
    Type10 : AccountingEntryTypeEnum
}

type AccountingEntryTypeEnum10m = 
| Val1 = 1
| Val2 = 2
| Val3 = 3
| Val4 = 4
| Val5 = 5
| Val6 = 6
| Val7 = 7
| Val8 = 8
| Val9 = 9
| Val10 = 10


type AccountingEntryEnum10m = {
    Id : string
    Type : AccountingEntryTypeEnum10m
    Type2 : AccountingEntryTypeEnum10m
    Type3 : AccountingEntryTypeEnum10m
    Type4 : AccountingEntryTypeEnum10m
    Type5 : AccountingEntryTypeEnum10m
    Type6 : AccountingEntryTypeEnum10m
    Type7 : AccountingEntryTypeEnum10m
    Type8 : AccountingEntryTypeEnum10m
    Type9 : AccountingEntryTypeEnum10m
    Type10 : AccountingEntryTypeEnum10m
}

//[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDU>>)>]
[<Struct>]
type AccountingEntryTypeDU = 
| Debit
| Credit

type AccountingEntryDU1 = {
    Id : string
    Type : AccountingEntryTypeDU
}

type AccountingEntryDU5 = {
    Id : string
    Type : AccountingEntryTypeDU
    Type2 : AccountingEntryTypeDU
    Type3 : AccountingEntryTypeDU
    Type4 : AccountingEntryTypeDU
    Type5 : AccountingEntryTypeDU
}

type AccountingEntryDU10 = {
    Id : string
    Type : AccountingEntryTypeDU
    Type2 : AccountingEntryTypeDU
    Type3 : AccountingEntryTypeDU
    Type4 : AccountingEntryTypeDU
    Type5 : AccountingEntryTypeDU
    Type6 : AccountingEntryTypeDU
    Type7 : AccountingEntryTypeDU
    Type8 : AccountingEntryTypeDU
    Type9 : AccountingEntryTypeDU
    Type10 : AccountingEntryTypeDU
}

type AccountingEntryTypeDU10m = 
| Val1
| Val2
| Val3
| Val4
| Val5
| Val6
| Val7
| Val8
| Val9
| Val10

type AccountingEntryDU10m = {
    Id : string
    Type : AccountingEntryTypeDU10m
    Type2 : AccountingEntryTypeDU10m
    Type3 : AccountingEntryTypeDU10m
    Type4 : AccountingEntryTypeDU10m
    Type5 : AccountingEntryTypeDU10m
    Type6 : AccountingEntryTypeDU10m
    Type7 : AccountingEntryTypeDU10m
    Type8 : AccountingEntryTypeDU10m
    Type9 : AccountingEntryTypeDU10m
    Type10 : AccountingEntryTypeDU10m
}

//[<JsonConverter(typeof<TypeSafeEnumConverter<AccountingEntryTypeDUStruct>>)>]
[<Struct>]
type AccountingEntryTypeDUStruct = 
| Debit
| Credit

type AccountingEntryDUStruct1 = {
    Id : string
    Type : AccountingEntryTypeDUStruct
}

type AccountingEntryDUStruct5 = {
    Id : string
    Type : AccountingEntryTypeDUStruct
    Type2 : AccountingEntryTypeDUStruct
    Type3 : AccountingEntryTypeDUStruct
    Type4 : AccountingEntryTypeDUStruct
    Type5 : AccountingEntryTypeDUStruct
}

type AccountingEntryDUStruct10 = {
    Id : string
    Type : AccountingEntryTypeDUStruct
    Type2 : AccountingEntryTypeDUStruct
    Type3 : AccountingEntryTypeDUStruct
    Type4 : AccountingEntryTypeDUStruct
    Type5 : AccountingEntryTypeDUStruct
    Type6 : AccountingEntryTypeDUStruct
    Type7 : AccountingEntryTypeDUStruct
    Type8 : AccountingEntryTypeDUStruct
    Type9 : AccountingEntryTypeDUStruct
    Type10 : AccountingEntryTypeDUStruct
}


[<Struct>]
type AccountingEntryTypeDUStruct10m = 
| Val1
| Val2
| Val3
| Val4
| Val5
| Val6
| Val7
| Val8
| Val9
| Val10

type AccountingEntryDUStruct10m = {
    Id : string
    Type : AccountingEntryTypeDUStruct10m
    Type2 : AccountingEntryTypeDUStruct10m
    Type3 : AccountingEntryTypeDUStruct10m
    Type4 : AccountingEntryTypeDUStruct10m
    Type5 : AccountingEntryTypeDUStruct10m
    Type6 : AccountingEntryTypeDUStruct10m
    Type7 : AccountingEntryTypeDUStruct10m
    Type8 : AccountingEntryTypeDUStruct10m
    Type9 : AccountingEntryTypeDUStruct10m
    Type10 : AccountingEntryTypeDUStruct10m
}


let serOptions = JsonSerializerOptions()
serOptions.DefaultIgnoreCondition <- JsonIgnoreCondition.WhenWritingNull
serOptions.PropertyNamingPolicy <- JsonNamingPolicy.CamelCase
serOptions.WriteIndented <- true // not for prod maybe!?
serOptions.Converters.Add(JsonStringEnumConverter())
serOptions.Converters.Add(JsonConverters2.TypeSafeEnumConverterFactory())

let deserOptions = JsonSerializerOptions()
deserOptions.PropertyNamingPolicy <- JsonNamingPolicy.CamelCase
deserOptions.Converters.Add(JsonStringEnumConverter())
deserOptions.Converters.Add(JsonConverters2.TypeSafeEnumConverterFactory())

[<SimpleJob (RuntimeMoniker.Net60)>]
[<MemoryDiagnoser>]
type SerBenchmarks() =
    member val serOptions =
        let serOptions = JsonSerializerOptions()
        serOptions.DefaultIgnoreCondition <- JsonIgnoreCondition.WhenWritingNull
        serOptions.PropertyNamingPolicy <- JsonNamingPolicy.CamelCase
        serOptions.WriteIndented <- true // not for prod maybe!?
        serOptions.Converters.Add(JsonStringEnumConverter())
        serOptions.Converters.Add(JsonConverters2.TypeSafeEnumConverterFactory())    
        serOptions
    
    // [<Params(100, 1000, 10000, 100000, 1000000)>]
    [<Params(1000)>]
    member val size = 0 with get, set
    member val x1: AccountingEntryEnum1 = { 
            Id = "aa"
            Type = AccountingEntryTypeEnum.Credit 
        }    
    member val x5: AccountingEntryEnum5 = { 
            Id = "aa"
            Type = AccountingEntryTypeEnum.Credit 
            Type2 = AccountingEntryTypeEnum.Debit 
            Type3 = AccountingEntryTypeEnum.Credit 
            Type4 = AccountingEntryTypeEnum.Debit 
            Type5 = AccountingEntryTypeEnum.Credit 
        }    
    member val x10 : AccountingEntryEnum10 = { 
            Id = "aa"
            Type = AccountingEntryTypeEnum.Credit 
            Type2 = AccountingEntryTypeEnum.Debit 
            Type3 = AccountingEntryTypeEnum.Credit 
            Type4 = AccountingEntryTypeEnum.Debit 
            Type5 = AccountingEntryTypeEnum.Credit 
            Type6 = AccountingEntryTypeEnum.Debit 
            Type7 = AccountingEntryTypeEnum.Credit 
            Type8 = AccountingEntryTypeEnum.Debit 
            Type9 = AccountingEntryTypeEnum.Credit 
            Type10 = AccountingEntryTypeEnum.Debit 
        }
    
    member val x10m : AccountingEntryEnum10m = { 
            Id = "aa"
            Type = AccountingEntryTypeEnum10m.Val10 
            Type2 = AccountingEntryTypeEnum10m.Val10 
            Type3 = AccountingEntryTypeEnum10m.Val10
            Type4 = AccountingEntryTypeEnum10m.Val10
            Type5 = AccountingEntryTypeEnum10m.Val10
            Type6 = AccountingEntryTypeEnum10m.Val10
            Type7 = AccountingEntryTypeEnum10m.Val10
            Type8 = AccountingEntryTypeEnum10m.Val10
            Type9 = AccountingEntryTypeEnum10m.Val10
            Type10 = AccountingEntryTypeEnum10m.Val10
        }
    
    member val xd1: AccountingEntryDU1 = { 
            Id = "aa"
            Type = AccountingEntryTypeDU.Credit 
        }    
    member val xd5: AccountingEntryDU5 = { 
            Id = "aa"
            Type = AccountingEntryTypeDU.Credit 
            Type2 = AccountingEntryTypeDU.Debit 
            Type3 = AccountingEntryTypeDU.Credit 
            Type4 = AccountingEntryTypeDU.Debit 
            Type5 = AccountingEntryTypeDU.Credit 
        }    
    member val xd10 : AccountingEntryDU10 = { 
            Id = "aa"
            Type = AccountingEntryTypeDU.Credit 
            Type2 = AccountingEntryTypeDU.Debit 
            Type3 = AccountingEntryTypeDU.Credit 
            Type4 = AccountingEntryTypeDU.Debit 
            Type5 = AccountingEntryTypeDU.Credit 
            Type6 = AccountingEntryTypeDU.Debit 
            Type7 = AccountingEntryTypeDU.Credit 
            Type8 = AccountingEntryTypeDU.Debit 
            Type9 = AccountingEntryTypeDU.Credit 
            Type10 = AccountingEntryTypeDU.Debit 
        }
    
    member val xd10m : AccountingEntryDU10m = { 
        Id = "aa"
        Type = AccountingEntryTypeDU10m.Val10 
        Type2 = AccountingEntryTypeDU10m.Val10 
        Type3 = AccountingEntryTypeDU10m.Val10 
        Type4 = AccountingEntryTypeDU10m.Val10 
        Type5 = AccountingEntryTypeDU10m.Val10 
        Type6 = AccountingEntryTypeDU10m.Val10 
        Type7 = AccountingEntryTypeDU10m.Val10 
        Type8 = AccountingEntryTypeDU10m.Val10 
        Type9 = AccountingEntryTypeDU10m.Val10 
        Type10 = AccountingEntryTypeDU10m.Val10 
    }
    
    member val xds1: AccountingEntryDUStruct1 = { 
            Id = "aa"
            Type = AccountingEntryTypeDUStruct.Credit 
        }    
    member val xds5: AccountingEntryDUStruct5 = { 
            Id = "aa"
            Type = AccountingEntryTypeDUStruct.Credit 
            Type2 = AccountingEntryTypeDUStruct.Debit 
            Type3 = AccountingEntryTypeDUStruct.Credit 
            Type4 = AccountingEntryTypeDUStruct.Debit 
            Type5 = AccountingEntryTypeDUStruct.Credit 
        }    
    member val xds10 : AccountingEntryDUStruct10 = { 
            Id = "aa"
            Type = AccountingEntryTypeDUStruct.Credit 
            Type2 = AccountingEntryTypeDUStruct.Debit 
            Type3 = AccountingEntryTypeDUStruct.Credit 
            Type4 = AccountingEntryTypeDUStruct.Debit 
            Type5 = AccountingEntryTypeDUStruct.Credit 
            Type6 = AccountingEntryTypeDUStruct.Debit 
            Type7 = AccountingEntryTypeDUStruct.Credit 
            Type8 = AccountingEntryTypeDUStruct.Debit 
            Type9 = AccountingEntryTypeDUStruct.Credit 
            Type10 = AccountingEntryTypeDUStruct.Debit 
        }
    
    member val xds10m : AccountingEntryDUStruct10m = { 
            Id = "aa"
            Type = AccountingEntryTypeDUStruct10m.Val10 
            Type2 = AccountingEntryTypeDUStruct10m.Val10 
            Type3 = AccountingEntryTypeDUStruct10m.Val10 
            Type4 = AccountingEntryTypeDUStruct10m.Val10 
            Type5 = AccountingEntryTypeDUStruct10m.Val10 
            Type6 = AccountingEntryTypeDUStruct10m.Val10 
            Type7 = AccountingEntryTypeDUStruct10m.Val10 
            Type8 = AccountingEntryTypeDUStruct10m.Val10 
            Type9 = AccountingEntryTypeDUStruct10m.Val10 
            Type10 = AccountingEntryTypeDUStruct10m.Val10 
        }
            

    [<Benchmark>]
    member this.Enum1 () = 
        JsonSerializer.Serialize(this.x1, this.serOptions)

    [<Benchmark>]
    member this.DU1 () = 
        JsonSerializer.Serialize(this.xd1, this.serOptions)
        
    [<Benchmark>]
    member this.DUStruct1 () = 
        JsonSerializer.Serialize(this.xds1, this.serOptions)        

    [<Benchmark>]
    member this.Enum5 () = 
        JsonSerializer.Serialize(this.x5, this.serOptions)

    [<Benchmark>]
    member this.DU5 () = 
        JsonSerializer.Serialize(this.xd5, this.serOptions)
        
    [<Benchmark>]
    member this.DUStruct5 () = 
        JsonSerializer.Serialize(this.xds5, this.serOptions)        

    [<Benchmark>]
    member this.Enum10 () = 
        JsonSerializer.Serialize(this.x10, this.serOptions)

    [<Benchmark>]
    member this.DU10 () = 
        JsonSerializer.Serialize(this.xd10, this.serOptions)
        
    [<Benchmark>]
    member this.DUStruct10 () = 
        JsonSerializer.Serialize(this.xds10, this.serOptions)        

    [<Benchmark>]
    member this.Enum10m () = 
        JsonSerializer.Serialize(this.x10m, this.serOptions)      

    [<Benchmark>]
    member this.DU10m () = 
        JsonSerializer.Serialize(this.xd10m, this.serOptions)        
        
    [<Benchmark>]
    member this.DUStruct10m () = 
        JsonSerializer.Serialize(this.xds10m, this.serOptions)        


BenchmarkRunner.Run<SerBenchmarks>() |> ignore

[<SimpleJob (RuntimeMoniker.Net60)>]
[<MemoryDiagnoser>]
type DeserBenchmarks() =
    member val deserOptions =
        let deserOptions = JsonSerializerOptions()
        deserOptions.PropertyNamingPolicy <- JsonNamingPolicy.CamelCase
        deserOptions.Converters.Add(JsonStringEnumConverter())
        deserOptions.Converters.Add(JsonConverters2.TypeSafeEnumConverterFactory())
        deserOptions
    
    // [<Params(100, 1000, 10000, 100000, 1000000)>]
    [<Params(1000)>]
    member val size = 0 with get, set

    member val xs1 : string = """{
    "id": "aa",
    "type": "Credit"
}"""
    member val xs5 : string = """{
    "id": "aa",
    "type": "Credit",
    "type2": "Credit",
    "type3": "Credit",
    "type4": "Credit",
    "type5": "Credit"
}"""
    member val xs10 : string = """{
    "id": "aa",
    "type": "Credit",
    "type2": "Credit",
    "type3": "Credit",
    "type4": "Credit",
    "type5": "Credit",
    "type6": "Credit",
    "type7": "Credit",
    "type8": "Credit",
    "type9": "Credit",
    "type10": "Credit"
}"""

    [<Benchmark>]
    member this.Enum1 () = 
        JsonSerializer.Deserialize<AccountingEntryEnum1>(this.xs1, this.deserOptions)

    [<Benchmark>]
    member this.DU1 () = 
        JsonSerializer.Deserialize<AccountingEntryDU1>(this.xs1, this.deserOptions)
        
    [<Benchmark>]
    member this.DUStruct1 () = 
        JsonSerializer.Deserialize<AccountingEntryDUStruct1>(this.xs1, this.deserOptions)        

    [<Benchmark>]
    member this.Enum5 () = 
        JsonSerializer.Deserialize<AccountingEntryEnum5>(this.xs5, this.deserOptions)

    [<Benchmark>]
    member this.DU5 () = 
        JsonSerializer.Deserialize<AccountingEntryDU5>(this.xs5, this.deserOptions)

    [<Benchmark>]
    member this.DUStruct5 () = 
        JsonSerializer.Deserialize<AccountingEntryDUStruct5>(this.xs5, this.deserOptions)

    [<Benchmark>]
    member this.Enum10 () = 
        JsonSerializer.Deserialize<AccountingEntryEnum10>(this.xs10, this.deserOptions)

    [<Benchmark>]
    member this.DU10 () = 
        JsonSerializer.Deserialize<AccountingEntryDU10>(this.xs10, this.deserOptions)

    [<Benchmark>]
    member this.DUStruct10 () = 
        JsonSerializer.Deserialize<AccountingEntryDUStruct10>(this.xs10, this.deserOptions)
        
BenchmarkRunner.Run<DeserBenchmarks>() |> ignore

@bartelink
Copy link
Collaborator Author

I'd have to see this as a PR to be sure, but here's some thoughts:

  • There seems to be a lot of memoize calls sprinkled around - while I can imagine some provide significant value, I am willing to bet that there are cases where they can actually make perf worse when you have multiple types in the mix.
  • the commenting out of the CanConvert is probably valuable perf-wise - the Factory can probably pass a flag to inhibit the check (and that should also be the default when it is applied as an explicitly tagged converter - the only time it needs to be there is when it is used as a global converter, and the reason it works like that is probably because it has to per the Json.net protocol.).
  • // Warning: Assumption is that Tag = GetHashCode()! sounds like a step too far

It should be possible to arrive at a result that:

  • does not add too much code - for me the terseness and the fact that the UnionConverter shares some of the logic is important (perf is too, but for me perf at all costs is no the high order bit for this lib)
  • gets the majority of the perf improvement
  • allows independent control of wether Type Safe Enums vs stateful unions get converted implicitly
  • gives a good path for removing the TypeSafeEnumConverter and/or having it delegate to an in the box impl in STJ if/when that happens

I guess its down to whether you see sufficient value in the legwork necessary to get it into FsCodec...

@deyanp
Copy link

deyanp commented Feb 1, 2022

There seems to be a lot of memoize calls sprinkled around

I did not see any negative side-effects of this, on the contrary

the commenting out of the CanConvert is probably valuable perf-wise

I don't think this improves performance, as the per-type converters get anyway cached by the System.Text.Json framework

// Warning: Assumption is that Tag = GetHashCode()! sounds like a step too far

As per https://fsharpforfunandprofit.com/posts/fsharp-decompiled/#enum-style-unions the tags are hardcoded integers starting from 0, and GetHashCode returns the Tag. I doubt this will change ever ...

I guess its down to whether you see sufficient value in the legwork necessary to get it into FsCodec...

I am actually a bit hesitant to start suggesting PR for FsCodec, as the codebase is quite complicated and covering many additional cases. I just wanted to share the benchmark results (I am pretty happy the perf is now on par with Enum!) and the code changes I made to reach that ...

I think that the majority of the perf improvement can be achieved by just changing the lookups to use the fact that Tag = GetHashCode, and using int keys for the dictionaries (string or object keys are much much slower)

@bartelink
Copy link
Collaborator Author

I did not see any negative side-effects of this, on the contrary

I'm not doubting that each and every one yields you a benefit in the context of the benchmark. My questioning is whether the benefit degrades as there are more types in play. The secondary concern is whether the benefit is worth it in every case - i.e. for me there needs to be a significant benefit for every usage.

Part of this is based on me wanting to keep the code easy to traverse and/or delete; I see wringing out the last 1% of perf at the cost of lots more LOC as something that would make sense for STJ itself but less for FsCodec - the idea being that someone can figure out what it does by reading 55 + 122 LOC.

I don't think this improves performance, as the per-type converters get anyway cached by the System.Text.Json framework

👍 sounds right

I doubt this will change ever

That sounds reasonable; there'd need to be a more explicit (and less dodgy-sounding!) comment to that effect though ;)

FSharpType.GetUnionCases(t)

this bit does not seem to be shared/memoized?

start suggesting PR for FsCodec, as the codebase is quite complicated

I'll let you be the judge; not sure is there's a specific thing that can be fixed with regard to this?

If you can get in the key memoize changes and it still passes the tests, I'd trust the PR

Failing that, another thing to explore might be (assuming its not already covered, and Tarmil is interested/thinks it makes sense) to contribute your remix to FSharp.SystemTextJson ?

In conclusion, even if you don't think this is easy/necessary to get into FsCodec and/or don't have the time:

  • your benchmark suite is useful, and it would be nice to get the key perf improvements you've identified in - thanks for sharing it, it will help anyone that eventually does this work
  • it would be good to have a way in FsCodec to only opt into the TypeSafeEnumConverter without the UnionConverter coming along for the ride

@bartelink
Copy link
Collaborator Author

bartelink commented Mar 10, 2022

@deyanp About to cut a release with #73 fairly soon - would appreciate feedback (even if you feel it's not directly usable for you - perhaps we can make it so?)

@deyanp
Copy link

deyanp commented Mar 10, 2022

@bartelink , thanks for getting back to me, but I am not using FsCodec, as I mentioned I copied and adapted your code, and this is what I have now (not many more lines of code, but benchmarked and on par with enums):

// Copied/Adapted from
// https://github.com/jet/FsCodec/blob/master/src/FsCodec.SystemTextJson/TypeSafeEnumConverter.fs
namespace Framework.SystemTextJson.Converters

open System
open System.Collections.Generic
open System.Linq.Expressions
open System.Text.Json.Serialization
open System.Text.Json
open Microsoft.FSharp.Reflection

[<AutoOpen>]
module Memoization =
    /// Provides a thread-safe memoization wrapper for supplied function
    let memoize : ('T -> 'S) -> 'T -> 'S =
        fun f ->
            let cache = System.Collections.Concurrent.ConcurrentDictionary<'T, 'S>()
            fun t -> cache.GetOrAdd(t, f)

module DU =
    let isUnion : Type -> bool =
        memoize (fun t -> FSharpType.IsUnion(t, true))

    let getCases : Type -> UnionCaseInfo[] =
        memoize (fun t -> FSharpType.GetUnionCases(t, true))

    let hasOnlyCasesWithoutFields : Type -> bool =
        memoize (fun t ->
            t
            |> getCases
            |> Seq.forall (fun case -> case.GetFields().Length = 0))
        
    let isTypeSafeEnum (typ : Type) =
        isUnion typ
        && hasOnlyCasesWithoutFields typ
        
    let getCaseNamesByTag : Type -> IDictionary<int, string> =
            memoize (fun t -> 
                FSharpType.GetUnionCases(t) 
                |> Seq.map(fun x -> (x.Tag, x.Name)) 
                |> dict)
        
    let getValuesByCaseName<'T> : Type -> IDictionary<string, 'T> =
        memoize (fun t ->
            FSharpType.GetUnionCases(t)
            |> Seq.map(fun x -> (x.Name, FSharpValue.MakeUnion(x, Array.empty) :?> 'T)) 
            |> dict)

/// Maps strings to/from Union cases; refuses to convert for values not in the Union
type SimpleDUJsonConverter<'T>() =
    inherit Serialization.JsonConverter<'T>()
    
    let caseNamesByTag = DU.getCaseNamesByTag typeof<'T>
    let valuesByCaseName = DU.getValuesByCaseName typeof<'T>

//    override _.CanConvert(t : Type) =
//        t = typedefof<'T> // && TypeSafeEnum.isTypeSafeEnum t

    override _.Write(writer, value, _options) =
        let valueHashCode = (box value).GetHashCode()
        let found, caseName = caseNamesByTag.TryGetValue(valueHashCode)
        if found then
            writer.WriteStringValue caseName
        else
            sprintf "Could not serialize value %A of Simple DU %s" value typeof<'T>.FullName |> JsonException |> raise

    override _.Read(reader, _t, _options) =
        if reader.TokenType <> JsonTokenType.String then
            sprintf "Unexpected token type %O found upon deserializing into Simple DU %s" reader.TokenType typeof<'T>.FullName
            |> JsonException |> raise
        let caseName = reader.GetString()
        let found, value = valuesByCaseName.TryGetValue(caseName)
        if found then
            value
        else
            sprintf "Could not deserialize string %s into Simple DU %s" caseName typeof<'T>.FullName
            |> JsonException |> raise

type internal ConverterActivator = delegate of unit -> JsonConverter

type SimpleDUJsonConverterFactory() =
    inherit JsonConverterFactory()

    override _.CanConvert(t : Type) =
        DU.isTypeSafeEnum t

    override _.CreateConverter(typ, _options) =
        let openConverterType = typedefof<SimpleDUJsonConverter<_>>
        let constructor = openConverterType.MakeGenericType(typ).GetConstructors() |> Array.head
        let newExpression = Expression.New(constructor)
        let lambda = Expression.Lambda(typeof<ConverterActivator>, newExpression)

        let activator = lambda.Compile() :?> ConverterActivator
        activator.Invoke()

@bartelink
Copy link
Collaborator Author

@deyanp Cool - thanks for sharing - it may help someone in the future
If you or anyone wants to make a PR to bring the perf improvements into FsCodec, that'd be welcome

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants