Performance degradation of `Expression.Lambda<TDelegate>` in netcore #32767

vchirikov · 2020-02-24T23:09:49Z

Hi, after the migration from netfx to netcore we observed a performance degradation in places with intensive usage of Expression.
After profiling and research we found a reason of this, it was Expression.Lambda<TDelegate> call.
I assume that this PR is the reason of this performance degradation.
To summarize, before this changes we called the generic constructor. After the changes firstly we search in the cache (cache size is 50) if we have cache-miss we will use reflection and will replace one row in the cache.
I think the cache isn't necessary for generic Expression.Lambda<TDelegate> method.

I have prepared and built my own runtime fork with a fix and also did a benchmark for a cache-miss situation.

Benchmark:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using System.Text.RegularExpressions;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnostics.Windows.Configs;
using BenchmarkDotNet.Environments;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Order;
using BenchmarkDotNet.Running;
using BenchmarkDotNet.Toolchains.CoreRun;
using BenchmarkDotNet.Toolchains.CsProj;
using BenchmarkDotNet.Toolchains.DotNetCli;

namespace expression_lambda_regression_issue
{
    public static class Program
    {
        [Orderer(SummaryOrderPolicy.Declared, MethodOrderPolicy.Declared)]
        [MarkdownExporterAttribute.GitHub]
        [GcServer(true)]
        [MinColumn, MaxColumn]
        [MemoryDiagnoser]
        [NativeMemoryProfiler]
        public class ExpressionLambdaBenchmark
        {
            public static readonly List<Expression> bodies = new List<Expression>();
            public static readonly List<ParameterExpression[]> parameters = new List<ParameterExpression[]>();
            public static readonly List<Func<Expression, IEnumerable<ParameterExpression>, LambdaExpression>> exprLambdaCallers = new List<Func<Expression, IEnumerable<ParameterExpression>, LambdaExpression>>();

            [GlobalSetup]
            public void Setup()
            {
                var coreFxAssemblyInfo = FileVersionInfo.GetVersionInfo(typeof(Regex).GetTypeInfo().Assembly.Location);
                var coreClrAssemblyInfo = FileVersionInfo.GetVersionInfo(typeof(object).GetTypeInfo().Assembly.Location);

                Console.WriteLine($"// CoreFx version: {coreFxAssemblyInfo.FileVersion}, location {typeof(Regex).GetTypeInfo().Assembly.Location}, product version {coreFxAssemblyInfo.ProductVersion}");
                Console.WriteLine($"// CoreClr version {coreClrAssemblyInfo.FileVersion}, location {typeof(object).GetTypeInfo().Assembly.Location}, product version {coreClrAssemblyInfo.ProductVersion}");

                // we need generate > 100 Expression.Lambda<TDelegate>, so I take 259
                var list = typeof(List<>).Assembly.GetTypes()
                    .Where(t => !t.IsAbstract && !t.IsGenericType)
                    .Select(t => (Type: t, Property: t.GetProperties(BindingFlags.Public | BindingFlags.Instance).FirstOrDefault(x => x.CanRead && x.PropertyType == typeof(string))))
                    .Where(t => t.Property != null)
                    .ToList();

                if (list.Count <= 100)
                    throw new Exception("Not enough for reproducing regression. Introduced cache size is 50");

                for (var i = 0; i < list.Count; i++)
                {
                    var parameter = Expression.Parameter(list[i].Type, "foo");
                    bodies.Add(Expression.Constant(null, typeof(object)));
                    parameters.Add(new ParameterExpression[] { parameter });
                }

                var mi = typeof(Expression).GetMethods()
                    .Single(m => m.IsGenericMethod && m.Name == "Lambda" && m.GetParameters()[1].ParameterType == typeof(IEnumerable<ParameterExpression>));

                foreach (var item in list)
                {
                    var exprLambdaReturnType = typeof(Func<,>).MakeGenericType(item.Type, typeof(object));
                    var exprLambda = mi.MakeGenericMethod(exprLambdaReturnType);
                    exprLambdaCallers.Add((Func<Expression, IEnumerable<ParameterExpression>, LambdaExpression>)Delegate.CreateDelegate(typeof(Func<Expression, IEnumerable<ParameterExpression>, LambdaExpression>), exprLambda));
                }
            }

            [Benchmark]
            public LambdaExpression Expression_Lambda_TDelegate()
            {
                LambdaExpression lambda = default;
                for (int i = 0; i < 1_000; ++i)
                {
                    for (int j = 0; j < exprLambdaCallers.Count; ++j)
                    {
                        lambda = exprLambdaCallers[j](bodies[j], parameters[j]);
                    }
                }
                return lambda;
            }
        }

        public static void Main(string[] _)
        {
            const int warmup = 1;
            const int invocation = 1;
            const int launch = 1;
            const int iteration = 1;
            var config = DefaultConfig.Instance
                .With(Job.Dry
                    .With(CsProjClassicNetToolchain.Net48)
                    .WithId("net48")
                    .WithBaseline(true)
                    .WithWarmupCount(warmup)
                    .WithLaunchCount(launch)
                    .WithIterationCount(iteration)
                    .WithInvocationCount(invocation)
                )
                .With(Job.Dry
                    .With(new CoreRunToolchain(new FileInfo(@"C:\work\dotnet\runtime-upstream\artifacts\bin\testhost\netcoreapp5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe"), true, "netcoreapp5.0", displayName: "netcoreapp50"))
                    .WithId("netcoreapp50")
                    .WithWarmupCount(warmup)
                    .WithLaunchCount(launch)
                    .WithInvocationCount(invocation)
                    .WithIterationCount(iteration)
                )
                .With(Job.Dry
                    .With(new CoreRunToolchain(new FileInfo(@"C:\work\dotnet\runtime-upstream\artifacts\bin\testhost\netcoreapp5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe"), true, "netcoreapp5.0", displayName: "netcoreapp50"))
                    .WithId("netcoreapp50")
                    .WithWarmupCount(warmup)
                    .WithLaunchCount(launch)
                    .WithInvocationCount(invocation)
                    .WithIterationCount(iteration)
                    .WithEnvironmentVariable("COMPlus_TieredCompilation", "0")
                )
                .With(Job.Dry
                    .With(new CoreRunToolchain(new FileInfo(@"C:\work\dotnet\runtime\artifacts\bin\testhost\netcoreapp5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe"), true, "netcoreapp5.0", displayName: "netcoreapp50_vchirikov"))
                    .WithId("netcoreapp50_vchirikov")
                    .WithWarmupCount(warmup)
                    .WithLaunchCount(launch)
                    .WithInvocationCount(invocation)
                    .WithIterationCount(iteration)
                )
                .With(Job.Dry
                    .With(new CoreRunToolchain(new FileInfo(@"C:\work\dotnet\runtime\artifacts\bin\testhost\netcoreapp5.0-Windows_NT-Release-x64\shared\Microsoft.NETCore.App\5.0.0\CoreRun.exe"), true, "netcoreapp5.0", displayName: "netcoreapp50_vchirikov"))
                    .WithId("netcoreapp50_vchirikov")
                    .WithWarmupCount(warmup)
                    .WithLaunchCount(launch)
                    .WithInvocationCount(invocation)
                    .WithIterationCount(iteration)
                    .WithEnvironmentVariable("COMPlus_TieredCompilation", "0")
                )
                ;
            BenchmarkRunner.Run<ExpressionLambdaBenchmark>(config);
        }
    }
}

BenchmarkDotNet=v0.12.0, OS=Windows 10.0.18363
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET Core SDK=5.0.100-preview.2.20120.11
  [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.11909, CoreFX 5.0.20.11909), X64 RyuJIT
  Job-OOGNGR : .NET Core 5.0.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT
  Job-ZDXXHT : .NET Core 5.0.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT
  Job-AZQMMA : .NET Framework 4.8 (4.8.4121.0), X64 RyuJIT
  Job-MHZNLL : .NET Core 5.0.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT
  Job-THTYGA : .NET Core 5.0.0 (CoreCLR 42.42.42.42424, CoreFX 42.42.42.42424), X64 RyuJIT

Server=True  InvocationCount=1  IterationCount=1  
LaunchCount=1  RunStrategy=ColdStart  UnrollFactor=1  
WarmupCount=1

Method	EnvironmentVariables	Toolchain	Mean	Error	Min	Max	Ratio	Gen 0	Gen 1	Gen 2	Allocated	Allocated native memory	Native memory leak
Expression_Lambda_TDelegate	COMPlus_TieredCompilation=0	netcoreapp50	905.8 ms	NA	905.8 ms	905.8 ms	1.97	20000.0000	1000.0000	-	164.21 MB	1.13 MB	0 MB
Expression_Lambda_TDelegate	COMPlus_TieredCompilation=0	netcoreapp50_vchirikov	172.6 ms	NA	172.6 ms	172.6 ms	0.38	11000.0000	-	-	94.58 MB	0.62 MB	0 MB
Expression_Lambda_TDelegate	Empty	net48	459.2 ms	NA	459.2 ms	459.2 ms	1.00	143000.0000	-	-	226.48 MB	0.43 MB	0.01 MB
Expression_Lambda_TDelegate	Empty	netcoreapp50	883.4 ms	NA	883.4 ms	883.4 ms	1.92	20000.0000	1000.0000	-	164.14 MB	1.55 MB	0.22 MB
Expression_Lambda_TDelegate	Empty	netcoreapp50_vchirikov	202.2 ms	NA	202.2 ms	202.2 ms	0.44	11000.0000	-	-	94.73 MB	1.02 MB	0.22 MB

Fix here: #32768

--
p.s. I fixed the benchmark, now it shows the cache-miss situation better

The text was updated successfully, but these errors were encountered:

Unnecessary cache call in Expression.Lambda<TDelegate> (with reflection in worst scenario) has been removed. Fix dotnet#32767

vchirikov · 2020-02-24T23:39:18Z

@stephentoub, @karelz please take a look

karelz · 2020-02-25T00:16:00Z

This package was archived: https://github.com/dotnet/runtime/blob/master/docs/area-owners.md see #27790 for details.
@cston @333fred any thoughts?

vchirikov · 2020-02-25T10:41:40Z

We would consider changes that address significant bugs or regressions

@cston @333fred @jaredpar this is a huge performance regression for our app (as you can see 2x slower than net framework), can you take a look at this issue, maybe the dotnet team can accept the fix or make their own?

cston · 2020-02-25T16:12:01Z

@vchirikov thanks for investigating this issue and providing a fix. We will consider fixes for performance regressions.

vchirikov · 2020-02-25T16:43:18Z

@cston , thanks.
What do you think about next 3.1 release? Could we include the fix to LTS branch too?

Unnecessary cache call in Expression.Lambda<TDelegate> (with reflection in worst scenario) has been removed. Fix #32767

botinko · 2020-02-25T21:50:06Z

I also worked with this issue. I created benchmark, which reproduces the slowdown in our use case.
It's just calls some method, which call 1000 different compiler-generated Expression.Lambda<TDelegate> .

ExpressionLambdaBench.zip

On my machine degradation is very significant.

This is very common scenario, when we use ORM and IQueriable.

As a workaround in version 3.1 we can

replace CacheDict with our own, with larger size by reflection
remove signatures from runtime and apply @vchirikov patch
compile our own runtime
patch our code, replacing all Expression.Lambda<TDelegate> calls with fast ones

As a quick fix we choose option 1 and hope, for patch in net core 3.1.x

Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Feb 24, 2020

vchirikov pushed a commit to vchirikov/runtime that referenced this issue Feb 24, 2020

Improve Expression.Lambda<TDelegate> performance

8d80b50

Unnecessary cache call in Expression.Lambda<TDelegate> (with reflection in worst scenario) has been removed. Fix dotnet#32767

vchirikov mentioned this issue Feb 24, 2020

Improve Expression.Lambda<TDelegate> performance #32768

Merged

karelz added area-System.Linq.Expressions tenet-performance Performance related issue labels Feb 25, 2020

cston added this to the 5.0 milestone Feb 25, 2020

cston removed the untriaged New issue has not been triaged by the area owner label Feb 25, 2020

cston closed this as completed in #32768 Feb 25, 2020

cston pushed a commit that referenced this issue Feb 25, 2020

Improve Expression.Lambda<TDelegate> performance (#32768)

634b348

Unnecessary cache call in Expression.Lambda<TDelegate> (with reflection in worst scenario) has been removed. Fix #32767

cston mentioned this issue Feb 26, 2020

[release/3.1] Improve Expression.Lambda<TDelegate> performance (dotnet/runtime#32768) dotnet/corefx#42874

Closed

AlexUstinov mentioned this issue Apr 18, 2020

Workaround of Expression.Lambda performance degradation DataObjects-NET/dataobjects-net#23

Merged

ghost locked as resolved and limited conversation to collaborators Dec 10, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance degradation of `Expression.Lambda<TDelegate>` in netcore #32767

Performance degradation of `Expression.Lambda<TDelegate>` in netcore #32767

vchirikov commented Feb 24, 2020 •

edited

Loading

vchirikov commented Feb 24, 2020

karelz commented Feb 25, 2020

vchirikov commented Feb 25, 2020 •

edited

Loading

cston commented Feb 25, 2020

vchirikov commented Feb 25, 2020

botinko commented Feb 25, 2020 •

edited

Loading

Performance degradation of Expression.Lambda<TDelegate> in netcore #32767

Performance degradation of Expression.Lambda<TDelegate> in netcore #32767

Comments

vchirikov commented Feb 24, 2020 • edited Loading

vchirikov commented Feb 24, 2020

karelz commented Feb 25, 2020

vchirikov commented Feb 25, 2020 • edited Loading

cston commented Feb 25, 2020

vchirikov commented Feb 25, 2020

botinko commented Feb 25, 2020 • edited Loading

Performance degradation of `Expression.Lambda<TDelegate>` in netcore #32767

Performance degradation of `Expression.Lambda<TDelegate>` in netcore #32767

vchirikov commented Feb 24, 2020 •

edited

Loading

vchirikov commented Feb 25, 2020 •

edited

Loading

botinko commented Feb 25, 2020 •

edited

Loading