From 9fb0c94528f98dbc155c204ba60cceb1ad528467 Mon Sep 17 00:00:00 2001 From: Bruce Forstall Date: Wed, 31 Mar 2021 17:50:09 -0700 Subject: [PATCH] Generalize loop inversion Starting with an experimental branch from @AndyAyersMS, generalize the loop inversion `optInvertWhileLoop` code to consider duplicating an entire conditional block, not just a conditional block that contains exactly one JTRUE tree. Since the JTRUE itself wasn't considered before in the costing, bump up the maximum cost allowed by 2 to account for that. (Note that there is a lot of room here for tuning the cost/benefit analysis.) Additionally, the code already bumped the allowed cost if the condition tree contained calls to a shared static helper. Add another boost if the tree contains array.Length expressions, which are likely to be CSE'ed. Loop inversion by itself doesn't buy much, but our downstream phases, like natural loop recognition, loop hoisting, and loop cloning, depend on an inverted loop structure with a zero trip test. There are many diffs, typically size regressions because we are duplicating code. Some notable cases: 1. BenchI Puzzle. This was a motivating example, from #6569. It ends up with multiple new CSEs and BDN shows a 24% run-time improvement. 2. RegexRedux_5 and ReverseComplement_1 also report faster 3. On the contrary, FannkuchRedux_5 reports 23% slower. It appears we create fewer CSEs and end up with a couple more instructions inside some tight loops. 4. Other things typically seen in diffs: (a) removed null and bounds checks, (b) additional CSEs, (c) loop alignment kicks in. The loop matching at the stage inversion runs is lexical and very simplistic, as it is before natural loop recognition. As a result, for some loops, like multiple-condition `while` loops (e.g., `while (--digits >= 0 || value != 0)` in `System.Number:UInt32ToDecChars`), the inversion ends up creating some pretty weird flow. (That's actually not new to this change.) I added a `COMPlus_JitDoLoopInversion` config to allow turning off the phase to do experiments. Here are the diffs for SPMI on the benchmarks: ``` Summary of Code Size diffs: (Lower is better) Total bytes of base: 306240 Total bytes of diff: 306022 Total bytes of delta: -218 (-0.07% of base) diff is an improvement. ```
Detail diffs ``` Top file regressions (bytes): 204 : 18210.dasm (13.98% of base) 123 : 17149.dasm (3.84% of base) 115 : 12617.dasm (1.95% of base) 106 : 19256.dasm (5.92% of base) 105 : 8717.dasm (8.19% of base) 94 : 5445.dasm (9.49% of base) 85 : 16452.dasm (1.48% of base) 82 : 19234.dasm (12.01% of base) 76 : 4626.dasm (20.05% of base) 64 : 18620.dasm (6.97% of base) 52 : 19202.dasm (16.20% of base) 50 : 19210.dasm (3.15% of base) 48 : 25971.dasm (0.66% of base) 46 : 18490.dasm (6.05% of base) 46 : 19131.dasm (8.44% of base) 36 : 17812.dasm (1.84% of base) 36 : 10409.dasm (4.20% of base) 36 : 15922.dasm (17.48% of base) 35 : 17997.dasm (11.18% of base) 34 : 14621.dasm (4.82% of base) Top file improvements (bytes): -557 : 18190.dasm (-19.94% of base) -525 : 18412.dasm (-12.80% of base) -260 : 18400.dasm (-19.89% of base) -202 : 17991.dasm (-16.67% of base) -201 : 20540.dasm (-18.91% of base) -200 : 18417.dasm (-9.96% of base) -159 : 17736.dasm (-7.87% of base) -128 : 18326.dasm (-7.15% of base) -90 : 18182.dasm (-4.71% of base) -67 : 18177.dasm (-3.94% of base) -59 : 18359.dasm (-5.13% of base) -53 : 20672.dasm (-8.02% of base) -53 : 18280.dasm (-10.77% of base) -45 : 18575.dasm (-8.08% of base) -42 : 16930.dasm (-7.89% of base) -40 : 17935.dasm (-4.77% of base) -38 : 17814.dasm (-2.11% of base) -38 : 18458.dasm (-4.87% of base) -36 : 18222.dasm (-3.07% of base) -36 : 18440.dasm (-6.75% of base) 376 total files with Code Size differences (153 improved, 223 regressed), 5 unchanged. Top method regressions (bytes): 204 (13.98% of base) : 18210.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceNamedTypeSymbol:MakeAcyclicInterfaces(Roslyn.Utilities.ConsList`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this 123 ( 3.84% of base) : 17149.dasm - Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.LanguageParser:ParseNamespaceBody(byref,byref,byref,ushort):this 115 ( 1.95% of base) : 12617.dasm - Utf8Json.Formatters.DictionaryFormatterBase`5[Int32,__Canon,__Canon,Enumerator,__Canon][System.Int32,System.__Canon,System.__Canon,System.Collections.Generic.Dictionary`2+Enumerator[System.Int32,System.__Canon],System.__Canon]:Serialize(byref,System.__Canon,Utf8Json.IJsonFormatterResolver):this 106 ( 5.92% of base) : 19256.dasm - d__31:MoveNext():bool:this 105 ( 8.19% of base) : 8717.dasm - System.String:SplitWithPostProcessing(System.ReadOnlySpan`1[Int32],System.ReadOnlySpan`1[Int32],int,int,int):System.String[]:this 94 ( 9.49% of base) : 5445.dasm - System.Uri:CreateUriInfo(long):this 85 ( 1.48% of base) : 16452.dasm - DynamicClass:_DynamicMethod9(System.IO.TextReader,int):MicroBenchmarks.Serializers.MyEventsListerViewModel 82 (12.01% of base) : 19234.dasm - d__53:MoveNext():bool:this 76 (20.05% of base) : 4626.dasm - System.Collections.Generic.PriorityQueue`2[__Canon,__Canon][System.__Canon,System.__Canon]:MoveDownCustomComparer(System.ValueTuple`2[__Canon,__Canon],int):this 64 ( 6.97% of base) : 18620.dasm - Microsoft.CodeAnalysis.CSharp.Binder:BindSimpleBinaryOperator(Microsoft.CodeAnalysis.CSharp.Syntax.BinaryExpressionSyntax,Microsoft.CodeAnalysis.DiagnosticBag):Microsoft.CodeAnalysis.CSharp.BoundExpression:this 52 (16.20% of base) : 19202.dasm - d__36:MoveNext():bool:this 50 ( 3.15% of base) : 19210.dasm - d__23[__Canon,__Canon,__Canon,__Canon,__Canon,__Canon,__Canon,__Canon,__Canon][System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon]:MoveNext():bool:this 48 ( 0.66% of base) : 25971.dasm - DynamicClass:_DynamicMethod1(System.IO.TextReader,int):MicroBenchmarks.Serializers.CollectionsOfPrimitives 46 ( 6.05% of base) : 18490.dasm - Microsoft.CodeAnalysis.CSharp.ClsComplianceChecker:VisitAssembly(Microsoft.CodeAnalysis.CSharp.Symbols.AssemblySymbol):this 46 ( 8.44% of base) : 19131.dasm - d__200:MoveNext():bool:this 36 ( 1.84% of base) : 17812.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceNamespaceSymbol:MakeNameToMembersMap(Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Generic.Dictionary`2[[System.String, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.NamespaceOrTypeSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]], System.Collections.Immutable, Version=6.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]]:this 36 ( 4.20% of base) : 10409.dasm - System.Net.Security.SslStream:FillHandshakeBufferAsync(System.Net.Security.AsyncReadWriteAdapter,int):System.Threading.Tasks.ValueTask`1[Int32]:this 36 (17.48% of base) : 15922.dasm - System.Number:DecimalToNumber(byref,byref) 35 (11.18% of base) : 17997.dasm - d__126:MoveNext():bool:this 34 ( 4.82% of base) : 14621.dasm - BenchmarksGame.KNucleotide_1:Bench(System.IO.Stream,BenchmarksGame.NucleotideHelpers,bool):bool Top method improvements (bytes): -557 (-19.94% of base) : 18190.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceNamedTypeSymbol:MakeDeclaredBases(Roslyn.Utilities.ConsList`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.DiagnosticBag):System.Tuple`2[[Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]], System.Collections.Immutable, Version=6.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]]:this -525 (-12.80% of base) : 18412.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:ComputeInterfaceImplementations(Microsoft.CodeAnalysis.DiagnosticBag,System.Threading.CancellationToken):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.SynthesizedExplicitImplementationForwardingMethod, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this -260 (-19.89% of base) : 18400.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.OverriddenOrHiddenMembersHelpers:FindOverriddenOrHiddenMembersInType(Microsoft.CodeAnalysis.CSharp.Symbol,bool,Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol,Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol,byref,byref,byref) -202 (-16.67% of base) : 17991.dasm - Microsoft.CodeAnalysis.CSharp.CSharpCompilation:GetRuntimeMember(Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol,byref,Microsoft.CodeAnalysis.RuntimeMembers.SignatureComparer`5[[Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[Microsoft.CodeAnalysis.CSharp.Symbols.FieldSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[Microsoft.CodeAnalysis.CSharp.Symbols.PropertySymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[Microsoft.CodeAnalysis.CSharp.Symbols.TypeSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.CSharp.Symbols.AssemblySymbol):Microsoft.CodeAnalysis.CSharp.Symbol -201 (-18.91% of base) : 20540.dasm - Microsoft.CodeAnalysis.CSharp.SyntaxTreeSemanticModel:GetDeclaredMember(Microsoft.CodeAnalysis.CSharp.Symbols.NamespaceOrTypeSymbol,Microsoft.CodeAnalysis.Text.TextSpan,System.String):Microsoft.CodeAnalysis.CSharp.Symbol:this -200 (-9.96% of base) : 18417.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:CheckMemberNameConflicts(Microsoft.CodeAnalysis.DiagnosticBag):this -159 (-7.87% of base) : 17736.dasm - Microsoft.CodeAnalysis.CSharp.CSharpCompilation:GetDiagnostics(int,bool,Microsoft.CodeAnalysis.DiagnosticBag,System.Threading.CancellationToken):this -128 (-7.15% of base) : 18326.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:MakeTypeMembers(Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Generic.Dictionary`2[[System.String, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]], System.Collections.Immutable, Version=6.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]]:this -90 (-4.71% of base) : 18182.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceNamedTypeSymbol:PostDecodeWellKnownAttributes(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.CSharpAttributeData, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Syntax.AttributeSyntax, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.DiagnosticBag,short,Microsoft.CodeAnalysis.WellKnownAttributeData):this -67 (-3.94% of base) : 18177.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:ForceComplete(Microsoft.CodeAnalysis.SourceLocation,System.Threading.CancellationToken):this -59 (-5.13% of base) : 18359.dasm - Microsoft.CodeAnalysis.CSharp.Binder:ValidateParameterNameConflicts(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.DiagnosticBag):this -53 (-8.02% of base) : 20672.dasm - Microsoft.CodeAnalysis.CSharp.Imports:LookupSymbolInUsings(System.Collections.Immutable.ImmutableArray`1[NamespaceOrTypeAndUsingDirective],Microsoft.CodeAnalysis.CSharp.Binder,Microsoft.CodeAnalysis.CSharp.LookupResult,System.String,int,Roslyn.Utilities.ConsList`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],int,bool,byref) -53 (-10.77% of base) : 18280.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.NamespaceOrTypeSymbol:GetSourceTypeMember(System.String,int,ushort,Microsoft.CodeAnalysis.CSharp.CSharpSyntaxNode):Microsoft.CodeAnalysis.CSharp.Symbols.SourceNamedTypeSymbol:this -45 (-8.08% of base) : 18575.dasm - Microsoft.CodeAnalysis.CSharp.ExecutableCodeBinder:ValidateIteratorMethods(Microsoft.CodeAnalysis.DiagnosticBag):this -42 (-7.89% of base) : 16930.dasm - V8.Crypto.BigInteger:fromByteArray(System.Byte[]):this -40 (-4.77% of base) : 17935.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.NamespaceOrTypeSymbol:LookupMetadataType(byref):Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol:this -38 (-2.11% of base) : 17814.dasm - Microsoft.CodeAnalysis.CSharp.MergedNamespaceDeclaration:MakeChildren():System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.MergedNamespaceOrTypeDeclaration, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this -38 (-4.87% of base) : 18458.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberMethodSymbol:ForceComplete(Microsoft.CodeAnalysis.SourceLocation,System.Threading.CancellationToken):this -36 (-3.07% of base) : 18222.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:MakeAllMembers(Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Generic.Dictionary`2[[System.String, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]], System.Collections.Immutable, Version=6.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]]:this -36 (-6.75% of base) : 18440.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:CheckTypeParameterNameConflicts(Microsoft.CodeAnalysis.DiagnosticBag):this Top method regressions (percentages): 17 (36.17% of base) : 22943.dasm - System.Collections.IndexerSetReverse`1[Int32][System.Int32]:Array():System.Int32[]:this 17 (36.17% of base) : 24496.dasm - System.Collections.IndexerSetReverse`1[__Canon][System.__Canon]:Array():System.__Canon[]:this 18 (27.27% of base) : 13842.dasm - System.Collections.IterateFor`1[__Canon][System.__Canon]:Get(System.Collections.Generic.IList`1[__Canon]):System.__Canon:this 15 (26.79% of base) : 13055.dasm - System.Number:UInt32ToDecChars(long,int,int):long 15 (25.86% of base) : 975.dasm - System.Number:UInt32ToDecChars(long,int,int):long 16 (25.81% of base) : 20092.dasm - System.Collections.IndexerSet`1[__Canon][System.__Canon]:Set(System.Collections.Generic.IList`1[__Canon]):System.Collections.Generic.IList`1[__Canon]:this 22 (25.58% of base) : 9613.dasm - System.Xml.XmlLoader:LoadDocSequence(System.Xml.XmlDocument):this 18 (25.35% of base) : 11926.dasm - EMFloatClass:normalize(InternalFPF) 14 (22.95% of base) : 25097.dasm - V8.Richards.Packet:addTo(V8.Richards.Packet):V8.Richards.Packet:this 16 (21.92% of base) : 4954.dasm - System.IO.Compression.DeflateStream:WriteDeflaterOutput():this 10 (20.41% of base) : 27331.dasm - System.Collections.IterateFor`1[Int32][System.Int32]:ImmutableList():int:this 10 (20.41% of base) : 9944.dasm - System.Collections.IterateFor`1[Int32][System.Int32]:ImmutableSortedSet():int:this 76 (20.05% of base) : 4626.dasm - System.Collections.Generic.PriorityQueue`2[__Canon,__Canon][System.__Canon,System.__Canon]:MoveDownCustomComparer(System.ValueTuple`2[__Canon,__Canon],int):this 10 (20.00% of base) : 14834.dasm - System.Collections.IterateFor`1[__Canon][System.__Canon]:ImmutableList():System.__Canon:this 10 (20.00% of base) : 15485.dasm - System.Collections.IterateFor`1[__Canon][System.__Canon]:ImmutableSortedSet():System.__Canon:this 24 (19.05% of base) : 8484.dasm - System.Collections.IterateForEach`1[__Canon][System.__Canon]:ImmutableArray():System.__Canon:this 17 (18.09% of base) : 11992.dasm - Newtonsoft.Json.Utilities.DateTimeUtils:CopyIntToCharArray(System.Char[],int,int,int) 9 (18.00% of base) : 19831.dasm - System.Reflection.Metadata.Ecma335.MetadataSizes:CalculateTableStreamHeaderSize():int:this 16 (17.98% of base) : 4541.dasm - System.Collections.IndexerSetReverse`1[__Canon][System.__Canon]:Span():System.__Canon:this 36 (17.48% of base) : 15922.dasm - System.Number:DecimalToNumber(byref,byref) Top method improvements (percentages): -19 (-28.79% of base) : 18465.dasm - Microsoft.CodeAnalysis.CSharp.MergedTypeDeclaration:get_AnyMemberHasAttributes():bool:this -557 (-19.94% of base) : 18190.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceNamedTypeSymbol:MakeDeclaredBases(Roslyn.Utilities.ConsList`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.DiagnosticBag):System.Tuple`2[[Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]], System.Collections.Immutable, Version=6.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]]:this -260 (-19.89% of base) : 18400.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.OverriddenOrHiddenMembersHelpers:FindOverriddenOrHiddenMembersInType(Microsoft.CodeAnalysis.CSharp.Symbol,bool,Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol,Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol,byref,byref,byref) -30 (-19.35% of base) : 19588.dasm - Microsoft.Cci.MetadataWriter:MayUseSmallExceptionHeaders(System.Collections.Immutable.ImmutableArray`1[[Microsoft.Cci.ExceptionHandlerRegion, Microsoft.CodeAnalysis, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):bool -201 (-18.91% of base) : 20540.dasm - Microsoft.CodeAnalysis.CSharp.SyntaxTreeSemanticModel:GetDeclaredMember(Microsoft.CodeAnalysis.CSharp.Symbols.NamespaceOrTypeSymbol,Microsoft.CodeAnalysis.Text.TextSpan,System.String):Microsoft.CodeAnalysis.CSharp.Symbol:this -11 (-16.92% of base) : 18912.dasm - Microsoft.CodeAnalysis.CSharp.CodeGen.CodeGenerator:EmitStatements(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.BoundStatement, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -11 (-16.92% of base) : 18800.dasm - Microsoft.CodeAnalysis.CSharp.DataFlowPass:DeclareVariables(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.LocalSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -11 (-16.92% of base) : 18812.dasm - Microsoft.CodeAnalysis.CSharp.DataFlowPass:ReportUnusedVariables(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.LocalFunctionSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -202 (-16.67% of base) : 17991.dasm - Microsoft.CodeAnalysis.CSharp.CSharpCompilation:GetRuntimeMember(Microsoft.CodeAnalysis.CSharp.Symbols.NamedTypeSymbol,byref,Microsoft.CodeAnalysis.RuntimeMembers.SignatureComparer`5[[Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[Microsoft.CodeAnalysis.CSharp.Symbols.FieldSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[Microsoft.CodeAnalysis.CSharp.Symbols.PropertySymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[Microsoft.CodeAnalysis.CSharp.Symbols.TypeSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.CSharp.Symbols.AssemblySymbol):Microsoft.CodeAnalysis.CSharp.Symbol -11 (-16.18% of base) : 20764.dasm - Microsoft.CodeAnalysis.CSharp.AbstractRegionDataFlowPass:MakeSlots(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -11 (-15.94% of base) : 19532.dasm - Microsoft.Cci.MetadataVisitor:Visit(System.Collections.Immutable.ImmutableArray`1[[Microsoft.Cci.IParameterDefinition, Microsoft.CodeAnalysis, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -11 (-15.71% of base) : 19548.dasm - Microsoft.Cci.MetadataVisitor:Visit(System.Collections.Immutable.ImmutableArray`1[[Microsoft.Cci.ILocalDefinition, Microsoft.CodeAnalysis, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -11 (-15.71% of base) : 19553.dasm - Microsoft.Cci.MetadataVisitor:Visit(System.Collections.Immutable.ImmutableArray`1[[Microsoft.Cci.ExceptionHandlerRegion, Microsoft.CodeAnalysis, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -11 (-15.71% of base) : 19457.dasm - Microsoft.Cci.MetadataVisitor:Visit(System.Collections.Immutable.ImmutableArray`1[[Microsoft.Cci.IParameterTypeInformation, Microsoft.CodeAnalysis, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -11 (-15.71% of base) : 19459.dasm - Microsoft.Cci.MetadataVisitor:Visit(System.Collections.Immutable.ImmutableArray`1[[Microsoft.Cci.ICustomModifier, Microsoft.CodeAnalysis, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -11 (-15.49% of base) : 18811.dasm - Microsoft.CodeAnalysis.CSharp.DataFlowPass:ReportUnusedVariables(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.LocalSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -11 (-15.07% of base) : 18797.dasm - Microsoft.CodeAnalysis.CSharp.DataFlowPass:EnterParameters(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -11 (-15.07% of base) : 20750.dasm - Microsoft.CodeAnalysis.CSharp.PreciseAbstractFlowPass`1[LocalState][Microsoft.CodeAnalysis.CSharp.DataFlowPass+LocalState]:VisitMultipleLocalDeclarations(Microsoft.CodeAnalysis.CSharp.BoundMultipleLocalDeclarations):Microsoft.CodeAnalysis.CSharp.BoundNode:this -11 (-15.07% of base) : 18768.dasm - Microsoft.CodeAnalysis.CSharp.PreciseAbstractFlowPass`1[LocalState][Microsoft.CodeAnalysis.CSharp.ControlFlowPass+LocalState]:VisitStatements(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.BoundStatement, Microsoft.CodeAnalysis.CSharp, Version=2.10.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):this -12 (-13.64% of base) : 18415.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:CheckMemberNamesDistinctFromType(Microsoft.CodeAnalysis.DiagnosticBag):this 376 total methods with Code Size differences (153 improved, 223 regressed), 5 unchanged. ```
-------------------------------------------------------------------------------- Additionally, I removed some dead "optimization enabling" code. --- src/coreclr/jit/compiler.h | 12 +- src/coreclr/jit/compiler.hpp | 21 --- src/coreclr/jit/jitconfigvalues.h | 1 + src/coreclr/jit/optimizer.cpp | 281 ++++++++++++++++++------------ 4 files changed, 182 insertions(+), 133 deletions(-) diff --git a/src/coreclr/jit/compiler.h b/src/coreclr/jit/compiler.h index 26d98aa5c6af1..953a390b0a669 100644 --- a/src/coreclr/jit/compiler.h +++ b/src/coreclr/jit/compiler.h @@ -6545,6 +6545,15 @@ class Compiler } } + // Struct used in optInvertWhileLoop to count interesting constructs to boost the profitability score. + struct OptInvertCountTreeInfoType + { + int sharedStaticHelperCount; + int arrayLengthCount; + }; + + static fgWalkResult optInvertCountTreeInfo(GenTree** pTree, fgWalkData* data); + void optInvertWhileLoop(BasicBlock* block); bool optComputeLoopRep(int constInit, @@ -6573,9 +6582,6 @@ class Compiler * Optimization conditions *************************************************************************/ - bool optFastCodeOrBlendedLoop(BasicBlock::weight_t bbWeight); - bool optPentium4(void); - bool optAvoidIncDec(BasicBlock::weight_t bbWeight); bool optAvoidIntMult(void); protected: diff --git a/src/coreclr/jit/compiler.hpp b/src/coreclr/jit/compiler.hpp index c70bcf253dca7..50a39a4d071f7 100644 --- a/src/coreclr/jit/compiler.hpp +++ b/src/coreclr/jit/compiler.hpp @@ -3614,27 +3614,6 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX */ -// are we compiling for fast code, or are we compiling for blended code and -// inside a loop? -// We return true for BLENDED_CODE if the Block executes more than BB_LOOP_WEIGHT_SCALE/2 -inline bool Compiler::optFastCodeOrBlendedLoop(BasicBlock::weight_t bbWeight) -{ - return (compCodeOpt() == FAST_CODE) || - ((compCodeOpt() == BLENDED_CODE) && (bbWeight > ((BB_LOOP_WEIGHT_SCALE / 2) * BB_UNITY_WEIGHT))); -} - -// are we running on a Intel Pentium 4? -inline bool Compiler::optPentium4(void) -{ - return (info.genCPU == CPU_X86_PENTIUM_4); -} - -// should we use add/sub instead of inc/dec? (faster on P4, but increases size) -inline bool Compiler::optAvoidIncDec(BasicBlock::weight_t bbWeight) -{ - return optPentium4() && optFastCodeOrBlendedLoop(bbWeight); -} - // should we try to replace integer multiplication with lea/add/shift sequences? inline bool Compiler::optAvoidIntMult(void) { diff --git a/src/coreclr/jit/jitconfigvalues.h b/src/coreclr/jit/jitconfigvalues.h index 055750272e335..37975bdba5b5a 100644 --- a/src/coreclr/jit/jitconfigvalues.h +++ b/src/coreclr/jit/jitconfigvalues.h @@ -372,6 +372,7 @@ CONFIG_INTEGER(JitDoAssertionProp, W("JitDoAssertionProp"), 1) // Perform assert CONFIG_INTEGER(JitDoCopyProp, W("JitDoCopyProp"), 1) // Perform copy propagation on variables that appear redundant CONFIG_INTEGER(JitDoEarlyProp, W("JitDoEarlyProp"), 1) // Perform Early Value Propagation CONFIG_INTEGER(JitDoLoopHoisting, W("JitDoLoopHoisting"), 1) // Perform loop hoisting on loop invariant values +CONFIG_INTEGER(JitDoLoopInversion, W("JitDoLoopInversion"), 1) // Perform loop inversion on "for/while" loops CONFIG_INTEGER(JitDoRangeAnalysis, W("JitDoRangeAnalysis"), 1) // Perform range check analysis CONFIG_INTEGER(JitDoRedundantBranchOpts, W("JitDoRedundantBranchOpts"), 1) // Perform redundant branch optimizations CONFIG_INTEGER(JitDoSsa, W("JitDoSsa"), 1) // Perform Static Single Assignment (SSA) numbering on the variables diff --git a/src/coreclr/jit/optimizer.cpp b/src/coreclr/jit/optimizer.cpp index 4648a7ed2d40a..064fbdc21e80c 100644 --- a/src/coreclr/jit/optimizer.cpp +++ b/src/coreclr/jit/optimizer.cpp @@ -4037,58 +4037,78 @@ bool Compiler::optReachWithoutCall(BasicBlock* topBB, BasicBlock* botBB) return true; } +// static +Compiler::fgWalkResult Compiler::optInvertCountTreeInfo(GenTree** pTree, fgWalkData* data) +{ + OptInvertCountTreeInfoType* o = (OptInvertCountTreeInfoType*)data->pCallbackData; + + if (Compiler::IsSharedStaticHelper(*pTree)) + { + o->sharedStaticHelperCount += 1; + } + + if ((*pTree)->OperGet() == GT_ARR_LENGTH) + { + o->arrayLengthCount += 1; + } + + return WALK_CONTINUE; +} + //----------------------------------------------------------------------------- // optInvertWhileLoop: modify flow and duplicate code so that for/while loops are // entered at top and tested at bottom (aka loop rotation or bottom testing). +// Creates a "zero trip test" condition which guards entry to the loop. +// Enables loop invariant hoisting and loop cloning, which depend on +// `do {} while` format loops. Enables creation of a pre-header block after the +// zero trip test to place code that only runs if the loop is guaranteed to +// run at least once. // // Arguments: // block -- block that may be the predecessor of the un-rotated loop's test block. // // Notes: -// Optimizes "jmp C; do{} C:while(cond);" loops to "if (cond){ do{}while(cond}; }" -// Does not modify every loop +// Uses a simple lexical screen to detect likely loops. +// +// Specifically, we're looking for the following case: +// +// ... +// jmp test // `block` argument +// loop: +// ... +// ... +// test: +// ..stmts.. +// cond +// jtrue loop +// +// If we find this, and the condition is simple enough, we change +// the loop to the following: +// +// ... +// ..stmts.. // duplicated cond block statments +// cond // duplicated cond +// jfalse done +// // else fall-through +// loop: +// ... +// ... +// test: +// ..stmts.. +// cond +// jtrue loop +// done: // // Makes no changes if the flow pattern match fails. // // May not modify a loop if profile is unfavorable, if the cost of duplicating -// code is large (factoring in potential CSEs) +// code is large (factoring in potential CSEs). // void Compiler::optInvertWhileLoop(BasicBlock* block) { assert(opts.OptimizationEnabled()); assert(compCodeOpt() != SMALL_CODE); - /* - Optimize while loops into do { } while loop - Our loop hoisting logic requires do { } while loops. - Specifically, we're looking for the following case: - - ... - jmp test - loop: - ... - ... - test: - cond - jtrue loop - - If we find this, and the condition is simple enough, we change - the loop to the following: - - ... - cond - jfalse done - // else fall-through - loop: - ... - ... - test: - cond - jtrue loop - done: - - */ - /* Does the BB end with an unconditional jump? */ if (block->bbJumpKind != BBJ_ALWAYS || (block->bbFlags & BBF_KEEP_BBJ_ALWAYS)) @@ -4137,49 +4157,47 @@ void Compiler::optInvertWhileLoop(BasicBlock* block) return; } - // Find the loop termination test at the bottom of the loop. - Statement* condStmt = bTest->lastStmt(); - - // bTest must only contain only a jtrue with no other stmts, we will only clone - // the conditional, so any other statements will not get cloned - // TODO-CQ: consider cloning the whole bTest block and inserting it after block. + // It has to be a forward jump. Defer this check until after all the cheap checks + // are done, since it iterates forward in the block list looking for bbJumpDest. + // TODO-CQ: Check if we can also optimize the backwards jump as well. // - if (bTest->bbStmtList != condStmt) + if (!fgIsForwardBranch(block)) { return; } - /* Get to the condition node from the statement tree */ + // Find the loop termination test at the bottom of the loop. + Statement* condStmt = bTest->lastStmt(); - GenTree* condTree = condStmt->GetRootNode(); + // Verify the test block ends with a conditional that we can manipulate. + GenTree* const condTree = condStmt->GetRootNode(); noway_assert(condTree->gtOper == GT_JTRUE); - - condTree = condTree->AsOp()->gtOp1; - - // The condTree has to be a RelOp comparison - // TODO-CQ: Check if we can also optimize the backwards jump as well. - // - if (condTree->OperIsCompare() == false) + if (!condTree->AsOp()->gtOp1->OperIsCompare()) { return; } - // It has to be a forward jump. Defer this check until after all the cheap checks - // are done, since it iterates forward in the block list looking for bbJumpDest. - // TODO-CQ: Check if we can also optimize the backwards jump as well. + // Estimate the cost of cloning the entire test block. // - if (fgIsForwardBranch(block) == false) - { - return; - } - - /* We call gtPrepareCost to measure the cost of duplicating this tree */ + // Note: it would help throughput to compute the maximum cost + // first and early out for large bTest blocks, as we are doing two + // tree walks per tree. But because of this helper call scan, the + // maximum cost depends on the trees in the block. + // + // We might consider flagging blocks with hoistable helper calls + // during importation, so we can avoid the helper search and + // implement an early bail out for large blocks with no helper calls. - gtPrepareCost(condTree); - unsigned estDupCostSz = condTree->GetCostSz(); + unsigned estDupCostSz = 0; - BasicBlock::weight_t loopIterations = BB_LOOP_WEIGHT_SCALE; + for (Statement* stmt : bTest->Statements()) + { + GenTree* tree = stmt->GetRootNode(); + gtPrepareCost(tree); + estDupCostSz += tree->GetCostSz(); + } + BasicBlock::weight_t loopIterations = BB_LOOP_WEIGHT_SCALE; bool allProfileWeightsAreValid = false; BasicBlock::weight_t const weightBlock = block->bbWeight; BasicBlock::weight_t const weightTest = bTest->bbWeight; @@ -4228,10 +4246,8 @@ void Compiler::optInvertWhileLoop(BasicBlock* block) } } - unsigned maxDupCostSz = 32; + unsigned maxDupCostSz = 34; - // optFastCodeOrBlendedLoop(bTest->bbWeight) does not work here as we have not - // set loop weights yet if ((compCodeOpt() == FAST_CODE) || compStressCompile(STRESS_DO_WHILE_LOOPS, 30)) { maxDupCostSz *= 4; @@ -4247,40 +4263,65 @@ void Compiler::optInvertWhileLoop(BasicBlock* block) } } - // If the compare has too high cost then we don't want to dup + // If the compare has too high cost then we don't want to dup. bool costIsTooHigh = (estDupCostSz > maxDupCostSz); - int countOfHelpers = 0; + OptInvertCountTreeInfoType optInvertTotalInfo = {}; if (costIsTooHigh) { // If we already know that the cost is acceptable, then don't waste time walking the tree - // counting shared static helpers. + // counting things to boost the maximum allowed cost. // // If the loop condition has a shared static helper, we really want this loop converted // as not converting the loop will disable loop hoisting, meaning the shared helper will // be executed on every loop iteration. - fgWalkTreePre(&condTree, CountSharedStaticHelper, &countOfHelpers); + // + // If the condition has array.Length operations, also boost, as they are likely to be CSE'd. - if (countOfHelpers > 0) + for (Statement* stmt : bTest->Statements()) { - maxDupCostSz += 24 * min(countOfHelpers, (int)(loopIterations + 1.5)); + GenTree* tree = stmt->GetRootNode(); + + OptInvertCountTreeInfoType optInvertInfo = {}; + fgWalkTreePre(&tree, Compiler::optInvertCountTreeInfo, &optInvertInfo); + optInvertTotalInfo.sharedStaticHelperCount += optInvertInfo.sharedStaticHelperCount; + optInvertTotalInfo.arrayLengthCount += optInvertInfo.arrayLengthCount; - // Is the cost too high now? - costIsTooHigh = (estDupCostSz > maxDupCostSz); + if ((optInvertInfo.sharedStaticHelperCount > 0) || (optInvertTotalInfo.arrayLengthCount > 0)) + { + // Calculate a new maximum cost. We might be able to early exit. + + unsigned newMaxDupCostSz = + maxDupCostSz + 24 * min(optInvertTotalInfo.sharedStaticHelperCount, (int)(loopIterations + 1.5)) + + 8 * optInvertTotalInfo.arrayLengthCount; + + // Is the cost too high now? + costIsTooHigh = (estDupCostSz > newMaxDupCostSz); + if (!costIsTooHigh) + { + // No need counting any more trees; we're going to do the transformation. + JITDUMP("Decided to duplicate loop condition block after counting helpers in tree [%06u] in " + "block " FMT_BB, + dspTreeID(tree), bTest->bbNum); + maxDupCostSz = newMaxDupCostSz; // for the JitDump output below + break; + } + } } } #ifdef DEBUG if (verbose) { - // Note that `countOfHelpers = 0` means either there were zero helpers, or the tree walk to count them was not - // done. - printf("\nDuplication of loop condition [%06u] is %s, because the cost of duplication (%i) is %s than %i," - "\n loopIterations = %7.3f, countOfHelpers = %d, validProfileWeights = %s\n", - dspTreeID(condTree), costIsTooHigh ? "not done" : "performed", estDupCostSz, - costIsTooHigh ? "greater" : "less or equal", maxDupCostSz, loopIterations, countOfHelpers, - dspBool(allProfileWeightsAreValid)); + // Note that `optInvertTotalInfo.sharedStaticHelperCount = 0` means either there were zero helpers, or the + // tree walk to count them was not done. + printf( + "\nDuplication of loop condition [%06u] is %s, because the cost of duplication (%i) is %s than %i," + "\n loopIterations = %7.3f, optInvertTotalInfo.sharedStaticHelperCount >= %d, validProfileWeights = %s\n", + dspTreeID(condTree), costIsTooHigh ? "not done" : "performed", estDupCostSz, + costIsTooHigh ? "greater" : "less or equal", maxDupCostSz, loopIterations, + optInvertTotalInfo.sharedStaticHelperCount, dspBool(allProfileWeightsAreValid)); } #endif @@ -4289,34 +4330,47 @@ void Compiler::optInvertWhileLoop(BasicBlock* block) return; } - /* Looks good - duplicate the condition test */ + bool foundCondTree = false; - condTree->gtFlags |= GTF_RELOP_ZTT; + // Clone each statement in bTest and append to block. + for (Statement* stmt : bTest->Statements()) + { + GenTree* originalTree = stmt->GetRootNode(); + GenTree* clonedTree = gtCloneExpr(originalTree); - condTree = gtCloneExpr(condTree); - gtReverseCond(condTree); + // Special case handling needed for the conditional jump tree + if (originalTree == condTree) + { + foundCondTree = true; - // Make sure clone expr copied the flag - assert(condTree->gtFlags & GTF_RELOP_ZTT); + // Get the compare subtrees + GenTree* originalCompareTree = originalTree->AsOp()->gtOp1; + GenTree* clonedCompareTree = clonedTree->AsOp()->gtOp1; + assert(originalCompareTree->OperIsCompare()); + assert(clonedCompareTree->OperIsCompare()); - condTree = gtNewOperNode(GT_JTRUE, TYP_VOID, condTree); + // Flag compare and cloned copy so later we know this loop + // has a proper zero trip test. + originalCompareTree->gtFlags |= GTF_RELOP_ZTT; + clonedCompareTree->gtFlags |= GTF_RELOP_ZTT; - /* Create a statement entry out of the condition and - append the condition test at the end of 'block' */ + // The original test branches to remain in the loop. The + // new cloned test will branch to avoid the loop. So the + // cloned compare needs to reverse the branch condition. + gtReverseCond(clonedCompareTree); + } - Statement* copyOfCondStmt = fgNewStmtAtEnd(block, condTree); + Statement* clonedStmt = fgNewStmtAtEnd(block, clonedTree); - copyOfCondStmt->SetCompilerAdded(); + if (opts.compDbgInfo) + { + clonedStmt->SetILOffsetX(stmt->GetILOffsetX()); + } - if (condTree->gtFlags & GTF_CALL) - { - block->bbFlags |= BBF_HAS_CALL; // Record that the block has a call + clonedStmt->SetCompilerAdded(); } - if (opts.compDbgInfo) - { - copyOfCondStmt->SetILOffsetX(condStmt->GetILOffsetX()); - } + assert(foundCondTree); // Flag the block that received the copy as potentially having an array/vtable // reference, nullcheck, object/array allocation if the block copied from did; @@ -4341,17 +4395,6 @@ void Compiler::optInvertWhileLoop(BasicBlock* block) fgRemoveRefPred(bTest, block); fgAddRefPred(bTest->bbNext, block); -#ifdef DEBUG - if (verbose) - { - printf("\nDuplicated loop condition in " FMT_BB " for loop (" FMT_BB " - " FMT_BB ")\n", block->bbNum, - block->bbNext->bbNum, bTest->bbNum); - printf("Estimated code size expansion is %d\n", estDupCostSz); - - gtDispStmt(copyOfCondStmt); - } -#endif // DEBUG - // If we have profile data for all blocks and we know that we are cloning the // bTest block into block and thus changing the control flow from block so // that it no longer goes directly to bTest anymore, we have to adjust @@ -4420,6 +4463,18 @@ void Compiler::optInvertWhileLoop(BasicBlock* block) fgDebugCheckIncomingProfileData(block->bbJumpDest); #endif // DEBUG } + +#ifdef DEBUG + if (verbose) + { + printf("\nDuplicated loop exit block at " FMT_BB " for loop (" FMT_BB " - " FMT_BB ")\n", block->bbNum, + block->bbNext->bbNum, bTest->bbNum); + printf("Estimated code size expansion is %d\n", estDupCostSz); + + fgDumpBlock(block); + fgDumpBlock(bTest); + } +#endif // DEBUG } //----------------------------------------------------------------------------- @@ -4433,6 +4488,14 @@ PhaseStatus Compiler::optInvertLoops() noway_assert(opts.OptimizationEnabled()); noway_assert(fgModified == false); +#if defined(OPT_CONFIG) + if (!JitConfig.JitDoLoopInversion()) + { + JITDUMP("Loop inversion disabled\n"); + return PhaseStatus::MODIFIED_NOTHING; + } +#endif // OPT_CONFIG + if (compCodeOpt() == SMALL_CODE) { return PhaseStatus::MODIFIED_NOTHING;