[arm64] JIT: Fold "A * B + C" to MADD/MSUB #61037

Merged on Nov 2, 2021 (9 commits)

EgorBo (Member) commented Oct 30, 2021

Closes #49283

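This PR folds "A*B+C" into a single instruction for integers on arm64. It could be extended to floats as well (on both x86 and arm) if we introduce some kind of "unsafe math" mode, since a fused multiply-add rounds only once and can therefore give results that differ from a separate multiply and add.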

static int Test1(int a, int b, int c) => a * b + c;
static int Test2(int a, int b, int c) => c + a * -b;

Codegen diff:

; Method Tests:Test1(int,int,int):int
G_M46617_IG01:
            stp     fp, lr, [sp,#-16]!
            mov     fp, sp
G_M46617_IG02:
-           mul     w0, w0, w1
-           add     w0, w0, w2
+           madd    w0, w0, w1, w2
G_M46617_IG03:
            ldp     fp, lr, [sp],#16
            ret     lr
-; Total bytes of code: 24
+; Total bytes of code: 20


; Method Tests:Test2(int,int,int):int
G_M50938_IG01:
            stp     fp, lr, [sp,#-16]!
            mov     fp, sp
G_M50938_IG02:
-           neg     w1, w1
-           mul     w0, w1, w0
-           add     w0, w2, w0
+           msub    w0, w1, w0, w2
G_M50938_IG03:
            ldp     fp, lr, [sp],#16
            ret     lr
-; Total bytes of code: 28
+; Total bytes of code: 20
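
The two diffs above rest on simple identities in wrapping 32-bit arithmetic. The sketch below models `madd` and `msub` as plain C++ functions and compares them with the unfused `mul`/`add` and `neg`/`mul`/`add` sequences. The function names (`madd32`, `msub32`, `test1_unfused`, `test2_unfused`) are made up for illustration and are not part of the JIT; unsigned types are used only so that wraparound is well defined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Models of the 32-bit arm64 instructions (results wrap modulo 2^32):
//   madd Wd, Wn, Wm, Wa  =>  Wd = Wa + Wn * Wm
//   msub Wd, Wn, Wm, Wa  =>  Wd = Wa - Wn * Wm
uint32_t madd32(uint32_t n, uint32_t m, uint32_t a) { return a + n * m; }
uint32_t msub32(uint32_t n, uint32_t m, uint32_t a) { return a - n * m; }

// "Before" sequence for Test1: mul, then add.
uint32_t test1_unfused(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t t = a * b; // mul w0, w0, w1
    return t + c;       // add w0, w0, w2
}

// "Before" sequence for Test2: neg, mul, then add.
uint32_t test2_unfused(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t nb = 0u - b; // neg w1, w1
    uint32_t t  = nb * a; // mul w0, w1, w0
    return c + t;         // add w0, w2, w0
}
```

For any inputs, `test1_unfused(a, b, c)` should equal `madd32(a, b, c)`, and `test2_unfused(a, b, c)` should equal `msub32(b, a, c)`, which matches the operand order in the `msub w0, w1, w0, w2` line of the diff.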

coreclr_tests.pmi.Linux.arm64.checked.mch:


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 165435344 (overridden on cmd)
Total bytes of diff: 165432380 (overridden on cmd)
Total bytes of delta: -2964 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.
Detail diffs


Top file improvements (bytes):
        -732 : 168496.dasm (-2.84% of base)
        -728 : 198996.dasm (-2.85% of base)
         -88 : 247784.dasm (-1.74% of base)
         -76 : 242268.dasm (-4.46% of base)
         -48 : 222397.dasm (-0.28% of base)
         -28 : 224669.dasm (-0.47% of base)
         -28 : 217066.dasm (-7.78% of base)
         -24 : 84725.dasm (-17.65% of base)
         -24 : 252849.dasm (-3.33% of base)
         -24 : 247731.dasm (-1.29% of base)
         -20 : 242310.dasm (-10.00% of base)
         -20 : 228486.dasm (-0.35% of base)
         -20 : 242313.dasm (-6.41% of base)
         -16 : 233197.dasm (-4.21% of base)
         -16 : 248453.dasm (-1.34% of base)
         -16 : 248454.dasm (-1.34% of base)
         -16 : 251801.dasm (-1.39% of base)
         -16 : 233196.dasm (-4.21% of base)
         -16 : 233198.dasm (-4.21% of base)
         -16 : 81975.dasm (-14.29% of base)

235 total files with Code Size differences (235 improved, 0 regressed), 5 unchanged.

Top method improvements (bytes):
        -732 (-2.84% of base) : 168496.dasm - CseTest.Test_Main:Main():int
        -728 (-2.85% of base) : 198996.dasm - CseTest.Test_Main:Main():int
         -88 (-1.74% of base) : 247784.dasm - Benchstone.BenchI.MulMatrix:Inner(System.Int32[][],System.Int32[][],System.Int32[][])
         -76 (-4.46% of base) : 242268.dasm - Benchstone.BenchF.MatInv4:MinV2(System.Single[],byref,byref,System.Single[],System.Single[])
         -48 (-0.28% of base) : 222397.dasm - CseTest.Test_Main:Main():int
         -28 (-0.47% of base) : 224669.dasm - CseTest.Test_Main:Main():int
         -28 (-7.78% of base) : 217066.dasm - Program:Calc(byref,byref,byref)
         -24 (-1.29% of base) : 247731.dasm - Benchstone.MDBenchI.MDMulMatrix:Inner(System.Int32[,],System.Int32[,],System.Int32[,])
         -24 (-3.33% of base) : 252849.dasm - CCSE:Main():int
         -24 (-17.65% of base) : 84725.dasm - Program:Main(System.String[]):int
         -20 (-0.35% of base) : 228486.dasm - CseTest.Test_Main:Main():int
         -20 (-10.00% of base) : 242310.dasm - N.C:FallThroughExit(int,int,int,int):bool
         -20 (-6.41% of base) : 242313.dasm - N.C:InnerInfiniteLoop(int,int,int,int):bool
         -16 (-1.39% of base) : 251801.dasm - MatrixMul.Test:MatrixMul()
         -16 (-1.34% of base) : 248453.dasm - SimpleArray_01.Test:BadMatrixMul1()
         -16 (-1.34% of base) : 248454.dasm - SimpleArray_01.Test:BadMatrixMul2()
         -16 (-14.29% of base) : 81975.dasm - SP1d:Foo(int,int,int,int,int,S):int
         -16 (-4.21% of base) : 233196.dasm - Test33objref:f2(ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl):long
         -16 (-4.21% of base) : 233197.dasm - Test33objref:f3(ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl,ratnl):long
         -16 (-4.21% of base) : 233198.dasm - Test33objref:f4(byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref,byref):long

Top method improvements (percentages):
         -24 (-17.65% of base) : 84725.dasm - Program:Main(System.String[]):int
          -4 (-16.67% of base) : 4435.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyAdd(int,int,int):int
          -4 (-16.67% of base) : 4484.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyAdd(int,int,int):int
          -4 (-16.67% of base) : 4401.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyAdd(long,long,long):long
          -4 (-16.67% of base) : 4450.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyAdd(long,long,long):long
          -4 (-16.67% of base) : 211581.dasm - test:f12(int,int):int
          -4 (-16.67% of base) : 226672.dasm - test:f12(long,long):long
         -12 (-15.00% of base) : 4353.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyHigh(long,long):long
         -12 (-15.00% of base) : 4352.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyHigh(long,long):long
         -16 (-14.29% of base) : 81975.dasm - SP1d:Foo(int,int,int,int,int,S):int
          -4 (-12.50% of base) : 189138.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyWideningAndAdd(int,short,short):int
          -4 (-12.50% of base) : 186345.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyWideningAndAdd(int,short,short):int
          -4 (-12.50% of base) : 119308.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyWideningAndAdd(int,short,short):int
          -4 (-12.50% of base) : 145267.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyWideningAndAdd(int,short,short):int
          -4 (-12.50% of base) : 182021.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyWideningAndAdd(int,short,short):int
          -4 (-12.50% of base) : 129013.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyWideningAndAdd(int,short,short):int
          -4 (-12.50% of base) : 102467.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyWideningAndAdd(int,short,short):int
          -4 (-12.50% of base) : 207386.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyWideningAndAdd(int,short,short):int
          -4 (-12.50% of base) : 172965.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyWideningAndAdd(int,short,short):int
          -4 (-12.50% of base) : 202833.dasm - JIT.HardwareIntrinsics.Arm.Helpers:MultiplyWideningAndAdd(int,short,short):int

235 total methods with Code Size differences (235 improved, 0 regressed), 5 unchanged.


libraries.crossgen2.Linux.arm64.checked.mch:


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 48496876 (overridden on cmd)
Total bytes of diff: 48494864 (overridden on cmd)
Total bytes of delta: -2012 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.
Detail diffs


Top file improvements (bytes):
         -76 : 62810.dasm (-6.33% of base)
         -36 : 180890.dasm (-34.62% of base)
         -32 : 82388.dasm (-5.00% of base)
         -32 : 87336.dasm (-9.20% of base)
         -28 : 86170.dasm (-8.33% of base)
         -28 : 82389.dasm (-4.79% of base)
         -28 : 62557.dasm (-8.14% of base)
         -28 : 60268.dasm (-7.14% of base)
         -24 : 86184.dasm (-5.04% of base)
         -24 : 86169.dasm (-7.79% of base)
         -24 : 82390.dasm (-4.55% of base)
         -20 : 61616.dasm (-5.21% of base)
         -20 : 76592.dasm (-1.27% of base)
         -20 : 86168.dasm (-6.94% of base)
         -20 : 82656.dasm (-0.94% of base)
         -20 : 82391.dasm (-4.31% of base)
         -20 : 129736.dasm (-22.73% of base)
         -16 : 82379.dasm (-3.88% of base)
         -16 : 82392.dasm (-4.00% of base)
         -16 : 86167.dasm (-6.35% of base)

300 total files with Code Size differences (300 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
         -76 (-6.33% of base) : 62810.dasm - Microsoft.CodeAnalysis.CompilationOptions:GetHashCodeHelper():int:this
         -36 (-34.62% of base) : 180890.dasm - Microsoft.Build.Framework.BuildEventContext:GetHashCode():int:this
         -32 (-9.20% of base) : 87336.dasm - System.HashCode:Combine(int,int,int,int,int,int,int,int):int
         -32 (-5.00% of base) : 82388.dasm - System.HashCode:Combine(System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon):int
         -28 (-8.14% of base) : 62557.dasm - Microsoft.CodeAnalysis.DiagnosticDescriptor:GetHashCode():int:this
         -28 (-7.14% of base) : 60268.dasm - Microsoft.CodeAnalysis.Emit.EmitOptions:GetHashCode():int:this
         -28 (-8.33% of base) : 86170.dasm - System.HashCode:Combine(int,int,int,int,int,int,int):int
         -28 (-4.79% of base) : 82389.dasm - System.HashCode:Combine(System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon):int
         -24 (-5.04% of base) : 86184.dasm - System.HashCode:Combine(float,float,float,float,float,float):int
         -24 (-7.79% of base) : 86169.dasm - System.HashCode:Combine(int,int,int,int,int,int):int
         -24 (-4.55% of base) : 82390.dasm - System.HashCode:Combine(System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon):int
         -20 (-5.21% of base) : 61616.dasm - Microsoft.CodeAnalysis.CommonAttributeDataComparer:GetHashCode(Microsoft.CodeAnalysis.AttributeData):int:this
         -20 (-1.27% of base) : 76592.dasm - System.Buffers.Text.Utf8Parser:TryParseDateTimeOffsetO(System.ReadOnlySpan`1[System.Byte],byref,byref,byref):bool
         -20 (-0.94% of base) : 82656.dasm - System.DateTimeParse:ParseFormatO(System.ReadOnlySpan`1[System.Char],byref):bool
         -20 (-6.94% of base) : 86168.dasm - System.HashCode:Combine(int,int,int,int,int):int
         -20 (-4.31% of base) : 82391.dasm - System.HashCode:Combine(System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon):int
         -20 (-22.73% of base) : 129736.dasm - System.Reflection.Metadata.SequencePoint:GetHashCode():int:this
         -16 (-6.78% of base) : 62042.dasm - Microsoft.CodeAnalysis.AssemblyIdentity:GetHashCode():int:this
         -16 (-3.88% of base) : 82379.dasm - System.HashCode:Add(int):this
         -16 (-4.40% of base) : 86190.dasm - System.HashCode:Combine(float,float,float,float):int

Top method improvements (percentages):
         -36 (-34.62% of base) : 180890.dasm - Microsoft.Build.Framework.BuildEventContext:GetHashCode():int:this
         -20 (-22.73% of base) : 129736.dasm - System.Reflection.Metadata.SequencePoint:GetHashCode():int:this
         -12 (-17.65% of base) : 59256.dasm - Microsoft.CodeAnalysis.CodeGen.LambdaDebugInfo:GetHashCode():int:this
         -12 (-17.65% of base) : 60441.dasm - Microsoft.CodeAnalysis.Text.LinePositionSpan:GetHashCode():int:this
          -8 (-15.38% of base) : 60357.dasm - Microsoft.CodeAnalysis.Text.TextChangeRange:GetHashCode():int:this
          -8 (-14.29% of base) : 59275.dasm - Microsoft.CodeAnalysis.CodeGen.ClosureDebugInfo:GetHashCode():int:this
          -8 (-14.29% of base) : 59226.dasm - Microsoft.CodeAnalysis.CodeGen.LocalSlotDebugInfo:GetHashCode():int:this
          -4 (-12.50% of base) : 58898.dasm - Roslyn.Utilities.Hash:Combine(int,int):int
          -4 (-12.50% of base) : 128644.dasm - System.Reflection.Internal.Hash:Combine(int,int):int
          -4 (-11.11% of base) : 25851.dasm - Internal.NativeFormat.ExternalTypeSignature:GetHashCode():int:this
          -4 (-11.11% of base) : 25886.dasm - Internal.NativeFormat.UnsignedConstant:GetHashCode():int:this
          -4 (-11.11% of base) : 25819.dasm - Internal.NativeFormat.VariableTypeSignature:GetHashCode():int:this
          -4 (-11.11% of base) : 19161.dasm - Microsoft.Diagnostics.Tracing.Stacks.RecursionGuard:get_Depth():int:this
         -12 (-10.71% of base) : 85015.dasm - System.Math:<BigMul>g__SoftwareFallback|48_0(long,long,byref):long
          -8 (-10.00% of base) : 25377.dasm - Internal.NativeFormat.TypeHashingAlgorithms:ComputeSignatureVariableHashCode(int,bool):int
          -4 (-10.00% of base) : 59171.dasm - Microsoft.CodeAnalysis.CodeGen.DebugId:GetHashCode():int:this
          -4 (-10.00% of base) : 59043.dasm - Microsoft.CodeAnalysis.CodeGen.LocalDebugId:GetHashCode():int:this
          -4 (-10.00% of base) : 60126.dasm - Microsoft.CodeAnalysis.Emit.MethodImplKey:GetHashCode():int:this
          -4 (-10.00% of base) : 60453.dasm - Microsoft.CodeAnalysis.Text.LinePosition:GetHashCode():int:this
          -4 (-10.00% of base) : 60314.dasm - Microsoft.CodeAnalysis.Text.TextSpan:GetHashCode():int:this

300 total methods with Code Size differences (300 improved, 0 regressed), 0 unchanged.


libraries.pmi.Linux.arm64.checked.mch:


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 47408036 (overridden on cmd)
Total bytes of diff: 47405408 (overridden on cmd)
Total bytes of delta: -2628 (-0.01 % of base)
    diff is an improvement.
    relative diff is an improvement.
Detail diffs


Top file improvements (bytes):
        -172 : 102556.dasm (-4.06% of base)
        -124 : 209075.dasm (-1.13% of base)
         -76 : 136973.dasm (-6.55% of base)
         -36 : 67461.dasm (-9.00% of base)
         -36 : 203976.dasm (-34.62% of base)
         -28 : 139548.dasm (-6.73% of base)
         -24 : 30643.dasm (-0.67% of base)
         -20 : 139691.dasm (-12.20% of base)
         -20 : 104391.dasm (-1.88% of base)
         -20 : 101968.dasm (-22.73% of base)
         -20 : 138181.dasm (-6.58% of base)
         -16 : 137756.dasm (-6.15% of base)
         -16 : 209070.dasm (-1.05% of base)
         -16 : 20622.dasm (-8.00% of base)
         -16 : 172380.dasm (-1.19% of base)
         -16 : 205435.dasm (-8.70% of base)
         -16 : 205440.dasm (-8.00% of base)
         -12 : 54524.dasm (-8.57% of base)
         -12 : 137371.dasm (-6.52% of base)
         -12 : 136781.dasm (-6.67% of base)

377 total files with Code Size differences (377 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
        -172 (-4.06% of base) : 102556.dasm - System.Reflection.Metadata.Ecma335.MetadataSizes:.ctor(System.Collections.Immutable.ImmutableArray`1[Int32],System.Collections.Immutable.ImmutableArray`1[Int32],System.Collections.Immutable.ImmutableArray`1[Int32],int,bool):this
        -124 (-1.13% of base) : 209075.dasm - TerminalFormatStrings:.ctor(Database):this
         -76 (-6.55% of base) : 136973.dasm - Microsoft.CodeAnalysis.CompilationOptions:GetHashCodeHelper():int:this
         -36 (-34.62% of base) : 203976.dasm - Microsoft.Build.Framework.BuildEventContext:GetHashCode():int:this
         -36 (-9.00% of base) : 67461.dasm - Microsoft.CodeAnalysis.VisualBasic.VisualBasicCompilationOptions:GetHashCode():int:this
         -28 (-6.73% of base) : 139548.dasm - Microsoft.CodeAnalysis.Emit.EmitOptions:GetHashCode():int:this
         -24 (-0.67% of base) : 30643.dasm - Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.LanguageParser:ParsePostFixExpression(Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.ExpressionSyntax):Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.ExpressionSyntax:this
         -20 (-6.58% of base) : 138181.dasm - Microsoft.CodeAnalysis.CommonAttributeDataComparer:GetHashCode(Microsoft.CodeAnalysis.AttributeData):int:this
         -20 (-12.20% of base) : 139691.dasm - Microsoft.CodeAnalysis.Emit.EncLocalInfo:GetHashCode():int:this
         -20 (-1.88% of base) : 104391.dasm - Microsoft.VisualBasic.DateAndTime:DateAdd(int,double,System.DateTime):System.DateTime
         -20 (-22.73% of base) : 101968.dasm - System.Reflection.Metadata.SequencePoint:GetHashCode():int:this
         -16 (-6.15% of base) : 137756.dasm - Microsoft.CodeAnalysis.AssemblyIdentity:GetHashCode():int:this
         -16 (-8.00% of base) : 20622.dasm - Microsoft.CodeAnalysis.CSharp.BoundTypeOrValueData:GetHashCode():int:this
         -16 (-8.70% of base) : 205435.dasm - State:ProcessStripe(System.ReadOnlySpan`1[Byte]):this
         -16 (-8.00% of base) : 205440.dasm - State:ProcessStripe(System.ReadOnlySpan`1[Byte]):this
         -16 (-1.19% of base) : 172380.dasm - System.Numerics.BigIntegerCalculator:Gcd(System.Span`1[UInt32],System.Span`1[UInt32])
         -16 (-1.05% of base) : 209070.dasm - TerminalFormatStrings:GetTitle(Database):System.String
         -12 (-0.82% of base) : 182180.dasm - ILCompiler.Reflection.ReadyToRun.Amd64.GcInfo:GetLiveSlotsAtSafepoints(System.Byte[],byref):System.Collections.Generic.List`1[[System.Collections.Generic.List`1[[ILCompiler.Reflection.ReadyToRun.BaseGcSlot, ILCompiler.Reflection.ReadyToRun, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]], System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]:this
         -12 (-17.65% of base) : 140657.dasm - Microsoft.CodeAnalysis.CodeGen.LambdaDebugInfo:GetHashCode():int:this
         -12 (-5.45% of base) : 21584.dasm - Microsoft.CodeAnalysis.CSharp.QueryClauseInfo:GetHashCode():int:this

Top method improvements (percentages):
         -36 (-34.62% of base) : 203976.dasm - Microsoft.Build.Framework.BuildEventContext:GetHashCode():int:this
         -20 (-22.73% of base) : 101968.dasm - System.Reflection.Metadata.SequencePoint:GetHashCode():int:this
         -12 (-17.65% of base) : 140657.dasm - Microsoft.CodeAnalysis.CodeGen.LambdaDebugInfo:GetHashCode():int:this
         -12 (-17.65% of base) : 139363.dasm - Microsoft.CodeAnalysis.Text.LinePositionSpan:GetHashCode():int:this
          -8 (-15.38% of base) : 139449.dasm - Microsoft.CodeAnalysis.Text.TextChangeRange:GetHashCode():int:this
          -8 (-14.29% of base) : 140638.dasm - Microsoft.CodeAnalysis.CodeGen.ClosureDebugInfo:GetHashCode():int:this
          -8 (-14.29% of base) : 140686.dasm - Microsoft.CodeAnalysis.CodeGen.LocalSlotDebugInfo:GetHashCode():int:this
          -8 (-14.29% of base) : 51947.dasm - Microsoft.CodeAnalysis.VisualBasic.EmbeddedTreeLocation:GetHashCode():int:this
          -4 (-12.50% of base) : 103158.dasm - System.Reflection.Internal.Hash:Combine(int,int):int
          -4 (-12.50% of base) : 103159.dasm - System.Reflection.Internal.Hash:Combine(int,int):int
         -20 (-12.20% of base) : 139691.dasm - Microsoft.CodeAnalysis.Emit.EncLocalInfo:GetHashCode():int:this
          -4 (-11.11% of base) : 79035.dasm - Microsoft.Diagnostics.Tracing.Stacks.RecursionGuard:get_Depth():int:this
          -8 (-10.00% of base) : 169357.dasm - Internal.NativeFormat.TypeHashingAlgorithms:ComputeSignatureVariableHashCode(int,bool):int
          -4 (-10.00% of base) : 140741.dasm - Microsoft.CodeAnalysis.CodeGen.DebugId:GetHashCode():int:this
          -4 (-10.00% of base) : 140858.dasm - Microsoft.CodeAnalysis.CodeGen.LocalDebugId:GetHashCode():int:this
          -4 (-10.00% of base) : 139667.dasm - Microsoft.CodeAnalysis.Emit.MethodImplKey:GetHashCode():int:this
          -4 (-10.00% of base) : 137113.dasm - Microsoft.CodeAnalysis.SubsystemVersion:GetHashCode():int:this
          -4 (-10.00% of base) : 139350.dasm - Microsoft.CodeAnalysis.Text.LinePosition:GetHashCode():int:this
          -4 (-10.00% of base) : 139493.dasm - Microsoft.CodeAnalysis.Text.TextSpan:GetHashCode():int:this
          -4 (-9.09% of base) : 169653.dasm - Internal.TypeSystem.SignatureMethodVariable:GetHashCode():int:this

377 total methods with Code Size differences (377 improved, 0 regressed), 0 unchanged.


libraries_tests.pmi.Linux.arm64.checked.mch:


Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 112547716 (overridden on cmd)
Total bytes of diff: 112545488 (overridden on cmd)
Total bytes of delta: -2228 (-0.00 % of base)
    diff is an improvement.
    relative diff is an improvement.
Detail diffs


Top file improvements (bytes):
         -36 : 247634.dasm (-34.62% of base)
         -32 : 127574.dasm (-3.00% of base)
         -24 : 88659.dasm (-2.86% of base)
         -20 : 78204.dasm (-7.58% of base)
         -16 : 84786.dasm (-4.21% of base)
         -16 : 84984.dasm (-8.51% of base)
         -16 : 163562.dasm (-1.03% of base)
         -16 : 82397.dasm (-5.19% of base)
         -16 : 163534.dasm (-0.92% of base)
         -12 : 77714.dasm (-6.98% of base)
         -12 : 268509.dasm (-5.36% of base)
         -12 : 332252.dasm (-3.26% of base)
         -12 : 83258.dasm (-3.61% of base)
         -12 : 92293.dasm (-0.58% of base)
         -12 : 285411.dasm (-0.73% of base)
         -12 : 2279.dasm (-0.58% of base)
         -12 : 285647.dasm (-0.73% of base)
         -12 : 290039.dasm (-1.59% of base)
         -12 : 150632.dasm (-1.61% of base)
         -12 : 98795.dasm (-0.82% of base)

336 total files with Code Size differences (336 improved, 0 regressed), 0 unchanged.

Top method improvements (bytes):
         -36 (-34.62% of base) : 247634.dasm - Microsoft.Build.Framework.BuildEventContext:GetHashCode():int:this
         -32 (-3.00% of base) : 127574.dasm - Microsoft.Build.BackEnd.SdkResolution.SdkResult:GetHashCode():int:this
         -24 (-2.86% of base) : 88659.dasm - GetHashCodeVisitor:CombineHashCodes(Microsoft.CodeAnalysis.IMethodSymbol,int):int:this
         -20 (-7.58% of base) : 78204.dasm - Microsoft.CodeAnalysis.NamingStyles.NamingStyle:GetHashCode():int:this
         -16 (-8.51% of base) : 84984.dasm - Microsoft.CodeAnalysis.Diagnostics.Analyzers.NamingStyles.SymbolSpecification:GetHashCode():int:this
         -16 (-4.21% of base) : 84786.dasm - Microsoft.CodeAnalysis.Diagnostics.DiagnosticData:GetHashCode():int:this
         -16 (-5.19% of base) : 82397.dasm - Microsoft.CodeAnalysis.Options.OptionDefinition:GetHashCode():int:this
         -16 (-0.92% of base) : 163534.dasm - System.IO.Ports.Tests.Write_char_int_int:VerifyWriteCharArray(System.Char[],int,int,System.IO.Ports.SerialPort,System.IO.Ports.SerialPort,int):this
         -16 (-1.03% of base) : 163562.dasm - System.IO.Ports.Tests.Write_str:VerifyWriteStr(System.IO.Ports.SerialPort,System.IO.Ports.SerialPort,System.String,int):this
         -12 (-1.61% of base) : 150632.dasm - <>c__DisplayClass11_1:<TestPartitioningCore>b__0():this
         -12 (-5.36% of base) : 268509.dasm - Castle.DynamicProxy.ProxyGenerationOptions:GetHashCode():int:this
         -12 (-0.58% of base) : 92293.dasm - DataContractSerializerTests:DCS_DateTimeOffsetAsRoot()
         -12 (-0.58% of base) : 2279.dasm - DataContractSerializerTests:DCS_DateTimeOffsetAsRoot()
         -12 (-6.98% of base) : 77714.dasm - Microsoft.CodeAnalysis.BitVector:GetHashCode():int:this
         -12 (-3.61% of base) : 83258.dasm - Microsoft.CodeAnalysis.FindSymbols.DeclaredSymbolInfo:GetHashCode():int:this
         -12 (-0.73% of base) : 285411.dasm - System.Numerics.Tensors.IntArithmetic:Contract(System.Numerics.Tensors.Tensor`1[Int32],System.Numerics.Tensors.Tensor`1[Int32],System.Int32[],System.Int32[],System.Numerics.Tensors.Tensor`1[Int32]):this
         -12 (-0.73% of base) : 285469.dasm - System.Numerics.Tensors.LongArithmetic:Contract(System.Numerics.Tensors.Tensor`1[Int64],System.Numerics.Tensors.Tensor`1[Int64],System.Int32[],System.Int32[],System.Numerics.Tensors.Tensor`1[Int64]):this
         -12 (-0.73% of base) : 285647.dasm - System.Numerics.Tensors.UIntArithmetic:Contract(System.Numerics.Tensors.Tensor`1[UInt32],System.Numerics.Tensors.Tensor`1[UInt32],System.Int32[],System.Int32[],System.Numerics.Tensors.Tensor`1[UInt32]):this
         -12 (-0.73% of base) : 285703.dasm - System.Numerics.Tensors.ULongArithmetic:Contract(System.Numerics.Tensors.Tensor`1[UInt64],System.Numerics.Tensors.Tensor`1[UInt64],System.Int32[],System.Int32[],System.Numerics.Tensors.Tensor`1[UInt64]):this
         -12 (-1.59% of base) : 290039.dasm - System.PercentEncodingHelper:UnescapePercentEncodedUTF8Sequence(long,int,byref,bool,bool):int

Top method improvements (percentages):
         -36 (-34.62% of base) : 247634.dasm - Microsoft.Build.Framework.BuildEventContext:GetHashCode():int:this
          -4 (-16.67% of base) : 60188.dasm - <>c:<Spill_Optimizations_NoSpillBeyondSpillSite1>b__68_0(int,int,int):int:this
          -4 (-16.67% of base) : 309475.dasm - TypeWithMethods:DoOtherStuff(int,int,int):int:this
          -8 (-13.33% of base) : 84258.dasm - Microsoft.CodeAnalysis.Differencing.Edit`1[Byte][System.Byte]:GetHashCode():int:this
          -4 (-12.50% of base) : 75084.dasm - Roslyn.Utilities.Hash:Combine(int,int):int
          -8 (-12.50% of base) : 90884.dasm - SymbolKindOrTypeKind:GetHashCode():int:this
          -8 (-10.53% of base) : 76242.dasm - Roslyn.Utilities.SyntaxPath:GetHashCode():int:this
          -8 (-10.00% of base) : 84970.dasm - Microsoft.CodeAnalysis.Diagnostics.Analyzers.NamingStyles.SerializableNamingRule:GetHashCode():int:this
          -4 (-10.00% of base) : 84465.dasm - Microsoft.CodeAnalysis.Differencing.SequenceEdit:GetHashCode():int:this
          -4 (-10.00% of base) : 80984.dasm - Microsoft.CodeAnalysis.Shared.Extensions.LineSpan:GetHashCode():int:this
         -16 (-8.51% of base) : 84984.dasm - Microsoft.CodeAnalysis.Diagnostics.Analyzers.NamingStyles.SymbolSpecification:GetHashCode():int:this
          -4 (-8.33% of base) : 179875.dasm - System.Data.SqlClient.SqlBuffer:GetTicksFromDateTime2Info(DateTime2Info):long
          -4 (-8.33% of base) : 210299.dasm - System.Data.SqlClient.SqlBuffer:GetTicksFromDateTime2Info(DateTime2Info):long
          -4 (-7.69% of base) : 75085.dasm - Roslyn.Utilities.Hash:Combine(bool,int):int
         -20 (-7.58% of base) : 78204.dasm - Microsoft.CodeAnalysis.NamingStyles.NamingStyle:GetHashCode():int:this
          -4 (-7.14% of base) : 77367.dasm - Microsoft.CodeAnalysis.VersionStamp:GetHashCode():int:this
         -12 (-6.98% of base) : 77714.dasm - Microsoft.CodeAnalysis.BitVector:GetHashCode():int:this
          -8 (-6.90% of base) : 84249.dasm - Microsoft.CodeAnalysis.Differencing.Edit`1[__Canon][System.__Canon]:GetHashCode():int:this
          -4 (-6.67% of base) : 88668.dasm - GetHashCodeVisitor:CombineHashCodes(Microsoft.CodeAnalysis.IPreprocessingSymbol,int):int
          -8 (-6.45% of base) : 84957.dasm - Microsoft.CodeAnalysis.Diagnostics.Analyzers.NamingStyles.NamingStylePreferences:GetHashCode():int:this

336 total methods with Code Size differences (336 improved, 0 regressed), 0 unchanged.


@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 30, 2021
ghost commented Oct 30, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

EgorBo (Member, Author) commented Oct 30, 2021

/azp run runtime-coreclr outerloop

azure-pipelines commented

Azure Pipelines successfully started running 1 pipeline(s).

kunalspathak (Member) left a comment

Added a few questions and suggestions. Overall, looks great!

src/coreclr/jit/lsrabuild.cpp (resolved)
src/coreclr/jit/lowerarmarch.cpp (outdated, resolved)
src/coreclr/jit/emitarm64.cpp (outdated, resolved)
@@ -13597,6 +13597,33 @@ regNumber emitter::emitInsTernary(instruction ins, emitAttr attr, GenTree* dst,
// src2 can only be a reg
assert(!src2->isContained());
}
else if ((src1->OperIs(GT_MUL) && src1->isContained()) || (src2->OperIs(GT_MUL) && src2->isContained()))
Member:

Wouldn't src1->gtGetOp1() or src1->gtGetOp2() be the node that is marked contained in lower, and shouldn't that be what is checked here?

Member Author:

madd for GT_ADD can only be emitted when GT_MUL is contained. So "isContained" here basically means that lower approved this tree to be optimized into madd. GT_MUL may be left uncontained, e.g. if it has gtOverflow set.

EgorBo (Member, Author) commented Nov 2, 2021

@dotnet/jit-contrib PTAL, should be ready to review/merge

if (b->OperIs(GT_NEG) && b->isContained())
{
    b = b->gtGetOp1();
    msub = !msub; // it's either "a * -b" or "-a * -b" which is the same as "a * b"
Member:

// it's either "a * -b" or "-a * -b" which is the same as "a * b"

This comment is a little misleading. I can see how -a * -b is a * b, but then we will still use msub, so we probably should not confuse readers by saying "same as a * b".

Member Author:

but then we will still use msub

Actually, for -a * -b this code will emit madd (msub = !msub runs twice, so msub ends up false).

Member:

ah...I misread this if as else if. Never mind.
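
The toggling discussed in this thread can be sketched with a toy model. It assumes, as the thread implies, that each multiply operand is checked by its own independent `if` (not `if` / `else if`), so a negation on both operands flips `msub` twice and the result is `madd`. The `Operand` and `Fold` types and the `selectMulAdd` and `evaluate` functions are hypothetical stand-ins, not the JIT's real `GenTree` API.

```cpp
#include <cassert>

// Toy stand-in for a multiply operand that may be wrapped in a contained negation.
struct Operand
{
    int  value;        // the underlying (un-negated) value
    bool containedNeg; // true if the operand is "-value" and the NEG is contained
};

// Result of the selection: the peeled operands plus which instruction to emit.
struct Fold
{
    int  a;
    int  b;
    bool msub; // true => msub (c - a * b), false => madd (c + a * b)
};

Fold selectMulAdd(Operand a, Operand b)
{
    bool msub = false;
    // Two independent checks: each contained negation flips the choice once.
    if (a.containedNeg)
    {
        msub = !msub;
    }
    if (b.containedNeg)
    {
        msub = !msub;
    }
    return Fold{a.value, b.value, msub};
}

// Evaluates what the selected instruction would compute for addend c.
int evaluate(const Fold& f, int c)
{
    return f.msub ? c - f.a * f.b : c + f.a * f.b;
}
```

With this model, `a * -b` and `-a * b` select `msub`, while `a * b` and `-a * -b` both select `madd`, which is the behavior described in the reply above.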

kunalspathak (Member) left a comment

LGTM

@EgorBo EgorBo merged commit 0691c75 into dotnet:main Nov 2, 2021
jakobbotsch (Member) commented

The Fuzzlyn run at https://dev.azure.com/dnceng/public/_build/results?buildId=1452609 found this example:

// Generated by Fuzzlyn v1.5 on 2021-11-03 12:55:21
// Run on Arm64 Windows
// Seed: 951014135056301943
// Reduced from 152.5 KiB to 0.3 KiB in 00:01:22
// Hits JIT assert in Release:
// Assertion failed 'ins == INS_add' in 'Program:Main(Fuzzlyn.ExecutionServer.IRuntime)' during 'Generate code' (IL size 28)
// 
//     File: D:\a\_work\3\s\src\coreclr\jit\emitarm64.cpp Line: 13602
// 
public class C0
{
}

public class Program
{
    public static void Main()
    {
        if (0 == (27452 + (-2147483647 * M1())))
        {
            var vr3 = new C0();
        }
    }

    public static long M1()
    {
        var vr1 = new C0[, ]{{new C0()}};
        return 0;
    }
}

Could it be related to this PR?

EgorBo (Member, Author) commented Nov 3, 2021

Oops, yes. I totally forgot about INS_adds, which sets flags. Will fix in a moment.
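
The follow-up fix itself is not shown in this conversation. As a rough sketch of the constraint it has to respect: `madd` does not update the condition flags, so an add that must set flags (`adds`) cannot be replaced by it. The toy model below assumes the guard is simply "fold only a plain `add` whose multiply has no overflow check". The enums and function names are hypothetical and are not the JIT's real types.

```cpp
#include <cassert>

// Hypothetical, simplified instruction kinds; not the JIT's real enums.
enum class AddIns { Add, Adds }; // Adds is the flag-setting form
enum class Plan { Madd, MulThenAdd, MulThenAdds };

// Assumed rule: fold only a plain add whose multiply has no overflow check,
// because madd leaves the condition flags untouched.
bool canFoldToMadd(AddIns ins, bool mulHasOverflowCheck)
{
    return ins == AddIns::Add && !mulHasOverflowCheck;
}

// Picks the instruction sequence for "c + a * b" under the rule above.
Plan planFor(AddIns ins, bool mulHasOverflowCheck)
{
    if (canFoldToMadd(ins, mulHasOverflowCheck))
    {
        return Plan::Madd;
    }
    return ins == AddIns::Adds ? Plan::MulThenAdds : Plan::MulThenAdd;
}
```

In the Fuzzlyn repro, the sum is compared against zero, which is presumably why the add was emitted as the flag-setting form and tripped the `ins == INS_add` assert. In this model that case falls back to `Plan::MulThenAdds`.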

EgorBo (Member, Author) commented Nov 4, 2021

Improvements on linux-arm64: dotnet/perf-autofiling-issues#2143

@EgorBo EgorBo deleted the arm-madd branch November 4, 2021 15:19
@ghost ghost locked as resolved and limited conversation to collaborators Dec 4, 2021

Successfully merging this pull request may close these issues.

[RyuJIT][arm64] Recognize madd/msub
4 participants