Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RyuJIT: x*2 -> x+x; x*1 -> x (fp) #33024

Merged
merged 15 commits into from
May 6, 2020
24 changes: 24 additions & 0 deletions src/coreclr/src/jit/morph.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -13034,6 +13034,30 @@ GenTree* Compiler::fgMorphSmpOp(GenTree* tree, MorphAddrContext* mac)
op2 = tree->AsOp()->gtOp2;
}

// See if we can fold floating point operations
if (varTypeIsFloating(tree->TypeGet()) && !optValnumCSE_phase)
{
if ((oper == GT_MUL) && !op1->IsCnsFltOrDbl() && op2->IsCnsFltOrDbl())
{
if (op2->AsDblCon()->gtDconVal == 2.0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In dotnet/coreclr#24584 we introduced an optimization which replaces x / pow2 with x * (1 / pow2).

In the case pow2 is 0.5, this would transform x / 0.5 into x * 2.0 and so the question is: Will this still be correctly optimized in this scenario? That is, will it become x + x?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tannergooding heh, yeah, I checked that 🙂

x / 0.5
is optimized to vaddss

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a general plan, doc, or outline of how to order these optimizations and ensure they stay correctly ordered moving forward?

Ideally, we would be able to apply these transformations without needing to consider that x / pow2 must come before x * 2.0 because reason.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no idea, we got lucky these two were executed in a correct order

I guess when you optimize something, e.g. change operator of a node (e.g. uX GT_GT 0 to X GT_NE 0) and feel this particular node can be re-optimized under a different case you should either:

  1. goto case GT_NE_OP (ugly)
  2. call fgMorphSmpOp recursively (needs some recursion depth limit just in case)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndyAyersMS, Do you have any thoughts/comments here?

As we introduce more optimizations, this seems like something that will be easy to miss and it would be good to have a consistent process for how handling it is normally recommended...

Copy link
Member Author

@EgorBo EgorBo May 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, a good example for this problem is this PR: https://github.com/dotnet/coreclr/pull/25744/files

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's in morph is pretty messy. We don't have a good way of determining if there are dependencies between optimizations, or if one optimization may enable another. There are long-term ambitions of refactoring all this but it is not something we'll get to anytime soon.

Best I can suggest for now is to make sure you run diffs widely, eg over the entire set of Pri1 tests, and look carefully at the diffs, or (if you are internal) run your change over some of our SPMI collections.

We want to avoid repeated subtree traversals so we don't waste time, get stuck in some oscillating mode or blow the stack; so re-invoking morph recursively is only really viable if the node has been simplified in some way. If you are really sure optimizing A enables optimization B, then factor B out and invoke it as a subroutine as part finishing up A.

{
// Fold "x*2.0" to "x+x"
op2 = op1->OperIsLeaf() ? gtCloneExpr(op1) : fgMakeMultiUse(&tree->AsOp()->gtOp1);
op1 = tree->AsOp()->gtOp1;
oper = GT_ADD;
tree = gtNewOperNode(oper, tree->TypeGet(), op1, op2);
INDEBUG(tree->gtDebugFlags |= GTF_DEBUG_NODE_MORPHED);
}
else if (op2->AsDblCon()->gtDconVal == 1.0)
{
// Fold "x*1.0" to "x"
DEBUG_DESTROY_NODE(op2);
DEBUG_DESTROY_NODE(tree);
return op1;
}
}
}

/* See if we can fold GT_ADD nodes. */

if (oper == GT_ADD)
Expand Down
133 changes: 133 additions & 0 deletions src/coreclr/tests/src/JIT/opt/InstructionCombining/MulToAdd.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,133 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using System;
using System.Linq;
using System.Runtime.CompilerServices;

// Test "X * 2" to "X + X"

public class Program
{
private static int resultCode = 100;

public static int Main(string[] args)
{
float[] testValues =
{
0, 0.01f, 1.333f, 1/3.0f, 0.5f, 1, 2, 3, 4,
MathF.PI, MathF.E,
float.MinValue, float.MaxValue,
int.MaxValue, long.MaxValue,
int.MinValue, long.MinValue,
float.NegativeInfinity,
float.PositiveInfinity,
float.NaN,
};

testValues = testValues.Concat(testValues.Select(v => -v)).ToArray();

foreach (float testValue in testValues)
{
var tf = new TestFloats();

// Case 1: argument
AssertEquals(tf.TestArg(testValue), tf.TestArg_var(testValue));

// Case 2: ref argument
float t1 = testValue, t2 = testValue;
tf.TestArgRef(ref t1);
tf.TestArgRef_var(ref t2);
AssertEquals(t1, t2);

// Case 3: out argument
tf.TestArgOut(t1, out t1);
tf.TestArgOut_var(t2, out t2);
AssertEquals(t1, t2);

// Case 4: field
tf.TestField();
tf.TestField_var();
AssertEquals(tf.field1, tf.field2);

// Case 5: call
AssertEquals(tf.TestCall(), tf.TestCall_var());
AssertEquals(tf.field1, tf.field2);
}

return resultCode;
}

[MethodImpl(MethodImplOptions.NoInlining)]
private static void AssertEquals(float expected, float actual)
{
int expectedBits = BitConverter.SingleToInt32Bits(expected);
int actualBits = BitConverter.SingleToInt32Bits(actual);
if (expectedBits != actualBits)
{
resultCode++;
Console.WriteLine($"AssertEquals: {expected} != {actual}");
}
}
}

public class TestFloats
{
[MethodImpl(MethodImplOptions.NoInlining)]
public static T Var<T>(T t) => t;


// Case 1: argument

[MethodImpl(MethodImplOptions.NoInlining)]
public float TestArg(float x) => x * 2;

[MethodImpl(MethodImplOptions.NoInlining)]
public float TestArg_var(float x) => x * Var(2);


// Case 2: ref argument

[MethodImpl(MethodImplOptions.NoInlining)]
public void TestArgRef(ref float x) => x *= 2;

[MethodImpl(MethodImplOptions.NoInlining)]
public void TestArgRef_var(ref float x) => x *= Var(2);


// Case 3: out argument

[MethodImpl(MethodImplOptions.NoInlining)]
public void TestArgOut(float x, out float y) => y = x * 2;

[MethodImpl(MethodImplOptions.NoInlining)]
public void TestArgOut_var(float x, out float y) => y = x * Var(2);


// Case 4: field

public float field1 = 3.14f;
[MethodImpl(MethodImplOptions.NoInlining)]
public void TestField() => field1 *= 2;

public float field2 = 3.14f;
[MethodImpl(MethodImplOptions.NoInlining)]
public void TestField_var() => field2 *= Var(2);


// Case 5: Call

[MethodImpl(MethodImplOptions.NoInlining)]
public float Call1() => field1++; // with side-effect

[MethodImpl(MethodImplOptions.NoInlining)]
public float Call2() => field2++; // with side-effect


[MethodImpl(MethodImplOptions.NoInlining)]
public float TestCall() => Call1() * 2;

[MethodImpl(MethodImplOptions.NoInlining)]
public float TestCall_var() => Call2() * Var(2);
}
12 changes: 12 additions & 0 deletions src/coreclr/tests/src/JIT/opt/InstructionCombining/MulToAdd.csproj
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
</PropertyGroup>
<PropertyGroup>
<DebugType>None</DebugType>
<Optimize>True</Optimize>
</PropertyGroup>
<ItemGroup>
<Compile Include="MulToAdd.cs" />
</ItemGroup>
</Project>