Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Enable CSE for VectorX.Create #50644

Merged
merged 4 commits into from
Apr 7, 2021
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 7 additions & 7 deletions src/coreclr/jit/valuenum.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -8918,16 +8918,16 @@ void Compiler::fgValueNumberHWIntrinsic(GenTree* tree)
assert(hwIntrinsicNode != nullptr);

// For safety/correctness we must mutate the global heap valuenumber
// for any HW intrinsic that performs a memory store operation
// for any HW intrinsic that performs a memory store operation
if (hwIntrinsicNode->OperIsMemoryStore())
{
fgMutateGcHeap(tree DEBUGARG("HWIntrinsic - MemoryStore"));
}

// Check for any intrintics that have variable number of args or more than 2 args
// For now we will generate a unique value number for these cases.
//
if ((HWIntrinsicInfo::lookupNumArgs(hwIntrinsicNode->gtHWIntrinsicId) == -1) ||
const bool isVariableArgs = HWIntrinsicInfo::lookupNumArgs(hwIntrinsicNode->gtHWIntrinsicId) == -1;

// In case of functions with variable number of args we only support single-arg ones.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, why 0 or 1 and not 0, 1, or 2?

Copy link
Member

@tannergooding tannergooding Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable length args only impact Vector.Create calls for ARM, but also impact cases like Sse.ReciprocalScalar for x86, where the upper limit is 2 args (and therefore op1 will not be a GT_LIST, but NumArgs will be -1).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I wanted to add support for args >= 2 cases later but since you noticed this case I've added it as part of this pr.

Repro:

static int Test()
{
    return Sse2.MoveMask(Sse2.Add(
                Vector128.Create(1, 1), 
                Vector128.Create(1, 1)).AsByte());
}

New codegen:

; Method Program:Test():int
G_M58954_IG01:
       vzeroupper 
						;; bbWeight=1    PerfScore 1.00

G_M58954_IG02:
       vmovupd  xmm0, xmmword ptr [reloc @RWD00]
       vpaddq   xmm0, xmm0, xmm0
       vpmovmskb eax, xmm0
						;; bbWeight=1    PerfScore 4.33

G_M58954_IG03:
       ret      
						;; bbWeight=1    PerfScore 1.00
RWD00  	dq	0000000000000001h, 0000000000000001h
; Total bytes of code: 20

if ((isVariableArgs && (tree->AsOp()->gtOp2 != nullptr)) ||
((tree->AsOp()->gtOp1 != nullptr) && (tree->AsOp()->gtOp1->OperIs(GT_LIST))))
{
// We have a HWINTRINSIC node in the GT_LIST form with 3 or more args
Expand Down Expand Up @@ -9018,12 +9018,12 @@ void Compiler::fgValueNumberHWIntrinsic(GenTree* tree)
if (encodeResultType)
{
normalPair = vnStore->VNPairForFunc(tree->TypeGet(), func, op1vnp, resvnp);
assert(vnStore->VNFuncArity(func) == 2);
assert((vnStore->VNFuncArity(func) == 2) || isVariableArgs);
}
else
{
normalPair = vnStore->VNPairForFunc(tree->TypeGet(), func, op1vnp);
assert(vnStore->VNFuncArity(func) == 1);
assert((vnStore->VNFuncArity(func) == 1) || isVariableArgs);
}
}
else
Expand Down