-
Notifications
You must be signed in to change notification settings - Fork 738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement X86 codegen support for dynamic array clone #4179
Implement X86 codegen support for dynamic array clone #4179
Conversation
1f7bd99
to
7d04ad5
Compare
The desired tree after transformation looks like following:
|
// the original array object (array object to be copied). i.e. we are loading the component class from the original array class | ||
// and pass the component class as 2nd child of anewarray. So the 2nd child must be evaluated into a reg. | ||
bool dynamicArrayClone = (node->getOpCodeValue() == TR::anewarray) && (node->getSecondChild()->getSymbolReference()) && | ||
(node->getSecondChild()->getSymbolReference() == comp->getSymRefTab()->findArrayComponentTypeSymbolRef()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems a little optimistic - shouldn't we say that if it isn't a loadaddr (or possibly aconst) then it is dynamic and assert that we have clazz if the array is not dynamic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, added extra condition that checks for the class node op code for dynamic array, also assertion that rensure clazz cannot be null when not dynamic array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still seems pretty special-purpose. Do we need to require that the component type child be either constant or in this restricted form? Or could we use something like the following?
node->getOpCodeValue() == TR::anewarray && clazz == NULL
I think canAllocateInline()
could set clazz = NULL
whenever the child isn't loadaddr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I revised the logic for evaluating the isDynamicArray
boolean. Now we only initialize the boolean once in the top level NewEvaluator using the return values from canAllocateInline()
. This way checking the ArrayComponentType
symbol occurs only once in canAllocateInline()
. And the NewEvaluator passes this flag down to helper routines which only need to check the flag to do the right thing. Currently the boolean is named as array-specific since only dynamic reference array allocations are supported for now. When we implement dynamic allocation for other object we can rename it properly -- just isDynamicAllocation
.
|
||
TR_X86OpCodes op = (TR::Compiler->target.is64Bit() && fej9->generateCompressedLockWord()) ? S4MemImm4 : SMemImm4(); | ||
generateMemImmInstruction(op, node, generateX86MemoryReference(objectReg, lwOffset, cg), initialLwValue, cg); | ||
if (initLw) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some comments about lock word handling are warranted along with a description of the interaction with lock reservation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added more comments to address lock reservation case.
7d04ad5
to
27ea288
Compare
@0dvictor could I get your thoughts on this please? |
#if defined (J9VM_THR_LOCK_NURSERY) | ||
generateRegMemInstruction(LRegMem(), node, tempReg, generateX86MemoryReference(clzReg, offsetof(J9ArrayClass, lockOffset), cg), cg); | ||
#else | ||
generateRegImmInstruction(MOVRegImm4(), node, tempReg, TMP_OFFSETOF_J9OBJECT_MONITOR, cg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, tempReg
holds a compile-time constant, so that the branching logic below can be fold away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the !J9VM_THR_LOCK_NURSERY
case altogether. The flag is set on all platforms. Also modified VP to not generate dynamic objects if flag is not set.
generateRegImmInstruction(MOVRegImm4(), node, tempReg, TMP_OFFSETOF_J9OBJECT_MONITOR, cg); | ||
#endif | ||
generateRegImmInstruction(CMPRegImm4(), node, tempReg, (int32_t)-1, cg); | ||
generateLabelInstruction (JE4, node, doneLabel, true, cg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generateLabelInstruction
with needsVMThreadRegister
parameter has been deprecated; hence this line should be:
generateLabelInstruction (JE4, node, doneLabel, cg);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
generateRegImmInstruction(CMPRegImm4(), node, tempReg, (int32_t)-1, cg); | ||
generateLabelInstruction (JE4, node, doneLabel, true, cg); | ||
generateRegMemInstruction(LEARegMem(), node, tempReg, generateX86MemoryReference(objectReg, tempReg, 0, cg) ,cg); | ||
TR_X86OpCodes op = (TR::Compiler->target.is64Bit() && fej9->generateCompressedLockWord()) ? S4MemImm4 : SMemImm4(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic may be simplified as:
SMemImm4(TR::Compiler->target.is64Bit() && !fej9->generateCompressedLockWord())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
if (initReservable) | ||
initialLwValue = OBJECT_HEADER_LOCK_RESERVED; | ||
|
||
TR_X86OpCodes op = (TR::Compiler->target.is64Bit() && fej9->generateCompressedLockWord()) ? S4MemImm4 : SMemImm4(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic may be simplified as:
SMemImm4(TR::Compiler->target.is64Bit() && !fej9->generateCompressedLockWord())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
{ | ||
clzReg = tempReg; | ||
generateRegMemInstruction(LRegMem(), node, clzReg, generateX86MemoryReference(classReg, offsetof(J9Class, arrayClass), cg), cg); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it stands, this will generate the load twice if dynamic array clone occurs in "old" AOT. Could it be combined with the path that generates the corresponding load above? e.g.
if (classReg != NULL && node->getOpCodeValue() == TR::anewarray)
{
// Variable component type, due either to dynamicArrayClone or (non-SVM) AOT.
// Load the array type from it at runtime.
clzReg = tempReg;
generateRegMemInstruction(LRegMem(), node, clzReg, generateX86MemoryReference(classReg, offsetof(J9Class, arrayClass), cg), cg);
}
else if (cg->needClassAndMethodPointerRelocations() && !comp->getOption(TR_UseSymbolValidationManager))
{
// ... [check for TR::newarray and TR::New, since TR::anewarray has been ruled out] ...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revised the logic as suggested.
// the original array object (array object to be copied). i.e. we are loading the component class from the original array class | ||
// and pass the component class as 2nd child of anewarray. So the 2nd child must be evaluated into a reg. | ||
bool dynamicArrayClone = (node->getOpCodeValue() == TR::anewarray) && (node->getSecondChild()->getSymbolReference()) && | ||
(node->getSecondChild()->getSymbolReference() == comp->getSymRefTab()->findArrayComponentTypeSymbolRef()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still seems pretty special-purpose. Do we need to require that the component type child be either constant or in this restricted form? Or could we use something like the following?
node->getOpCodeValue() == TR::anewarray && clazz == NULL
I think canAllocateInline()
could set clazz = NULL
whenever the child isn't loadaddr
.
generateLabelInstruction (JE4, node, doneLabel, true, cg); | ||
generateRegMemInstruction(LEARegMem(), node, tempReg, generateX86MemoryReference(objectReg, tempReg, 0, cg) ,cg); | ||
TR_X86OpCodes op = (TR::Compiler->target.is64Bit() && fej9->generateCompressedLockWord()) ? S4MemImm4 : SMemImm4(); | ||
generateMemImmInstruction(op, node, generateX86MemoryReference(tempReg, 0, cg), 0, cg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get away without the lea
just above, and generate
mov ... ptr [rObject + rTemp], 0
instead of
lea rTemp, [rObject + rTemp]
mov ... ptr [rTemp], 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
27ea288
to
2660839
Compare
@yanluo7 could you comment on what has changed and respond to the review questions above so we can continue reviewing this change? |
@andrewcraik yes, forgot to hit |
2660839
to
8f44d5e
Compare
@@ -650,6 +650,14 @@ J9::Compilation::canAllocateInline(TR::Node* node, TR_OpaqueClassBlock* &classIn | |||
classRef = node->getSecondChild(); | |||
classSymRef = classRef->getSymbolReference(); | |||
|
|||
// In the case of dynamic array cloning, second child is the array component type load, | |||
// return 0 indicating variable dynamic array allocation | |||
if (classSymRef && (classRef->getOpCodeValue() == TR::aloadi) && (classSymRef == self()->getSymRefTab()->findArrayComponentTypeSymbolRef())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition still requires aloadi <arrayComponentType>
, but the following code assumes that classRef
is a loadaddr
. So this will break if we ever get a variable component type child with a different form. AFAICT the fix for such a situation would be to weaken this condition to classRef->getOpCodeValue() != TR::loadaddr
(and probably also to move the getSymbolReference()
call after the conditional, to a point where we know that classRef
is in fact a loadaddr
, and therefore that it has a symref). Is there any particular reason we shouldn't do that now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More generic, done.
// For dynamic array allocation, load the array class from the component class and store into clzReg | ||
if (isDynamicAllocation) | ||
{ | ||
clzReg = tempReg; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we assert here that node->getOpCodeValue() == TR::anewarray
? Just because if (isDynamicAllocation)
sounds more generic, but only anewarray
is handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -7293,7 +7314,7 @@ J9::X86::TreeEvaluator::VMnewEvaluator( | |||
TR_OpaqueClassBlock *clazz = NULL; | |||
TR::Register *classReg = NULL; | |||
bool isArrayNew = false; | |||
|
|||
bool dynamicArrayAllocation = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this isn't used between here and the real initialization below, would you mind moving the declaration down? Right now it looks like something might read this false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -7697,6 +7723,7 @@ J9::X86::TreeEvaluator::VMnewEvaluator( | |||
dataOffset, | |||
tempReg, | |||
monitorSlotIsInitialized, | |||
dynamicArrayAllocation, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you pass true
here, then the path in genInitObjectHeader()
that loads the array type for "old" AOT can be deleted, since we'll start using the path that does it for dynamicArrayAllocation
instead. Personally I think that would be more straightforward because effectively what old AOT is doing here is forcing the allocation to be dynamic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
8f44d5e
to
316217a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@andrewcraik Victor and Devin have approved the change. If you're OK with it, this can be merged. |
Jenkins test sanity xlinux,win jdk8,jdk11 |
once sanity is passed I'm good for this to go in |
Dynamic array cloning transforms refArrayObject.clone() into `anewarray` followed by `arraycopy`, where refArrayObject is a resolved, not fixed class. For code gen, It requires the class (2nd) child of anewarray to be runtime evaluated instead of compile time constant, as actual refArrayObject type is not known at compiled time (other than it's an ref array). This change implements support in X86 codegen to handle a runtime evaluated class child of anewarray. This is prerequisite for the optimizer change that performs the object.clone() call transformation. Signed-off-by: Yan Luo <Yan_Luo@ca.ibm.com>
316217a
to
0709034
Compare
Jenkins test sanity xlinux,win jdk8,jdk11 |
Dynamic array cloning transforms refArrayObject.clone()
into
anewarray
followed byarraycopy
, where refArrayObjectis a resolved, not fixed class. For code gen, It requires the
class (2nd) child of anewarray to be runtime evaluated instead
of compile time constant, as actual refArrayObject type is not
known at compiled time (other than it's an ref array). This
change implements support in X86 codegen to handle a runtime
evaluated class child of anewarray.
This is prerequisite for the optimizer change that performs the
object.clone() call transformation.
Signed-off-by: Yan Luo Yan_Luo@ca.ibm.com