Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement X86 codegen support for dynamic array clone #4179

Merged
merged 1 commit into from
Jan 14, 2019

Conversation

yanluo7
Copy link
Contributor

@yanluo7 yanluo7 commented Jan 4, 2019

Dynamic array cloning transforms refArrayObject.clone()
into anewarray followed by arraycopy, where refArrayObject
is a resolved, not fixed class. For code gen, It requires the
class (2nd) child of anewarray to be runtime evaluated instead
of compile time constant, as actual refArrayObject type is not
known at compiled time (other than it's an ref array). This
change implements support in X86 codegen to handle a runtime
evaluated class child of anewarray.

This is prerequisite for the optimizer change that performs the
object.clone() call transformation.

Signed-off-by: Yan Luo Yan_Luo@ca.ibm.com

@yanluo7 yanluo7 force-pushed the dynamicarrayclone_openj9 branch 2 times, most recently from 1f7bd99 to 7d04ad5 Compare January 4, 2019 21:19
@yanluo7
Copy link
Contributor Author

yanluo7 commented Jan 4, 2019

The desired tree after transformation looks like following:
Note the aloadi <componentClass> child of anewarray node where we obtain the component class from the class of the receiver array object of object.clone() :

treetop ()
  anewarray  jitANewArray
    iloadi  <contiguous-array-size>
      ==>l2a (array object to be cloned)
    aloadi  <componentClass>
      ==>aloadi (vftload of array object)
treetop ()
  arraycopy  <arraycopy>
    ==>l2a
    ==>anewarray
    aladd
      ==>l2a
      lconst 8
    aladd 
      ==>anewarray
      ==>lconst 8
    lshl 
       i2l 
         ==>iloadi
       iconst 2 

// the original array object (array object to be copied). i.e. we are loading the component class from the original array class
// and pass the component class as 2nd child of anewarray. So the 2nd child must be evaluated into a reg.
bool dynamicArrayClone = (node->getOpCodeValue() == TR::anewarray) && (node->getSecondChild()->getSymbolReference()) &&
(node->getSecondChild()->getSymbolReference() == comp->getSymRefTab()->findArrayComponentTypeSymbolRef());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems a little optimistic - shouldn't we say that if it isn't a loadaddr (or possibly aconst) then it is dynamic and assert that we have clazz if the array is not dynamic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, added extra condition that checks for the class node op code for dynamic array, also assertion that rensure clazz cannot be null when not dynamic array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still seems pretty special-purpose. Do we need to require that the component type child be either constant or in this restricted form? Or could we use something like the following?

node->getOpCodeValue() == TR::anewarray && clazz == NULL

I think canAllocateInline() could set clazz = NULL whenever the child isn't loadaddr.

Copy link
Contributor Author

@yanluo7 yanluo7 Jan 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I revised the logic for evaluating the isDynamicArray boolean. Now we only initialize the boolean once in the top level NewEvaluator using the return values from canAllocateInline(). This way checking the ArrayComponentType symbol occurs only once in canAllocateInline(). And the NewEvaluator passes this flag down to helper routines which only need to check the flag to do the right thing. Currently the boolean is named as array-specific since only dynamic reference array allocations are supported for now. When we implement dynamic allocation for other object we can rename it properly -- just isDynamicAllocation.


TR_X86OpCodes op = (TR::Compiler->target.is64Bit() && fej9->generateCompressedLockWord()) ? S4MemImm4 : SMemImm4();
generateMemImmInstruction(op, node, generateX86MemoryReference(objectReg, lwOffset, cg), initialLwValue, cg);
if (initLw)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think some comments about lock word handling are warranted along with a description of the interaction with lock reservation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more comments to address lock reservation case.

@andrewcraik
Copy link
Contributor

@0dvictor could I get your thoughts on this please?

#if defined (J9VM_THR_LOCK_NURSERY)
generateRegMemInstruction(LRegMem(), node, tempReg, generateX86MemoryReference(clzReg, offsetof(J9ArrayClass, lockOffset), cg), cg);
#else
generateRegImmInstruction(MOVRegImm4(), node, tempReg, TMP_OFFSETOF_J9OBJECT_MONITOR, cg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, tempReg holds a compile-time constant, so that the branching logic below can be fold away.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the !J9VM_THR_LOCK_NURSERY case altogether. The flag is set on all platforms. Also modified VP to not generate dynamic objects if flag is not set.

generateRegImmInstruction(MOVRegImm4(), node, tempReg, TMP_OFFSETOF_J9OBJECT_MONITOR, cg);
#endif
generateRegImmInstruction(CMPRegImm4(), node, tempReg, (int32_t)-1, cg);
generateLabelInstruction (JE4, node, doneLabel, true, cg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generateLabelInstruction with needsVMThreadRegister parameter has been deprecated; hence this line should be:
generateLabelInstruction (JE4, node, doneLabel, cg);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

generateRegImmInstruction(CMPRegImm4(), node, tempReg, (int32_t)-1, cg);
generateLabelInstruction (JE4, node, doneLabel, true, cg);
generateRegMemInstruction(LEARegMem(), node, tempReg, generateX86MemoryReference(objectReg, tempReg, 0, cg) ,cg);
TR_X86OpCodes op = (TR::Compiler->target.is64Bit() && fej9->generateCompressedLockWord()) ? S4MemImm4 : SMemImm4();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic may be simplified as:
SMemImm4(TR::Compiler->target.is64Bit() && !fej9->generateCompressedLockWord())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

if (initReservable)
initialLwValue = OBJECT_HEADER_LOCK_RESERVED;

TR_X86OpCodes op = (TR::Compiler->target.is64Bit() && fej9->generateCompressedLockWord()) ? S4MemImm4 : SMemImm4();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic may be simplified as:
SMemImm4(TR::Compiler->target.is64Bit() && !fej9->generateCompressedLockWord())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

{
clzReg = tempReg;
generateRegMemInstruction(LRegMem(), node, clzReg, generateX86MemoryReference(classReg, offsetof(J9Class, arrayClass), cg), cg);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it stands, this will generate the load twice if dynamic array clone occurs in "old" AOT. Could it be combined with the path that generates the corresponding load above? e.g.

if (classReg != NULL && node->getOpCodeValue() == TR::anewarray)
   {
   // Variable component type, due either to dynamicArrayClone or (non-SVM) AOT.
   // Load the array type from it at runtime.
   clzReg = tempReg;
   generateRegMemInstruction(LRegMem(), node, clzReg, generateX86MemoryReference(classReg, offsetof(J9Class, arrayClass), cg), cg);
   }
else if (cg->needClassAndMethodPointerRelocations() && !comp->getOption(TR_UseSymbolValidationManager))
   {
   // ... [check for TR::newarray and TR::New, since TR::anewarray has been ruled out] ...
   }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revised the logic as suggested.

// the original array object (array object to be copied). i.e. we are loading the component class from the original array class
// and pass the component class as 2nd child of anewarray. So the 2nd child must be evaluated into a reg.
bool dynamicArrayClone = (node->getOpCodeValue() == TR::anewarray) && (node->getSecondChild()->getSymbolReference()) &&
(node->getSecondChild()->getSymbolReference() == comp->getSymRefTab()->findArrayComponentTypeSymbolRef());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still seems pretty special-purpose. Do we need to require that the component type child be either constant or in this restricted form? Or could we use something like the following?

node->getOpCodeValue() == TR::anewarray && clazz == NULL

I think canAllocateInline() could set clazz = NULL whenever the child isn't loadaddr.

generateLabelInstruction (JE4, node, doneLabel, true, cg);
generateRegMemInstruction(LEARegMem(), node, tempReg, generateX86MemoryReference(objectReg, tempReg, 0, cg) ,cg);
TR_X86OpCodes op = (TR::Compiler->target.is64Bit() && fej9->generateCompressedLockWord()) ? S4MemImm4 : SMemImm4();
generateMemImmInstruction(op, node, generateX86MemoryReference(tempReg, 0, cg), 0, cg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get away without the lea just above, and generate

mov ... ptr [rObject + rTemp], 0

instead of

lea rTemp, [rObject + rTemp]
mov ... ptr [rTemp], 0

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@yanluo7 yanluo7 force-pushed the dynamicarrayclone_openj9 branch from 27ea288 to 2660839 Compare January 10, 2019 01:08
@andrewcraik
Copy link
Contributor

@yanluo7 could you comment on what has changed and respond to the review questions above so we can continue reviewing this change?

@yanluo7
Copy link
Contributor Author

yanluo7 commented Jan 10, 2019

@andrewcraik yes, forgot to hit submit

@andrewcraik
Copy link
Contributor

@0dvictor and @jdmpapin could you review the latest changes to see if they address your comments on this item previously? thanks!

@yanluo7 yanluo7 force-pushed the dynamicarrayclone_openj9 branch from 2660839 to 8f44d5e Compare January 10, 2019 16:23
@@ -650,6 +650,14 @@ J9::Compilation::canAllocateInline(TR::Node* node, TR_OpaqueClassBlock* &classIn
classRef = node->getSecondChild();
classSymRef = classRef->getSymbolReference();

// In the case of dynamic array cloning, second child is the array component type load,
// return 0 indicating variable dynamic array allocation
if (classSymRef && (classRef->getOpCodeValue() == TR::aloadi) && (classSymRef == self()->getSymRefTab()->findArrayComponentTypeSymbolRef()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition still requires aloadi <arrayComponentType>, but the following code assumes that classRef is a loadaddr. So this will break if we ever get a variable component type child with a different form. AFAICT the fix for such a situation would be to weaken this condition to classRef->getOpCodeValue() != TR::loadaddr (and probably also to move the getSymbolReference() call after the conditional, to a point where we know that classRef is in fact a loadaddr, and therefore that it has a symref). Is there any particular reason we shouldn't do that now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More generic, done.

// For dynamic array allocation, load the array class from the component class and store into clzReg
if (isDynamicAllocation)
{
clzReg = tempReg;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we assert here that node->getOpCodeValue() == TR::anewarray? Just because if (isDynamicAllocation) sounds more generic, but only anewarray is handled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -7293,7 +7314,7 @@ J9::X86::TreeEvaluator::VMnewEvaluator(
TR_OpaqueClassBlock *clazz = NULL;
TR::Register *classReg = NULL;
bool isArrayNew = false;

bool dynamicArrayAllocation = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this isn't used between here and the real initialization below, would you mind moving the declaration down? Right now it looks like something might read this false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -7697,6 +7723,7 @@ J9::X86::TreeEvaluator::VMnewEvaluator(
dataOffset,
tempReg,
monitorSlotIsInitialized,
dynamicArrayAllocation,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you pass true here, then the path in genInitObjectHeader() that loads the array type for "old" AOT can be deleted, since we'll start using the path that does it for dynamicArrayAllocation instead. Personally I think that would be more straightforward because effectively what old AOT is doing here is forcing the allocation to be dynamic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

@yanluo7 yanluo7 force-pushed the dynamicarrayclone_openj9 branch from 8f44d5e to 316217a Compare January 10, 2019 17:48
Copy link
Contributor

@0dvictor 0dvictor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yanluo7
Copy link
Contributor Author

yanluo7 commented Jan 10, 2019

@andrewcraik Victor and Devin have approved the change. If you're OK with it, this can be merged.

@andrewcraik
Copy link
Contributor

Jenkins test sanity xlinux,win jdk8,jdk11

@andrewcraik
Copy link
Contributor

once sanity is passed I'm good for this to go in

Dynamic array cloning transforms refArrayObject.clone()
into `anewarray` followed by `arraycopy`, where refArrayObject
is a resolved, not fixed class. For code gen, It requires the
class (2nd) child of anewarray to be runtime evaluated instead
of compile time constant, as actual refArrayObject type is not
known at compiled time (other than it's an ref array). This
change implements support in X86 codegen to handle a runtime
evaluated class child of anewarray.

This is prerequisite for the optimizer change that performs the
object.clone() call transformation.

Signed-off-by: Yan Luo <Yan_Luo@ca.ibm.com>
@yanluo7 yanluo7 force-pushed the dynamicarrayclone_openj9 branch from 316217a to 0709034 Compare January 14, 2019 15:39
@andrewcraik
Copy link
Contributor

Jenkins test sanity xlinux,win jdk8,jdk11

@andrewcraik andrewcraik merged commit cd026c7 into eclipse-openj9:master Jan 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants