Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't turn off OSR for Quad #3946

Merged
merged 1 commit into from
Jan 10, 2019
Merged

Don't turn off OSR for Quad #3946

merged 1 commit into from
Jan 10, 2019

Conversation

liqunl
Copy link
Contributor

@liqunl liqunl commented Dec 6, 2018

Quad optimizations turns a call to Quad method into arithmetic
operations, which is not an OSR point. Therefore no OSR bookkeeping will
be done for Quad and there is no interaction between Quad and OSR. It is
safe to turn OSR on even when Quad is encountered.

Signed-off-by: Liqun Liu liqunl@ca.ibm.com

@liqunl
Copy link
Contributor Author

liqunl commented Dec 6, 2018

@andrewcraik @vijaysun-omr Can I have a review?

@liqunl
Copy link
Contributor Author

liqunl commented Dec 6, 2018

Quad methods are used in DivisionLong. The change has been tested in RSABench which extensively uses DivisionLong. There is no performance impact with this change. Trace log shows that Quad optimization kicks in under OSR, there is no gc point in the range where Quad is used.

@andrewcraik
Copy link
Contributor

Concept seems fine, but I think @vijaysun-omr has a deeper understanding of these opts so would value his opinion.

@vijaysun-omr
Copy link
Contributor

@liqunl so you have looked at "all" uses of Quad in the BigInteger class manually and reasoned that there cannot be any bad interaction with OSR ? The tricky point I think is that the local variable slot that was supposed to point at a Quad object as per the bytecodes obviously won't be doing so since the JIT special handling of that class avoids objects and does equivalent logic using primitives. While you may have verified that a Quad object is never live in the sense of "going to be used" by bytecodes that are downstream of an OSR point, I am wondering about what will be the contents of such a local variable slot if a GC occurred after the OSR happens. i.e. what will be in the interepreter's frame slot for a local variable that ought to have been pointing at a Quad object. We need to make sure the GC does'nt crash, i.e. the slot cannot contain garbage.

@liqunl
Copy link
Contributor Author

liqunl commented Dec 6, 2018

@vijaysun-omr
Yes. Quad is used to only produce an intermediate result which is to be stored in a primitive.

Quad is used in a way that when it's optimized by the JIT, no invalidation point exists in the sequence of Quad calls. That means we won't OSR during Quad operations. After Quad operations finish, the Quad object is dead in the sense of java code. If we're going to OSR right after Quad operations finish, interpreter will not going to use the Quad object generated by earlier bytecodes.

Since the Quad object is dead when we transition to the interpreter, gc won't scan its slot. Or at least gc won't assume that slot contains a Quad object if that slot is reused for something else.

In fact, if we forget about OSR, even if there is a gc point in the middle of the sequence of Quad calls, since no Quad object is generated, the auto slot stores a primitive, which makes the slot not collected reference, thus the slot will never show up in the gcmap.

The following is a piece of code to show you how Quad is used

long m1, m2;
long h;
{
   Quad z = Quad.mul(m1, m2);
   h = Quad.hi(z);
}

As you see, Quad is usually used in a local scope. After the result is stored in a primitive declared outside of the local scope, the Quad object's job has finished. If we want to do another Quad operations, we'll do something similar.

Here's a mapping of java bytecode to trees after Quad optimization. As you can see, we use lstore to store the result of Quad.add to an auto.

JBinvokestatic  // com/ibm/Compiler/Internal/Quad.add(JJ)Lcom/ibm/Compiler/Internal/Quad;
JBastore

to

n116n     lstore  <auto slot 17>[#637  Auto] [flags 0x4 0x0 ] 
n115n       luaddc ()                                         
n112n         lconst 0                                        
n112n         ==>lconst 0
n114n         computeCC
...

@vijaysun-omr
Copy link
Contributor

Are you sure about "...since the Quad object is dead when we transition to the interpreter, gc won't scan its slot." ? This is true in the JIT compiled code because we generate our own GC stack maps using liveness done by the JIT. I just wanted to be sure that the interpreter behaviour is as you stated.

@liqunl
Copy link
Contributor Author

liqunl commented Dec 7, 2018

@vijaysun-omr After talking to @andrewcraik and @cathyzhyi, I think there is a problem if gc occurs right after we transition to the interpreter. The slot is dead to the JIT. After the Quad optimization, primitive will be stored to the slot. In the prospective of the JIT, the slot contains a primitive and is dead on all paths to OSR code block, we don't zero it out. So the slot in the OSR buffer will contain a primitive instead of a reference.

Currently, we only zero out reference type dead slots. One solution is to treat Quad a special case and zero out all dead slots if we've seen Quad.

@liqunl liqunl changed the title Don't turn off OSR for Quad WIP: Don't turn off OSR for Quad Dec 7, 2018
@liqunl
Copy link
Contributor Author

liqunl commented Dec 7, 2018

@vijaysun-omr Asked @gacholio, If a slot is not read again as an object, it is considered dead in interpreter's gcmap (any write to that slot will invalidate that slot), then gc won't scan the slot. The gcmap is calculated the same way as the JIT, so there should be no problem if a gc occurs right after the OSR transition.

@liqunl liqunl changed the title WIP: Don't turn off OSR for Quad Don't turn off OSR for Quad Dec 7, 2018
@gacholio
Copy link
Contributor

gacholio commented Dec 7, 2018

Note this is true only in normal mode. In debug (FSD) mode, we use a different local map strategy that tracks the values written to slots, rather than how the slot is later used.

@liqunl
Copy link
Contributor Author

liqunl commented Dec 14, 2018

@vijaysun-omr Quad optimization is off under FSD, this change is still safe.

@DanHeidinga
Copy link
Member

In debug (FSD) mode, we use a different local map strategy that tracks the values written to slots, rather than how the slot is later used.

There is also an option that enables that mapping strategy in PR #3625 so we should be careful about using FSD mode to determine if this is safe or not.

@liqunl
Copy link
Contributor Author

liqunl commented Dec 15, 2018

@DanHeidinga Is there an API to tell if the debug local mapper is enabled? I can disable Quad if that's on.

@gacholio
Copy link
Contributor

@liqunl
Copy link
Contributor Author

liqunl commented Jan 2, 2019

In #3625 (comment), I talked about why -XX:DebugLocalMapper will cause problems in voluntary OSR and that the JIT can't zero out dead o-slots because of performance. This PR is blocking the field folding stuff, which is a .12 item. @gacholio @DanHeidinga is #3625 a .12 item? If not, can we get this PR in and discuss #3625 later?

Quad optimizations turns a call to Quad method into arithmetic
operations, which is not an OSR point. Therefore no OSR bookkeeping will
be done for Quad and there is no interaction between Quad and OSR. It is
safe to turn OSR on even when Quad is encountered.

Signed-off-by: Liqun Liu <liqunl@ca.ibm.com>
@liqunl
Copy link
Contributor Author

liqunl commented Jan 3, 2019

@vijaysun-omr The problem in current implementation of #3625 affects voluntary OSR in general, not limited to Quad. In #3625 (comment), GAC proposed two approaches to work around the problem. No JIT change is required for either approach. Thus I think this change is safe to be merged.

@vijaysun-omr
Copy link
Contributor

@liqunl I am fine with this change given the above explanation.

@liqunl
Copy link
Contributor Author

liqunl commented Jan 3, 2019

@andrewcraik Could you review again?

Copy link
Contributor

@andrewcraik andrewcraik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once sanity passes

@andrewcraik
Copy link
Contributor

Jenkins test sanity xlinux,win,plinux jdk8,jdk11

@andrewcraik
Copy link
Contributor

@liqunl do the failures in the x86-64 linux look related?

@liqunl
Copy link
Contributor Author

liqunl commented Jan 10, 2019

The x86-64 failures are classNotFoundException in shared class cache tests, not related to this the change in this PR.

 [ERR] java.lang.ClassNotFoundException: A
 [ERR] 	at CustomClassloaders.CustomPartitioningURLClassLoader.findClass(CustomPartitioningURLClassLoader.java:393)
 [ERR] 	at java.base/java.lang.ClassLoader.loadClassHelper(ClassLoader.java:1134)
 [ERR] 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:1049)
 [ERR] 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:1032)

@andrewcraik
Copy link
Contributor

ok - we can merge this then

@andrewcraik andrewcraik merged commit 16eaa1c into eclipse-openj9:master Jan 10, 2019
@liqunl liqunl deleted the quad branch January 24, 2019 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants