Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimized checkcast and instanceof on X86 #2361

Merged
merged 1 commit into from
Jul 23, 2018

Conversation

0dvictor
Copy link
Contributor

@0dvictor 0dvictor commented Jul 9, 2018

  1. Introduced a call site cache to cache last J9Class encountered for Interface;
  2. Inlined iTable lookup for Interface, eliminating the need calling JIT helper;
  3. Reduced number of branches needed for Class.

Signed-off-by: Victor Ding dvictor@ca.ibm.com

@0dvictor
Copy link
Contributor Author

0dvictor commented Jul 9, 2018

This PR requires eclipse-omr/omr#2732 and eclipse-omr/omr#2733

@fjeremic fjeremic added comp:jit depends:omr Pull request is dependent on a corresponding change in OMR labels Jul 10, 2018
@0dvictor
Copy link
Contributor Author

0dvictor commented Jul 10, 2018

eclipse-omr/omr#2732 has been merged, waiting for eclipse-omr/omr#2733.

@0dvictor
Copy link
Contributor Author

Both OMR dependencies have been merged.

Copy link
Contributor

@andrewcraik andrewcraik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@andrewcraik
Copy link
Contributor

Jenkins test sanity xlinux,win

@andrewcraik
Copy link
Contributor

Jenkins test sanity win32

@andrewcraik
Copy link
Contributor

@0dvictor build failures in the Linux_x86-64 not sure if something is broken in the mainline or if you need to rebase forward. Either way could you take a look?

@0dvictor
Copy link
Contributor Author

The failures are because the dependent OMR changes have not been promoted to OpenJ9 yet. I will monitor it and re-base once they are ready.

@fjeremic
Copy link
Contributor

@r30shah could you take a quick peek and see if we're missing any of these for Z?

@fjeremic fjeremic self-assigned this Jul 13, 2018
@fjeremic
Copy link
Contributor

fjeremic commented Jul 13, 2018

The failures are because the dependent OMR changes have not been promoted to OpenJ9 yet. I will monitor it and re-base once they are ready.

I'll put myself as the assignee for this one given that @andrewcraik is away for the next few weeks. Please let me know when this happens and we'll re-launch the build and merge this in.

1. Introduced a call site cache to cache last J9Class encountered for Interface;
2. Inlined iTable lookup for Interface, eliminating the need calling JIT helper;
3. Reduced number of branches needed for Class.

Signed-off-by: Victor Ding <dvictor@ca.ibm.com>
@0dvictor
Copy link
Contributor Author

Since the dependent OMR changes have been promoted, this pull request has been rebased and re-tested. It is ready for merging.

@fjeremic
Copy link
Contributor

Jenkins test sanity xlinux

@fjeremic fjeremic merged commit 6b5c844 into eclipse-openj9:master Jul 23, 2018
@0dvictor 0dvictor deleted the checkcast branch July 23, 2018 15:13
ktbriggs added a commit to ktbriggs/openj9 that referenced this pull request Mar 20, 2019
Squashing all commits following original pull request candidate commit
(2018-04-03). Changes included in this commit are documented in eclipse-openj9#2664
and eclipse-openj9#2361.

As of this commit, 64-bit builds are working and performing fine, 
consistently passing all git tests in internal x86_64, ppc and zos 
64-bit builds for linux. Recent changes to scavenger (outside of 
evacuator, possibly introduction of scavenger delegate or changes 
relating to heap resizing) have broken 32-bit builds. This bug is 
present in evacuator builds regardless of whether evacuator or scavenger
is selected as algorithm for gencon generational collector (scavenger is
default as of this commit, evacuator is selected using
`-XXgc:recursiveScanOrdering`.

The 32-bit bug can be reproduced by running an evacuator-enabled build
on specJVM2008/xml.validation with default heap size. There will be a 
segfault on NULL pointer dereference after a few GC cycles after 
concuuent marking is kicked off. When this is run with `-Xcheck:gc`,
gccheck reports no errors up to point of failure, which occurs after the
stop-the-world phase for most recent nursery collection completes. 

Signed-off-by: Kim Briggs <briggs@ca.ibm.com>
0xdaryl added a commit to 0xdaryl/openj9 that referenced this pull request May 22, 2022
PR eclipse-openj9#2361 introduced a number of JIT performance enhancements for checkcasts
involving interface classes.  Two of the most significant changes were the
introduction of a one-slot, dynamic cache to cache the last successful instance
class that matched the cast class.  It also inlined the itable walk to determine
if the interface class implemented the cast class.

Each time a successful checkcast was performed the cache would be updated with
that result at runtime (it was an LRU cache).  This cache was implemented as a
single, 8-byte data snippet located in the code cache.  The performance of this
approach may be acceptable for checkcast sites that do not see more than one
instance class, or when there is only a single thread of execution along this
path.  However, in the presence of multiple threads and multiple instance classes
the performance penalty of multiple threads writing to the same data address (the
class cache) is quite significant due to hardware cache coherency protocols.
In fact, the performance overhead gets significantly worse the more threads that
are involved.

There is no information provided in eclipse-openj9#2361 for what motivated the change or the
workload it was expected to benefit.

This PR modifies that implementation as follows:

* Consult available profiling information to determine how many instance classes
  may be seen by this checkcast/instanceof site.  If there are more than one then
  do not use a cache.  If there is no information available then do not use a
  cache as the characteristics are unknown.

* If there is a single profiled instance class then pre-populate the cache with
  that class.  Do not update the cache at runtime.

* If a cache is not used, then inline the interface table walk in mainline code
  rather than from outlined instructions.

Three environment variables have been introduced to control behaviour of this
evaluator:

1) `TR_updateInterfaceCheckCastCacheSlot` : when set the interface cast cache
   slot will be updated at runtime (i.e., this is the original behaviour)

2) `TR_forceDisableInterfaceCastCache` : never use the cache regardless of
   profiling information.

3) `TR_forceEnableInterfaceCastCache` : force the use of the cache regardless of
   profiling information.

2) and 3) are mutually exclusive.  Behaviour is undefined if both are set.

Using 1) and 3) will restore original behaviour.

Signed-off-by: Daryl Maier <maier@ca.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:jit depends:omr Pull request is dependent on a corresponding change in OMR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants