-
Notifications
You must be signed in to change notification settings - Fork 738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimized checkcast and instanceof on X86 #2361
Conversation
This PR requires eclipse-omr/omr#2732 and eclipse-omr/omr#2733 |
eclipse-omr/omr#2732 has been merged, waiting for eclipse-omr/omr#2733. |
Both OMR dependencies have been merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Jenkins test sanity xlinux,win |
Jenkins test sanity win32 |
@0dvictor build failures in the Linux_x86-64 not sure if something is broken in the mainline or if you need to rebase forward. Either way could you take a look? |
The failures are because the dependent OMR changes have not been promoted to OpenJ9 yet. I will monitor it and re-base once they are ready. |
@r30shah could you take a quick peek and see if we're missing any of these for Z? |
I'll put myself as the assignee for this one given that @andrewcraik is away for the next few weeks. Please let me know when this happens and we'll re-launch the build and merge this in. |
1. Introduced a call site cache to cache last J9Class encountered for Interface; 2. Inlined iTable lookup for Interface, eliminating the need calling JIT helper; 3. Reduced number of branches needed for Class. Signed-off-by: Victor Ding <dvictor@ca.ibm.com>
Since the dependent OMR changes have been promoted, this pull request has been rebased and re-tested. It is ready for merging. |
Jenkins test sanity xlinux |
Squashing all commits following original pull request candidate commit (2018-04-03). Changes included in this commit are documented in eclipse-openj9#2664 and eclipse-openj9#2361. As of this commit, 64-bit builds are working and performing fine, consistently passing all git tests in internal x86_64, ppc and zos 64-bit builds for linux. Recent changes to scavenger (outside of evacuator, possibly introduction of scavenger delegate or changes relating to heap resizing) have broken 32-bit builds. This bug is present in evacuator builds regardless of whether evacuator or scavenger is selected as algorithm for gencon generational collector (scavenger is default as of this commit, evacuator is selected using `-XXgc:recursiveScanOrdering`. The 32-bit bug can be reproduced by running an evacuator-enabled build on specJVM2008/xml.validation with default heap size. There will be a segfault on NULL pointer dereference after a few GC cycles after concuuent marking is kicked off. When this is run with `-Xcheck:gc`, gccheck reports no errors up to point of failure, which occurs after the stop-the-world phase for most recent nursery collection completes. Signed-off-by: Kim Briggs <briggs@ca.ibm.com>
PR eclipse-openj9#2361 introduced a number of JIT performance enhancements for checkcasts involving interface classes. Two of the most significant changes were the introduction of a one-slot, dynamic cache to cache the last successful instance class that matched the cast class. It also inlined the itable walk to determine if the interface class implemented the cast class. Each time a successful checkcast was performed the cache would be updated with that result at runtime (it was an LRU cache). This cache was implemented as a single, 8-byte data snippet located in the code cache. The performance of this approach may be acceptable for checkcast sites that do not see more than one instance class, or when there is only a single thread of execution along this path. However, in the presence of multiple threads and multiple instance classes the performance penalty of multiple threads writing to the same data address (the class cache) is quite significant due to hardware cache coherency protocols. In fact, the performance overhead gets significantly worse the more threads that are involved. There is no information provided in eclipse-openj9#2361 for what motivated the change or the workload it was expected to benefit. This PR modifies that implementation as follows: * Consult available profiling information to determine how many instance classes may be seen by this checkcast/instanceof site. If there are more than one then do not use a cache. If there is no information available then do not use a cache as the characteristics are unknown. * If there is a single profiled instance class then pre-populate the cache with that class. Do not update the cache at runtime. * If a cache is not used, then inline the interface table walk in mainline code rather than from outlined instructions. Three environment variables have been introduced to control behaviour of this evaluator: 1) `TR_updateInterfaceCheckCastCacheSlot` : when set the interface cast cache slot will be updated at runtime (i.e., this is the original behaviour) 2) `TR_forceDisableInterfaceCastCache` : never use the cache regardless of profiling information. 3) `TR_forceEnableInterfaceCastCache` : force the use of the cache regardless of profiling information. 2) and 3) are mutually exclusive. Behaviour is undefined if both are set. Using 1) and 3) will restore original behaviour. Signed-off-by: Daryl Maier <maier@ca.ibm.com>
Signed-off-by: Victor Ding dvictor@ca.ibm.com