JTReg jdk11-m2 Failure: TestJlmRemoteMemoryAuth_0 #9046
Comments
https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_sanity.system_aarch64_linux_Personal/2/ was the last time I saw this failure. |
Yeah, it seems intermittent. I couldn't reproduce it in a single Grinder run: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/2721/ |
I ran this test 10 times manually on cent7-aarch64-1, and saw no failures. |
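Repeating the test by hand like this can also be scripted. A minimal sketch, assuming the test can be started from a single command line that exits non-zero on failure; the command used below is a hypothetical stand-in, not the real TestJlmRemoteMemoryAuth_0 invocation:

```python
import subprocess
import sys


def count_failures(cmd, runs):
    """Run cmd `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        # Output is discarded here; a real investigation would keep the logs.
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures


if __name__ == "__main__":
    # Hypothetical placeholder: replace with the actual test command.
    placeholder_cmd = [sys.executable, "-c", "import sys; sys.exit(0)"]
    runs = 10
    print(f"{count_failures(placeholder_cmd, runs)} failures in {runs} runs")
```

Counting exit codes only tells you how often the run fails; it does not distinguish this failure from unrelated ones, so the console output would still need checking.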
I managed to recreate this. Again, it's on |
I ran this test 20 times on test-packet-armv8-ubuntu-16-04, but I couldn't reproduce the failure.
|
I managed to reproduce this on cent7-aarch64-1.
|
The failure was recreated on cent7-aarch64-1 and on test-packet-armv8-ubuntu-16-04, but it takes a long time and the failure rate is around 1/10 to 1/20. I found this test failed with
My colleague and I ran the test repeatedly on those test servers with the following options:
We couldn't recreate failures with either of these options. |
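For context on how many clean runs are needed before "couldn't recreate" means much: with the 1/10 to 1/20 failure rate reported above, a short batch can easily pass by chance. The sketch below assumes each run fails independently with a fixed probability, which is a simplification and not something established in this thread; the function names are illustrative.

```python
import math


def prob_no_failure(p, runs):
    """Probability that `runs` independent runs all pass, given per-run failure rate p."""
    return (1.0 - p) ** runs


def runs_needed(p, confidence=0.95):
    """Smallest number of runs that shows at least one failure with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    for p in (1 / 10, 1 / 20):
        print(f"failure rate {p:.2f}:")
        for runs in (10, 20, 50):
            print(f"  {runs} runs all pass with probability {prob_no_failure(p, runs):.3f}")
        print(f"  runs needed for 95% chance of a failure: {runs_needed(p)}")
```

Under that assumption, 10 clean runs at a 1/10 failure rate happen about a third of the time, and at 1/20 it takes roughly 59 runs to have a 95% chance of seeing one failure.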
Both servers above have 96 CPU cores and OpenSSL 1.0.2. We used the following OpenJ9 build, which contains the fix for multi-threading with OpenSSL 1.0.2 ibmruntimes/openj9-openjdk-jdk11#283 :
I haven't been able to recreate this failure on a quad-core machine with OpenSSL 1.1.1. |
https://ci.eclipse.org/openj9/view/Test/job/Grinder/796 |
This issue is difficult to reproduce (it has only been seen on a machine with a large number of cores) and may be related to the version of OpenSSL (for example, we have not seen this problem on the machine on which it failed running with OpenSSL 1.1.1). We'll leave it open for a few more days to try to collect more data, but given that AArch64 is still early access there is a good chance this will be deferred. |
#9769 reports a failure with |
0.21.0 m2 build https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_sanity.system_aarch64_linux_Personal/5/
|
0.21.0 m2 build https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_sanity.system_aarch64_linux_xl_Personal/2
|
I ran the test on cent7-aarch64-1 50 times with
|
Moving this forward as we've completed the milestone 2 builds for 0.21.0 and it's too late to put this in. |
This is occurring intermittently on AdoptOpenJDK nightly builds (e.g. https://ci.adoptopenjdk.net/job/Test_openjdk11_j9_sanity.system_aarch64_linux_xl/207/consoleFull). I have gone through the most recent runs and only found it occurring on machines test-packet-ubuntu1604-armv8-1 and test-packet-ubuntu1604-armv8-2, i.e. Ubuntu 16.04 machines. I did not see any failures on the other AdoptOpenJDK test machines, which run RHEL 7.6. |
Do we need this in the milestone plan? I'll remove it. |
Based on the investigations so far, the strong suspicion on this one is that it is an OpenSSL issue. We haven't found any evidence to suggest it is an OpenJ9 problem, hence deferring is the recommended approach. One useful data point would be the version of OpenSSL found on the Ubuntu 16.04 boxes at AdoptOpenJDK and on the RHEL 7.6 boxes. |
The AArch64 builds use |
Perhaps so. |
Just to add something extra to this - the problem appears to only show up on the ThunderX aarch64 systems (not sure which ones OpenJ9 has, but if it's the 96-core ones then it's likely to be ThunderX), so it's possible that a fix will be out of our control if it's hardware related. I may try upgrading one of ours to a later level (I'll do it under but I'm not overly confident that it will really change anything. Current infrastructure issue at adoptopenjdk is adoptium/infrastructure#1897 - it may ultimately be a |
Re-reading Daryl's comment #9046 (comment), I realized I misunderstood the options and my earlier comment was bogus. What is the OpenSSL version available on the build system? Considering it only fails on one target, it's likely the fault of the OpenSSL installed on the target, but I'm not sure of the side effects of compiling against an outdated OpenSSL version. |
As per recent updates on the infrastructure issue, I would not assume it's related to the version that OpenJ9 is being linked with, as I'm able to reproduce the same failures using the latest openssl111i codebase outside OpenJ9 on multiple ThunderX systems. |
Failure link
https://ci.adoptopenjdk.net/job/Test_openjdk11_j9_sanity.system_aarch64_linux_xl/73
Optional info
Failure output (captured from console output)