JDK11 Segmentation error vmState=0x000514ff #15364

Closed
connglli opened this issue Jun 20, 2022 · 9 comments · Fixed by eclipse/omr#6620
Labels
comp:jit, segfault (Issues that describe segfaults / JVM crashes), userRaised

Comments

@connglli

Java -version output

openjdk version "11.0.16-internal" 2022-07-19
OpenJDK Runtime Environment (build 11.0.16-internal+0-adhoc..openj9-openjdk-jdk11)
Eclipse OpenJ9 VM (build master-4ca209b54, JRE 11 Linux amd64-64-Bit Compressed References 20220615_000000 (JIT enabled, AOT enabled)
OpenJ9   - 4ca209b54
OMR      - 26b89f9f9
JCL      - 231dcc9eeb based on jdk-11.0.16+6)

Summary of problem

The following Test.java, which we reduced from a larger test, crashes OpenJ9's JIT compiler:

class Test {
  public static final int N = 256;
  static long instanceCount = 1103254304L;
  public static volatile short sFld = 26731;
  public static volatile double dFld = 2.108137;
  static short[] sArrFld = new short[N];
  static boolean[] bArrFld = new boolean[N];
  int[] iArrFld = new int[N];
  long[] lArrFld = new long[N];
  static long vMeth_check_sum;
  static long vMeth1_check_sum;
  long vMeth2_check_sum;

  static void vMeth1() {
    int i1, i2 = 84, iArr2[] = new int[N];
    byte by1 = 10, byArr[] = new byte[N];
    float f2 = 1.140F, fArr[] = new float[N];
    boolean b1 = false;
    long[] lArr1 = new long[N];
    init(iArr2, 8);
    init(lArr1, 8486100487510871511L);
    for (i1 = 247; i1 > 10; i1--) Test.instanceCount = i2;
    vMeth1_check_sum += Double.doubleToLongBits(checkSum(fArr));
  }

  void vMeth(int i) {
    int i11, i12 = 741, i16 = 0, i17 = 107, i18 = 82, i20 = 58938, i21 = 11, i22 = 14;
    float f3, fArr1[] = new float[N];
    double[] dArr = new double[N];
    for (i11 = 4; i11 < 100; i11++) f3 = i12;
    for (i17 = 1; i17 < 131; ++i17) i = (3 % i11);
    vMeth_check_sum += Double.doubleToLongBits(checkSum(fArr1));
  }

  void mainTest(String[] strArr1) {
    int i23 = 48109;
    vMeth(i23);
  }

  public static void main(String[] strArr) {
    Test _instance = new Test();
    for (int i = 0; ; ) {
      _instance.mainTest(strArr);
    }
  }

  public static void init(long[] a, long seed) {
    for (int j = 0; j < a.length; j++) {
      a[j] = (j % 2 == 0) ? seed + j : seed - j;
    }
  }

  public static void init(int[] a, int seed) {
    for (int j = 0; j < a.length; j++) {
      a[j] = (j % 2 == 0) ? seed + j : seed - j;
    }
  }

  public static double checkSum(float[] a) {
    double sum = 0;
    for (int j = 0; j < a.length; j++) {
      sum += (a[j] / (j + 1) + a[j] % (j + 1));
    }
    return sum;
  }

  static Boolean ax$0;
  Boolean ax$14 = false;
}

Diagnostic files

Running

$ java -Xmx1G -Xshareclasses:none Test

produces the following crash log:

#0: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8bfda5) [0x7f1a27464da5]
#1: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8cb090) [0x7f1a27470090]
#2: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x139579) [0x7f1a26cde579]
#3: /zdata/congli/OpenJ9/jdk11/lib/default/libj9prt29.so(+0x2911a) [0x7f1a2ca2f11a]
#4: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f1a2ceb5420]
#5: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x80a386) [0x7f1a273af386]
#6: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x80a894) [0x7f1a273af894]
#7: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a7958) [0x7f1a2744c958]
#8: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8b9f24) [0x7f1a2745ef24]
#9: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x87eb8f) [0x7f1a27423b8f]
#10: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8b9fcb) [0x7f1a2745efcb]
#11: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8ba3f1) [0x7f1a2745f3f1]
#12: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a3dfe) [0x7f1a27448dfe]
#13: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a5b8a) [0x7f1a2744ab8a]
#14: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a5d37) [0x7f1a2744ad37]
#15: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a5b8a) [0x7f1a2744ab8a]
#16: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a6c54) [0x7f1a2744bc54]
#17: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a71d8) [0x7f1a2744c1d8]
#18: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a71d8) [0x7f1a2744c1d8]
#19: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a71d8) [0x7f1a2744c1d8]
#20: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a71d8) [0x7f1a2744c1d8]
#21: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a71d8) [0x7f1a2744c1d8]
#22: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a71d8) [0x7f1a2744c1d8]
#23: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a71d8) [0x7f1a2744c1d8]
#24: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a5b8a) [0x7f1a2744ab8a]
#25: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a6c54) [0x7f1a2744bc54]
#26: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8a6ee5) [0x7f1a2744bee5]
#27: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x8ab0c2) [0x7f1a274500c2]
#28: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x749af7) [0x7f1a272eeaf7]
#29: /zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so(+0x74a089) [0x7f1a272ef089]
Unhandled exception
Type=Segmentation error vmState=0x000514ff
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
Handler1=00007F1A2CC45FD0 Handler2=00007F1A2CA2EEF0 InaccessibleAddress=00007F1A06A45000
RDI=00007F1A05B95020 RSI=000000000000003D RAX=00000000001C2109 RBX=00007F1A0C936B70
RCX=00000000FFFFFD5D RDX=0000000000000000 R8=0000000000E10840 R9=00007F1A05BB5570
R10=00007F1A05C347C0 R11=00007F1A05BB56D8 R12=000000000000003D R13=00007F1A0C936CBF
R14=0000000000000000 R15=0000000020000000
RIP=00007F1A273AF386 GS=0000 FS=0000 RSP=00007F1A0C936AB8
EFlags=0000000000010206 CS=0033 RBP=00007F1A0C936C00 ERR=0000000000000007
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=00007F1A06A45000
xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm4 00007f1a05b89130 (f: 95981872.000000, d: 6.904555e-310)
xmm5 00007f1a05b87870 (f: 95975536.000000, d: 6.904555e-310)
xmm6 00007f1a05b87130 (f: 95973680.000000, d: 6.904555e-310)
xmm7 00007f1a05b87200 (f: 95973888.000000, d: 6.904555e-310)
xmm8 252e732a2e250073 (f: 774176896.000000, d: 1.372768e-129)
xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
Module=/zdata/congli/OpenJ9/jdk11/lib/default/libj9jit29.so
Module_base_address=00007F1A26BA5000

Method_being_compiled=Test.mainTest([Ljava/lang/String;)V
Target=2_90_20220615_000000 (Linux 5.4.0-120-generic)
CPU=amd64 (8 logical CPUs) (0x3e45ba000 RAM)
----------- Stack Backtrace -----------
_ZN13TR_UseDefInfo13getUseDef_refEiPN3CS210ABitVectorINS0_16shared_allocatorINS0_14heap_allocatorILm65536ELj12E17TRMemoryAllocatorIL17TR_AllocationKind1ELj12ELj28EEEEEEEE+0x26 (0x00007F1A273AF386 [libj9jit29.so+0x80a386])
_ZN13TR_UseDefInfo9getUseDefERN3CS210ABitVectorINS0_16shared_allocatorINS0_14heap_allocatorILm65536ELj12E17TRMemoryAllocatorIL17TR_AllocationKind1ELj12ELj28EEEEEEEEi+0x14 (0x00007F1A273AF894 [libj9jit29.so+0x80a894])
_ZN3OMR16ValuePropagation19mergeDefConstraintsEPN2TR4NodeEiRbb+0x3a8 (0x00007F1A2744C958 [libj9jit29.so+0x8a7958])
_ZN3OMR16ValuePropagation13getConstraintEPN2TR4NodeERbS3_+0x224 (0x00007F1A2745EF24 [libj9jit29.so+0x8b9f24])
_ZL22constrainIfcmplessthanPN3OMR16ValuePropagationEPN2TR4NodeES4_S4_b.constprop.225+0x137f (0x00007F1A27423B8F [libj9jit29.so+0x87eb8f])
_ZN3OMR16ValuePropagation10launchNodeEPN2TR4NodeES3_i+0x9b (0x00007F1A2745EFCB [libj9jit29.so+0x8b9fcb])
_ZN3OMR16ValuePropagation12processTreesEPN2TR7TreeTopES3_+0x191 (0x00007F1A2745F3F1 [libj9jit29.so+0x8ba3f1])
_ZN2TR22GlobalValuePropagation12processBlockEP24TR_StructureSubGraphNodebb+0x31e (0x00007F1A27448DFE [libj9jit29.so+0x8a3dfe])
_ZN2TR22GlobalValuePropagation21processRegionSubgraphEP24TR_StructureSubGraphNodebbb+0x10a (0x00007F1A2744AB8A [libj9jit29.so+0x8a5b8a])
_ZN2TR22GlobalValuePropagation18processNaturalLoopEP24TR_StructureSubGraphNodebb+0xb7 (0x00007F1A2744AD37 [libj9jit29.so+0x8a5d37])
_ZN2TR22GlobalValuePropagation21processRegionSubgraphEP24TR_StructureSubGraphNodebbb+0x10a (0x00007F1A2744AB8A [libj9jit29.so+0x8a5b8a])
_ZN2TR22GlobalValuePropagation20processAcyclicRegionEP24TR_StructureSubGraphNodebb+0x34 (0x00007F1A2744BC54 [libj9jit29.so+0x8a6c54])
_ZN2TR22GlobalValuePropagation17processRegionNodeEP24TR_StructureSubGraphNodebb+0x178 (0x00007F1A2744C1D8 [libj9jit29.so+0x8a71d8])
_ZN2TR22GlobalValuePropagation17processRegionNodeEP24TR_StructureSubGraphNodebb+0x178 (0x00007F1A2744C1D8 [libj9jit29.so+0x8a71d8])
_ZN2TR22GlobalValuePropagation17processRegionNodeEP24TR_StructureSubGraphNodebb+0x178 (0x00007F1A2744C1D8 [libj9jit29.so+0x8a71d8])
_ZN2TR22GlobalValuePropagation17processRegionNodeEP24TR_StructureSubGraphNodebb+0x178 (0x00007F1A2744C1D8 [libj9jit29.so+0x8a71d8])
_ZN2TR22GlobalValuePropagation17processRegionNodeEP24TR_StructureSubGraphNodebb+0x178 (0x00007F1A2744C1D8 [libj9jit29.so+0x8a71d8])
_ZN2TR22GlobalValuePropagation17processRegionNodeEP24TR_StructureSubGraphNodebb+0x178 (0x00007F1A2744C1D8 [libj9jit29.so+0x8a71d8])
_ZN2TR22GlobalValuePropagation17processRegionNodeEP24TR_StructureSubGraphNodebb+0x178 (0x00007F1A2744C1D8 [libj9jit29.so+0x8a71d8])
_ZN2TR22GlobalValuePropagation21processRegionSubgraphEP24TR_StructureSubGraphNodebbb+0x10a (0x00007F1A2744AB8A [libj9jit29.so+0x8a5b8a])
_ZN2TR22GlobalValuePropagation20processAcyclicRegionEP24TR_StructureSubGraphNodebb+0x34 (0x00007F1A2744BC54 [libj9jit29.so+0x8a6c54])
_ZN2TR22GlobalValuePropagation20determineConstraintsEv+0xa5 (0x00007F1A2744BEE5 [libj9jit29.so+0x8a6ee5])
_ZN2TR22GlobalValuePropagation7performEv+0x442 (0x00007F1A274500C2 [libj9jit29.so+0x8ab0c2])
_ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii+0x767 (0x00007F1A272EEAF7 [libj9jit29.so+0x749af7])
_ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii+0xcf9 (0x00007F1A272EF089 [libj9jit29.so+0x74a089])
_ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii+0xcf9 (0x00007F1A272EF089 [libj9jit29.so+0x74a089])
_ZN3OMR9Optimizer8optimizeEv+0x1db (0x00007F1A272F043B [libj9jit29.so+0x74b43b])
_ZN3OMR11Compilation7compileEv+0x925 (0x00007F1A270E5425 [libj9jit29.so+0x540425])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadPNS_11CompilationEP17TR_ResolvedMethodR11TR_J9VMBaseP19TR_OptimizationPlanRKNS_16SegmentAllocatorE+0x4bf (0x00007F1A26CF27EF [libj9jit29.so+0x14d7ef])
_ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv+0x314 (0x00007F1A26CF3834 [libj9jit29.so+0x14e834])
omrsig_protect+0x1e3 (0x00007F1A2CA2FC53 [libj9prt29.so+0x29c53])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadP21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x309 (0x00007F1A26CF0F79 [libj9jit29.so+0x14bf79])
_ZN2TR24CompilationInfoPerThread12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x1c0 (0x00007F1A26CF15C0 [libj9jit29.so+0x14c5c0])
_ZN2TR24CompilationInfoPerThread14processEntriesEv+0x3b3 (0x00007F1A26CF00E3 [libj9jit29.so+0x14b0e3])
_ZN2TR24CompilationInfoPerThread3runEv+0x42 (0x00007F1A26CF05C2 [libj9jit29.so+0x14b5c2])
_Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0x82 (0x00007F1A26CF0672 [libj9jit29.so+0x14b672])
omrsig_protect+0x1e3 (0x00007F1A2CA2FC53 [libj9prt29.so+0x29c53])
_Z21compilationThreadProcPv+0x1d2 (0x00007F1A26CF0AB2 [libj9jit29.so+0x14bab2])
thread_wrapper+0x162 (0x00007F1A2CBF12B2 [libj9thr29.so+0xf2b2])
start_thread+0xd9 (0x00007F1A2CEA9609 [libpthread.so.0+0x8609])
---------------------------------------
JVMDUMP039I Processing dump event "gpf", detail "" at 2022/06/20 14:38:36 - please wait.
JVMDUMP032I JVM requested System dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/79.openj9/mutant/red/core.20220620.143836.2697058.0001.dmp' in response to an event
JVMDUMP010I System dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/79.openj9/mutant/red/core.20220620.143836.2697058.0001.dmp
JVMDUMP032I JVM requested Java dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/79.openj9/mutant/red/javacore.20220620.143836.2697058.0002.txt' in response to an event
JVMDUMP010I Java dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/79.openj9/mutant/red/javacore.20220620.143836.2697058.0002.txt
JVMDUMP032I JVM requested Snap dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/79.openj9/mutant/red/Snap.20220620.143836.2697058.0003.trc' in response to an event
JVMDUMP010I Snap dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/79.openj9/mutant/red/Snap.20220620.143836.2697058.0003.trc
JVMDUMP032I JVM requested JIT dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/79.openj9/mutant/red/jitdump.20220620.143836.2697058.0004.dmp' in response to an event
JVMDUMP051I JIT dump occurred in 'JIT Compilation Thread-001' thread 0x0000000000022300
JVMDUMP049I JIT dump notified all waiting threads of the current method to be compiled
JVMDUMP054I JIT dump is tracing the IL of the method on the crashed compilation thread
JVMDUMP048I JIT dump method being compiled is an ordinary method
JVMDUMP053I JIT dump is recompiling Test.mainTest([Ljava/lang/String;)V
JVMDUMP052I JIT dump recursive crash occurred on diagnostic thread
JVMDUMP010I JIT dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/79.openj9/mutant/red/jitdump.20220620.143836.2697058.0004.dmp
JVMDUMP013I Processed dump event "gpf", detail "".
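
As a cross-check on the log above, each `+0x...` offset in the raw frames should equal the absolute address minus `Module_base_address`. The sketch below redoes that arithmetic for the faulting instruction, using only values printed in the log. The class and method names are ours, for illustration only, and are not part of OpenJ9.

```java
// Illustrative helper (not part of OpenJ9): recompute a module-relative
// offset from values printed in the crash log.
class ModuleOffset {
    // Offset of an absolute address within a module loaded at moduleBase.
    static long offsetInModule(long address, long moduleBase) {
        return address - moduleBase;
    }

    public static void main(String[] args) {
        long rip = 0x00007F1A273AF386L;        // RIP from the register dump
        long moduleBase = 0x00007F1A26BA5000L; // Module_base_address
        // Prints 0x80a386, matching frame #5 (libj9jit29.so+0x80a386).
        System.out.println("0x" + Long.toHexString(offsetInModule(rip, moduleBase)));
    }
}
```

If a build of libj9jit29.so with debug symbols is available, an offset computed this way can be passed to a symbolizer such as addr2line.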

Please also check openj9-bug-79.tar.gz for all the logs (jitdump, snap, etc.), the test (Test.java, Test.class), and the unreduced test (Test.java.orig).

Notice

The reduced Test.java above reproduces the crash for us. If it does not reproduce for you, please use Test.java.orig from the archive linked above.

This crash has the same vmState as issue #15311, but a different stack trace and a different JDK version.

@connglli
Author

Sometimes neither the reduced nor the unreduced test reproduces the crash on a given run. Repeat the run a few times; it does eventually crash (at least on our desktop and server).

@pshipton added the comp:jit and segfault (Issues that describe segfaults / JVM crashes) labels Jun 20, 2022
@pshipton
Member

No problem reproducing the crash, which also occurs with jdk8. I also saw the following assertion failure:

Assertion failed at /home/jenkins/workspace/Build_JDK8_x86-64_linux_Release/openj9/runtime/compiler/env/PersistentAllocator.cpp:621: block->next() == NULL
VMState: 0x000514ff
	Freeing a block that is already on the free list. block=0x7febd328f6a0 next=0x7febd328f660
compiling Test.mainTest([Ljava/lang/String;)V at level: scorching

@pshipton
Member

@0xdaryl another one

@0xdaryl
Contributor

0xdaryl commented Jun 21, 2022

This is not a regression in 0.33. Reproducible on JDK8 back to at least 0.29.

@0xdaryl
Contributor

0xdaryl commented Jun 21, 2022

Original problem crashes during GVP. @hzongaro, could you add this to your list to investigate please?

@hzongaro
Member

Oops! I thought I had posted this comment yesterday. Anyway, I'm about to add a follow-on post with more details.

I took a quick look at this running in gdb. In that run I see the crash happening in a memcpy call from TR_BitVector::setChunkSize. It looks like the _numChunks field (and more) in the TR_BitVector is corrupted:

#0  __memcpy_ssse3 () at ../sysdeps/x86_64/multiarch/memcpy-ssse3.S:132
#1  0x00007fffef723d53 in TR_BitVector::setChunkSize (this=0x7fffd25d2570, chunkSize=chunkSize@entry=1)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/infra/BitVector.cpp:178
#2  0x00007fffef2aa630 in TR_BitVector::set (this=this@entry=0x7fffd25d2570, n=61)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/infra/BitVector.hpp:380
#3  0x00007fffef9a8ede in TR_UseDefInfo::getUseDef_ref_body (this=0x7fffd25b2020, useIndex=61, visitedDefs=0x7fffd25d2570, defs=defs@entry=0x0)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/UseDefInfo.cpp:2680
#4  0x00007fffef9a92c7 in TR_UseDefInfo::getUseDef_ref (this=<optimized out>, useIndex=useIndex@entry=61, defs=defs@entry=0x0)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/UseDefInfo.cpp:2670
#5  0x00007fffef9a9774 in TR_UseDefInfo::getUseDef (this=<optimized out>, useDef=..., useIndex=useIndex@entry=61)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/UseDefInfo.cpp:2657
  . . .
(gdb) list 178
175        if (_chunks)
176           {
177           uint32_t chunksToCopy = (chunkSize < _numChunks) ? chunkSize : _numChunks;
178           memcpy(newChunks, _chunks, chunksToCopy*sizeof(chunk_t));

(gdb) p chunkSize
$14 = 1
(gdb) p _numChunks
$15 = -765647408
(gdb) p/x _numChunks
$16 = 0xd25d25d0
(gdb) p *this
$18 = {static nullContainerCharacteristic = -1, _chunks = 0x7fffd26517c0, _region = 0x0, _numChunks = -765647408, _firstChunkWithNonZero = 1, 
  _lastChunkWithNonZero = -1, _growable = (growable | unknown: 32766)}
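
To make the gdb values above easier to read, here is a small sketch of the arithmetic. It assumes, without having checked the OMR source, that the comparison on line 177 is a signed 32-bit comparison whose result is then stored in the `uint32_t` `chunksToCopy`. Under that assumption the corrupted `_numChunks` is selected and reinterpreted as a very large chunk count, which would explain a fault inside `memcpy`. The class and method names are hypothetical.

```java
// Hypothetical model of the arithmetic around BitVector.cpp:177-178.
// ASSUMPTION: the comparison is signed 32-bit and the chosen value is then
// reinterpreted as unsigned, as the uint32_t declaration suggests.
class CorruptChunkCount {
    // Number of chunks memcpy would be asked to copy, as an unsigned value.
    static long chunksToCopy(int chunkSize, int numChunks) {
        int chosen = (chunkSize < numChunks) ? chunkSize : numChunks;
        return Integer.toUnsignedLong(chosen);
    }

    // Low 32 bits of a 64-bit address, as a signed int (how gdb printed _numChunks).
    static int low32(long address) {
        return (int) address;
    }

    public static void main(String[] args) {
        int numChunks = -765647408;                            // gdb: p _numChunks
        System.out.println(Integer.toHexString(numChunks));    // d25d25d0, as in p/x
        System.out.println(chunksToCopy(1, numChunks));        // 3529319888 chunks
        // Observation only: the value equals the low half of (this + 0x60),
        // which is consistent with, but does not prove, a pointer having
        // been written over the field.
        System.out.println(low32(0x7fffd25d2570L + 0x60) == numChunks); // true
    }
}
```

The byte count passed to `memcpy` would be this chunk count multiplied by `sizeof(chunk_t)`, far more than any real bit vector holds.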

@hzongaro
Member

Upon rerunning the test in gdb with -Xjit:enableScratchMemoryDebugging,limit={Test15364.mainTest*},optLevel=veryHot, a crash happens in ABitVector<Allocator>::PopulationCount, which is called from TR_UseDefInfo::getSingleDefiningLoad during Global Value Propagation:

(gdb) where 10
#0  CS2::ABitVector<CS2::shared_allocator<CS2::heap_allocator<65536ul, 12u, TRMemoryAllocator<(TR_AllocationKind)1, 12u, 28u> > > >::PopulationCount
    (this=this@entry=0x7fffd08b45b8, numBits=4026531839) at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/cs2/bitvectr.h:1196
#1  0x00007fffef9a50f4 in CS2::ABitVector<CS2::shared_allocator<CS2::heap_allocator<65536ul, 12u, TRMemoryAllocator<(TR_AllocationKind)1, 12u, 28u> > > >::PopulationCount (numBits=4026531839, this=<optimized out>) at /usr/local/include/c++/7.5.0/bits/stl_vector.h:798
#2  TR_UseDefInfo::getSingleDefiningLoad (this=0x7fffd0ac4020, node=node@entry=0x7fffd0c861c0)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/UseDefInfo.cpp:2598
#3  0x00007fffefa4431e in OMR::ValuePropagation::mergeDefConstraints (this=this@entry=0x7fffd0b04020, node=node@entry=0x7fffd0c861c0, 
    relative=relative@entry=-1, isGlobal=@0x7fffd35548df: true, forceMerge=forceMerge@entry=false)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/OMRValuePropagation.cpp:2099
#4  0x00007fffefa56ec0 in OMR::ValuePropagation::getConstraint (this=this@entry=0x7fffd0b04020, node=node@entry=0x7fffd0c861c0, 
    isGlobal=@0x7fffd35548df: true, relative=relative@entry=0x0)

I set a breakpoint in TR::DebugSegmentProvider::release to determine where that memory was released. The release happens when an irem node is replaced during Global Value Propagation:

#0  TR::DebugSegmentProvider::release (this=0x7fffd355d7f0, segment=...)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/env/DebugSegmentProvider.cpp:97
#1  0x00007fffef6ecaae in TR::Region::~Region (this=0x7fffd0ac4028, __in_chrg=<optimized out>)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/env/Region.cpp:71
#2  0x00007fffef8ea152 in TR_UseDefInfo::~TR_UseDefInfo (this=0x7fffd0ac4020, __in_chrg=<optimized out>)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/UseDefInfo.hpp:81
#3  TR_UseDefInfo::~TR_UseDefInfo (this=0x7fffd0ac4020, __in_chrg=<optimized out>)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/UseDefInfo.hpp:81
#4  OMR::Optimizer::setUseDefInfo (this=this@entry=0x7fffd0ca4e60, u=u@entry=0x0)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/OMROptimizer.cpp:960
#5  0x00007fffef8eaa55 in OMR::Optimizer::prepareForNodeRemoval (this=this@entry=0x7fffd0ca4e60, node=0x7fffd0c86080,
    deferInvalidatingUseDefInfo=deferInvalidatingUseDefInfo@entry=false)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/OMROptimizer.cpp:2289
#6  0x00007fffef8eaaad in OMR::Optimizer::prepareForNodeRemoval (this=0x7fffd0ca4e60, node=0x7fffd0c860d0, 
    deferInvalidatingUseDefInfo=<optimized out>)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/OMROptimizer.cpp:2306
#7  0x00007fffef8e42cd in OMR::Optimization::replaceNode (this=this@entry=0x7fffd0b04020, node=node@entry=0x7fffd0c860d0, other=0x7fffd0c86030, 
    anchorTree=0x7fffd0cac2b0, anchorChildren=anchorChildren@entry=true)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/OMROptimization.cpp:539
#8  0x00007fffefa12e17 in removeRedundantREM (vp=vp@entry=0x7fffd0b04020, node=node@entry=0x7fffd0c860d0, 
    nodeConstraint=nodeConstraint@entry=0x7fffd086a2a0, firstChildConstraint=firstChildConstraint@entry=0x7fffd086a2a0, 
    secondChildConstraint=secondChildConstraint@entry=0x7fffd087e950)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/VPHandlers.cpp:6704
#9  0x00007fffefa13163 in constrainIrem (vp=0x7fffd0b04020, node=0x7fffd0c860d0)
    at /root/hostdir/defects/issue15311/openj9-openjdk-jdk11/omr/compiler/optimizer/VPHandlers.cpp:6764

Removing the redundant node causes OMR::Optimizer::prepareForNodeRemoval to call setUseDefInfo(NULL), which deletes the object that OMR::Optimizer::_useDefInfo pointed to. Value Propagation, however, holds a cached reference to that object and continues to use it. The result is the crash when enableScratchMemoryDebugging is set, or memory corruption without that option.
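
The hazard described here can be modelled in a few lines. The sketch below is a hypothetical Java analogue, not OMR code: the class and method names only echo the ones in the backtrace, and a `released` flag stands in for freed memory so that a stale access fails loudly instead of corrupting the heap. It contrasts a lookup through a cached reference with one that re-reads the current object from its owner; the second is shown only as one general way to avoid a stale reference, not as the fix that was eventually merged.

```java
// Hypothetical model of the stale-reference hazard; not OMR's real API.
class StaleCacheModel {
    static class UseDefInfo {
        private boolean released;
        void release() { released = true; }
        int getUseDef(int useIndex) {
            if (released) {
                throw new IllegalStateException("use-def info used after release");
            }
            return useIndex; // placeholder result
        }
    }

    static class Optimizer {
        private UseDefInfo useDefInfo = new UseDefInfo();
        UseDefInfo getUseDefInfo() { return useDefInfo; }
        // Replacing the info releases the old object, as the destructor does.
        void setUseDefInfo(UseDefInfo u) {
            if (useDefInfo != null && useDefInfo != u) useDefInfo.release();
            useDefInfo = u;
        }
        void prepareForNodeRemoval() { setUseDefInfo(null); }
    }

    static class ValuePropagation {
        private final Optimizer optimizer;
        private final UseDefInfo cached; // captured once, never refreshed

        ValuePropagation(Optimizer optimizer) {
            this.optimizer = optimizer;
            this.cached = optimizer.getUseDefInfo();
        }
        // Hazard: keeps using whatever object existed at construction time.
        int lookupThroughCache(int useIndex) { return cached.getUseDef(useIndex); }
        // Safer: ask the owner each time; null means "no info available".
        Integer lookupThroughOwner(int useIndex) {
            UseDefInfo current = optimizer.getUseDefInfo();
            return (current == null) ? null : current.getUseDef(useIndex);
        }
    }

    public static void main(String[] args) {
        Optimizer opt = new Optimizer();
        ValuePropagation vp = new ValuePropagation(opt);
        opt.prepareForNodeRemoval();                    // info is released here
        System.out.println(vp.lookupThroughOwner(61));  // null
        try {
            vp.lookupThroughCache(61);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In C++ there is no such flag: the cached pointer simply refers to memory that has been returned to the allocator, which is why the symptom varies from run to run.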

@hzongaro
Member

hzongaro commented Jun 27, 2022

As Daryl pointed out, this is a long-standing problem. It appears to require the % operator to be applied in a situation where the divisor is known to be a power of ten and the dividend is known to be less than the divisor, so it is unlikely to be encountered frequently. I suggest removing the blocker label.

Having said that, I think I know how to fix the problem, but it might take me a couple of days to write up the fix and test it out. I would rather not rush it in.
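
For readers matching the condition described above to the reduced test, here is our reading of it, offered as a sketch rather than a confirmed analysis. In `vMeth`, the first loop leaves `i11` equal to 100, and the second loop computes `3 % i11`, so the divisor is a power of ten and the dividend is smaller than it. The helper names below are hypothetical.

```java
// Sketch of how the reduced test appears to meet the trigger condition.
class RedundantRemainder {
    // True for 1, 10, 100, ... (hypothetical helper, not an OMR function).
    static boolean isPowerOfTen(int v) {
        if (v < 1) return false;
        while (v % 10 == 0) v /= 10;
        return v == 1;
    }

    // Mirrors "for (i11 = 4; i11 < 100; i11++)" from vMeth in Test.java.
    static int divisorAfterFirstLoop() {
        int i11;
        for (i11 = 4; i11 < 100; i11++) { /* body irrelevant here */ }
        return i11;
    }

    public static void main(String[] args) {
        int divisor = divisorAfterFirstLoop();
        System.out.println(divisor);               // 100
        System.out.println(isPowerOfTen(divisor)); // true
        // The dividend 3 is below the divisor, so the remainder is just 3
        // and the % operation is redundant.
        System.out.println(3 % divisor);           // 3
    }
}
```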

@pshipton
Member

pshipton commented Jun 27, 2022

Removing blocker and moving forward.


4 participants