leakwheel GC test triggered assert #22786

Closed
stephentoub opened this issue Jan 31, 2020 · 1 comment

@stephentoub
Member

@kouvel, it looks like this is coming from the call counting code you just added in #1457.
https://helix.dot.net/api/2019-06-17/jobs/198b2a74-8a56-43f2-84cc-7b0a1d75b132/workitems/PayloadGroup0/console

  Discovering: GC.Scenarios.XUnitWrapper
  Discovered:  GC.Scenarios.XUnitWrapper
  Starting:    GC.Scenarios.XUnitWrapper
    GC\Scenarios\LeakWheel\leakwheel\leakwheel.cmd [FAIL]
      
      Assert failure(PID 280 [0x00000118], Thread: 5488 [0x1570]): activeCodeVersion == methodDesc->GetCodeVersionManager()->GetActiveILCodeVersion(methodDesc).GetActiveNativeCodeVersion(methodDesc)
      
      CORECLR! CallCountingManager::SetCodeEntryPoint + 0x219 (0x71e7dd9a)
      CORECLR! CodeVersionManager::PublishNativeCodeVersion + 0x1C8 (0x71ef6ed9)
      CORECLR! ILCodeVersion::SetActiveNativeCodeVersion + 0xA4 (0x71ef88f7)
      CORECLR! TieredCompilationManager::ActivateCodeVersion + 0x11A (0x71f070b4)
      CORECLR! TieredCompilationManager::OptimizeMethod + 0xC5 (0x71f0983d)
      CORECLR! TieredCompilationManager::DoBackgroundWork + 0x4D4 (0x71f0884a)
      CORECLR! TieredCompilationManager::StaticBackgroundWorkCallback + 0x71 (0x71f09a71)
      CORECLR! UnManagedPerAppDomainTPCount::DispatchWorkItem + 0x1D3 (0x72121d53)
      CORECLR! ThreadpoolMgr::ExecuteWorkRequest + 0xC9 (0x720c0245)
      CORECLR! ThreadpoolMgr::WorkerThreadStart + 0x450 (0x720c4fc0)
          File: F:\workspace\_work\1\s\src\coreclr\src\vm\callcounting.cpp Line: 518
          Image: C:\h\w\A584090F\p\CoreRun.exe
      
      
      Return code:      1
      Raw output file:      C:\h\w\A584090F\w\A886090C\e\GC\Scenarios\Reports\GC.Scenarios\LeakWheel\leakwheel\leakwheel.output.txt
      Raw output:
      BEGIN EXECUTION
       "C:\h\w\A584090F\p\corerun.exe" leakwheel.dll 
      Repro with these values:
      iMem= 10 MB, iIter= 1500000, iTable=500 iSeed=-700946039
      After Delete and GCed all Objects: 69520
      After Delete and GCed all Objects: 70192
      Expected: 100
      Actual: -1073740286
      END EXECUTION - FAILED
      FAILED
      Test Harness Exitcode is : 1
      To run the test:
      > set CORE_ROOT=C:\h\w\A584090F\p
      > C:\h\w\A584090F\w\A886090C\e\GC\Scenarios\LeakWheel\leakwheel\leakwheel.cmd
      Expected: True
      Actual:   False
      Stack Trace:
        F:\workspace\_work\1\s\artifacts\tests\coreclr\Windows_NT.x86.Checked\TestWrappers\GC.Scenarios\GC.Scenarios.XUnitWrapper.cs(2371,0): at GC_Scenarios._LeakWheel_leakwheel_leakwheel_._LeakWheel_leakwheel_leakwheel_cmd()
      Output:
        
        Assert failure(PID 280 [0x00000118], Thread: 5488 [0x1570]): activeCodeVersion == methodDesc->GetCodeVersionManager()->GetActiveILCodeVersion(methodDesc).GetActiveNativeCodeVersion(methodDesc)
        
        CORECLR! CallCountingManager::SetCodeEntryPoint + 0x219 (0x71e7dd9a)
        CORECLR! CodeVersionManager::PublishNativeCodeVersion + 0x1C8 (0x71ef6ed9)
        CORECLR! ILCodeVersion::SetActiveNativeCodeVersion + 0xA4 (0x71ef88f7)
        CORECLR! TieredCompilationManager::ActivateCodeVersion + 0x11A (0x71f070b4)
        CORECLR! TieredCompilationManager::OptimizeMethod + 0xC5 (0x71f0983d)
        CORECLR! TieredCompilationManager::DoBackgroundWork + 0x4D4 (0x71f0884a)
        CORECLR! TieredCompilationManager::StaticBackgroundWorkCallback + 0x71 (0x71f09a71)
        CORECLR! UnManagedPerAppDomainTPCount::DispatchWorkItem + 0x1D3 (0x72121d53)
        CORECLR! ThreadpoolMgr::ExecuteWorkRequest + 0xC9 (0x720c0245)
        CORECLR! ThreadpoolMgr::WorkerThreadStart + 0x450 (0x720c4fc0)
            File: F:\workspace\_work\1\s\src\coreclr\src\vm\callcounting.cpp Line: 518
            Image: C:\h\w\A584090F\p\CoreRun.exe
        
        
        Return code:      1
        Raw output file:      C:\h\w\A584090F\w\A886090C\e\GC\Scenarios\Reports\GC.Scenarios\LeakWheel\leakwheel\leakwheel.output.txt
        Raw output:
        BEGIN EXECUTION
         "C:\h\w\A584090F\p\corerun.exe" leakwheel.dll 
        Repro with these values:
        iMem= 10 MB, iIter= 1500000, iTable=500 iSeed=-700946039
        After Delete and GCed all Objects: 69520
        After Delete and GCed all Objects: 70192
        Expected: 100
        Actual: -1073740286
        END EXECUTION - FAILED
        FAILED
        Test Harness Exitcode is : 1
        To run the test:
        > set CORE_ROOT=C:\h\w\A584090F\p
        > C:\h\w\A584090F\w\A886090C\e\GC\Scenarios\LeakWheel\leakwheel\leakwheel.cmd
stephentoub added the area-CodeGen-coreclr label Jan 31, 2020
stephentoub added this to the 5.0 milestone Jan 31, 2020
BruceForstall added the area-TieredCompilation-coreclr label and removed the area-CodeGen-coreclr label Jan 31, 2020
@stephentoub
Member Author

We can use #29934 to track.

kouvel added a commit to kouvel/runtime that referenced this issue Mar 2, 2020
kouvel added a commit that referenced this issue Mar 3, 2020
* Improve call counting mechanism

- Commit 1
  - Reverts commit f954c6b, which reverted PR #1457 due to issues
- Commit 2
  - Fixes crashes and assertion failures seen by the original change, fixes #29934
  - The crashes were caused by commit 6aa3c70 in the original PR
  - Call counting infos cannot be deleted when the corresponding call counting stubs may still run, because:
    - The remaining call count decremented by the stub is in the call counting info
    - The only way to get a code version / method desc from a stub is to go through the call counting info
  - Got one repro of the assertion failure in #22786. It is most likely caused by the same issue: a deleted call counting info is modified after its memory has been reused for a `NativeCodeVersionNode`, and the resulting heap corruption messes up the method desc pointer
  - Fixed with a partial revert of the above commit. Added back the `Complete` stage, so call counting infos are now deleted only after it's ensured that call counting stubs won't be used (shortly before deleting them).
- Commit 3
  - Public static functions of `CallCountingManager` may be called through the debugger before static initialization has occurred; added a check for null as suggested in #29892

* Fix crashes and assertion failures seen by the original change

* Add check for null for some functions callable from the debugger
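
The lifetime rule described in Commit 2 can be illustrated with a small, self-contained C++ model. This is only a sketch under stated assumptions, not the CoreCLR code: `CallCountingModel`, `CallCountingInfo`, `CallCountingStub`, `InvokeStub`, and the `Stage` enum are hypothetical stand-ins, `methodId` stands in for the code version / method desc, and the "no stub can still run" condition is a documented precondition rather than something the model enforces. The point it shows is that reaching the threshold only moves the info to a `Complete` stage; the info stays allocated so that a late stub call still reads and writes valid memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical model types; these are NOT the CoreCLR definitions.
enum class Stage { Active, Complete };

struct CallCountingInfo {
    int methodId;            // stands in for the code version / method desc
    int remainingCallCount;  // decremented by the stub on each call
    Stage stage;
};

struct CallCountingStub {
    CallCountingInfo* info;  // the stub's only route to the count and the method
};

// Models one call through the stub. Returns true once the threshold is reached.
// This dereferences stub.info, so the info must outlive every possible call.
inline bool InvokeStub(CallCountingStub& stub) {
    if (stub.info->remainingCallCount > 0) {
        --stub.info->remainingCallCount;
    }
    return stub.info->remainingCallCount == 0;
}

class CallCountingModel {
public:
    CallCountingStub* StartCounting(int methodId, int threshold) {
        infos_.push_back(std::make_unique<CallCountingInfo>(
            CallCountingInfo{methodId, threshold, Stage::Active}));
        stubs_.push_back(std::make_unique<CallCountingStub>(
            CallCountingStub{infos_.back().get()}));
        return stubs_.back().get();
    }

    // Counting is finished, but the info is kept: another thread may still be
    // inside the stub or about to call it.
    void MarkComplete(CallCountingStub* stub) { stub->info->stage = Stage::Complete; }

    // Precondition (assumed, not enforced in this model): no thread can still
    // execute any stub. Only then are completed stubs and infos freed. The
    // order used here (stubs, then infos) is an artifact of the model.
    void DeleteCompletedStubsAndInfos() {
        stubs_.erase(std::remove_if(stubs_.begin(), stubs_.end(),
                         [](const std::unique_ptr<CallCountingStub>& s) {
                             return s->info->stage == Stage::Complete;
                         }),
                     stubs_.end());
        infos_.erase(std::remove_if(infos_.begin(), infos_.end(),
                         [](const std::unique_ptr<CallCountingInfo>& i) {
                             return i->stage == Stage::Complete;
                         }),
                     infos_.end());
    }

    std::size_t LiveInfoCount() const { return infos_.size(); }
    std::size_t LiveStubCount() const { return stubs_.size(); }

private:
    std::vector<std::unique_ptr<CallCountingInfo>> infos_;
    std::vector<std::unique_ptr<CallCountingStub>> stubs_;
};
```

In this model, calling `InvokeStub` after `MarkComplete` is still safe because the info has not been freed. Freeing the info at that point instead would let the decrement land in whatever object reuses the memory, which is the kind of corruption the commit message describes.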
ghost locked as resolved and limited conversation to collaborators Dec 10, 2020