Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test failure System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector #66571

Closed
VincentBu opened this issue Mar 14, 2022 · 16 comments · Fixed by #68285 or #68652
Closed
Labels
arch-arm64 arch-x64 area-System.Runtime.InteropServices blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' JitStress CLR JIT issues involving JIT internal stress modes os-windows

Comments

@VincentBu
Copy link
Contributor

VincentBu commented Mar 14, 2022

Run: runtime-coreclr libraries-jitstressregs 20220313.1

Failed test:

net7.0-windows-Release-arm64-CoreCLR_checked-jitstressregs1-Windows.10.Arm64v8.Open

- System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector

net7.0-windows-Release-x64-CoreCLR_checked-jitstressregs0x10-Windows.10.Amd64.Open

- System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector

net7.0-windows-Release-x64-CoreCLR_checked-jitstressregs0x1000-Windows.10.Amd64.Open

- System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector

Error message:

High limit handle did not trigger a GC
Expected: True
Actual:   False


Stack trace

Runfo Tracking Issue: system.runtime.interopservices.handlecollectortests.testhandlecollector

Build Definition Kind Run Name Console Core Dump Test Results Run Client
1728243 runtime PR 68251 net7.0-windows-Debug-x64-CoreCLR_release-Windows.10.Amd64.Server2022.ES.Open console.log runclient.py
1727199 runtime PR 68236 net7.0-windows-Debug-x64-CoreCLR_checked-Windows.10.Amd64.Open console.log runclient.py
1726405 runtime PR 68219 net7.0-windows-Debug-x64-CoreCLR_checked-Windows.10.Amd64.Open console.log runclient.py
1726335 runtime PR 68221 net7.0-windows-Debug-x64-CoreCLR_checked-Windows.10.Amd64.Open console.log runclient.py
1724082 runtime PR 68077 net7.0-windows-Debug-x64-CoreCLR_release-Windows.81.Amd64.Open console.log runclient.py
1723556 runtime PR 67903 net7.0-windows-Debug-x64-CoreCLR_checked-Windows.10.Amd64.Open console.log runclient.py
1723058 runtime PR 68141 net7.0-windows-Debug-x64-CoreCLR_checked-Windows.10.Amd64.Open console.log runclient.py
1722815 runtime PR 68130 net7.0-windows-Debug-x64-CoreCLR_checked-Windows.10.Amd64.Open console.log runclient.py
1722478 runtime Rolling net7.0-windows-Release-x64-CoreCLR_checked-Windows.10.Amd64.Open console.log runclient.py

Build Result Summary

Day Hit Count Week Hit Count Month Hit Count
2 9 9
@ghost
Copy link

ghost commented Mar 14, 2022

Tagging subscribers to this area: @dotnet/interop-contrib
See info in area-owners.md if you want to be subscribed.

Issue Details

Run: runtime-coreclr libraries-jitstressregs 20220313.1

Failed test:

net7.0-windows-Release-arm64-CoreCLR_checked-jitstressregs1-Windows.10.Arm64v8.Open

- System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector

net7.0-windows-Release-x64-CoreCLR_checked-jitstressregs0x10-Windows.10.Amd64.Open

- System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector

net7.0-windows-Release-x64-CoreCLR_checked-jitstressregs0x1000-Windows.10.Amd64.Open

- System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector

Error message:

High limit handle did not trigger a GC
Expected: True
Actual:   False


Stack trace
Author: VincentBu
Assignees: -
Labels:

arch-arm64, area-System.Runtime.InteropServices, os-windows, arch-x64

Milestone: -

@jkoritzinsky jkoritzinsky added the JitStress CLR JIT issues involving JIT internal stress modes label Mar 14, 2022
@jkoritzinsky
Copy link
Member

This might not be only JitStress (I'm seeing intermittent repros in some PR failures in the runtime pipeline), but JitStress might be the best way to repro it

@VincentBu
Copy link
Contributor Author

Failed again in: runtime-coreclr libraries-jitstress 20220322.1

Failed test:

net7.0-windows-Release-x64-CoreCLR_checked-jitstress1_tiered-Windows.10.Amd64.Open

- System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector

Error message:

High limit handle did not trigger a GC
Expected: True
Actual:   False


Stack trace
   at System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector() in /_/src/libraries/System.Runtime.InteropServices/tests/System.Runtime.InteropServices.UnitTests/System/Runtime/InteropServices/HandleCollectorTests.cs:line 144

@VincentBu
Copy link
Contributor Author

Failed again in: runtime-coreclr libraries-jitstress 20220405.1

Failed test:

net7.0-windows-Release-arm64-CoreCLR_checked-jitstress2-Windows.10.Arm64v8.Open

- System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector

Error message:

High limit handle did not trigger a GC
Expected: True
Actual:   False


Stack trace

@VincentBu
Copy link
Contributor Author

Failed again in: runtime-coreclr libraries-jitstress 20220413.1

Failed test:

net7.0-windows-Release-arm64-CoreCLR_checked-zapdisable-Windows.10.Arm64v8.Open

- System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector

Error message:

High limit handle did not trigger a GC
Expected: True
Actual:   False


Stack trace
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.RuntimeMethodInfo.InvokeNonEmitUnsafe(Object obj, IntPtr* arguments, Span`1 argsForTemporaryMonoSupport, BindingFlags invokeAttr)

@ViktorHofer
Copy link
Member

ViktorHofer commented Apr 20, 2022

Happened again in https://dev.azure.com/dnceng/public/_build/results?buildId=1726335&view=ms.vss-test-web.build-test-results-tab&runId=46812918&resultId=146155&paneView=debug. I converted this issue to a live tracking one.

@AaronRobinsonMSFT @jkoritzinsky @jeffschwMSFT can you please take a look and disable the test if necessary? It hits 3 times per day...

@AaronRobinsonMSFT
Copy link
Member

@ViktorHofer Someone will look into this today.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 20, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 20, 2022
@elinor-fung
Copy link
Member

@AaronRobinsonMSFT it looks like this is still happening

System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector [FAIL]
      High limit handle did not trigger a GC: 21 < 21
      Expected: True
      Actual:   False
      Stack Trace:

Console log

13 hits in the past five days after #68285 went in: https://runfo.azurewebsites.net/search/tests/?q=started%3A%7E5+definition%3Aruntime+name%3ATestHandleCollector

@AaronRobinsonMSFT
Copy link
Member

@elinor-fung Thanks for keeping an eye on this. The new logging is very interesting. Based on the above log output there is something odd going on here. The HighLimitSize is hardcoded to 100000 so I would expect there to always be at least 1 additional collected object, but clearly that is not occurring under some scenario. I can think of three possible scenarios, others are welcome.

  1. The "High limit" loop wasn't run. The JIT elided it entirely or some other optimization caused it to not execute any iterations.

HandleCollector highLimitCollector = new HandleCollector("HighLimit.Collector", HighLimitSize);
for (int i = 0; i < HighLimitSize + 10; ++i)
{
HandleLimitTester hlt = new HandleLimitTester(highLimitCollector);
}

  1. The GC didn't allocate new memory for each iteration for some reason and reused existing object instances. This could mean the "collected" count didn't change - I don't know if that is possible.

  2. Some combination of (1) and (2) around a JIT optimization and the GC doing what it is told that changes how this is expected to run.

The ctor for HandleLimitTester is doing all the right things in my mind:

internal HandleLimitTester(HandleCollector collector)
{
_collector = collector;
_collector.Add();
GC.KeepAlive(this);
}

I'm not really sure what is occurring, and I've been unable to reproduce this locally under any DOTNET_ variable combination. Perhaps @jkotas or @Maoni0 could help educate me on what could be happening here.

@jkotas
Copy link
Member

jkotas commented Apr 26, 2022

This test can be running together with all other System.Runtime.InteropServices tests. Those other tests can be allocating and triggering GCs (ie altering the global state that this test asserts are based on). Is the logic of the test robust against that?

@AaronRobinsonMSFT
Copy link
Member

Yes, I believe so. All we are attempting to validate is that the sum of GC.CollectionCount(0) + GC.CollectionCount(1) + GC.CollectionCount(2) after the above loop is greater than the same computed sum before the loop. The test is arguably not that useful in a situation where multiple tests are running so it is odd that it fails at all since we simply need 1 additional object to be collected for the test to pass.

Let me confirm again the logic in HandleCollector.Add() and confirm that GC.Collect() will be called more than once.

@VincentBu
Copy link
Contributor Author

Failed again in: runtime-coreclr libraries-jitstress 20220427.1

Failed test:

net7.0-windows-Release-arm64-CoreCLR_checked-jitminopts-Windows.10.Arm64v8.Open

- System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector

Error message:

High limit handle did not trigger a GC: 41 < 41
Expected: True
Actual:   False


Stack trace
   at System.Runtime.InteropServices.HandleCollectorTests.TestHandleCollector() in /_/src/libraries/System.Runtime.InteropServices/tests/System.Runtime.InteropServices.UnitTests/System/Runtime/InteropServices/HandleCollectorTests.cs:line 153
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.RuntimeMethodInfo.InvokeNonEmitUnsafe(Object obj, IntPtr* arguments, Span`1 argsForTemporaryMonoSupport, BindingFlags invokeAttr) in /_/src/coreclr/System.Private.CoreLib/src/System/Reflection/RuntimeMethodInfo.CoreCLR.cs:line 383

@AaronRobinsonMSFT
Copy link
Member

I finally see what is going on here. This is a classic lack of memory pressure and the GC is making the right call and saying: "leave me alone". I am working on a fix to provide the needed pressure.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 28, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 29, 2022
@jeffhandley jeffhandley removed the untriaged New issue has not been triaged by the area owner label May 6, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Jun 5, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-arm64 arch-x64 area-System.Runtime.InteropServices blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' JitStress CLR JIT issues involving JIT internal stress modes os-windows
Projects
None yet
10 participants