-
Notifications
You must be signed in to change notification settings - Fork 2.7k
[3.1] Fail FuncEval if slot backpatching lock is held by any thread #28006
Conversation
- In many cases cooperative GC mode is entered after acquiring the slot backpatching lock and the thread may block for debugger suspension while holding the lock. A FuncEval may time out on entering the lock if for example it calls a virtual or interface method for the first time. Failing the FuncEval when the lock is held enables the debugger to fall back to other options for expression evaluation. - Also added polls for debugger suspension before acquiring the slot backpatching lock on background threads that often operate in preemptive GC mode. A common case is when the debugger breaks while the tiering delay timer is active, the timer ticks shortly afterwards (after debugger suspension completes) and if a thread pool thread is already available, the background thread would block while holding the lock. The poll checks for debugger suspension and pulses the GC mode to block before acquiring the lock. Risks: - The fix is only a heuristic and lessens the problem when it is detected that the lock is held by some thread. Since the lock is acquired in preemptive GC mode, it is still possible that after the check at the start of a FuncEval, another thread acquires the lock and the FuncEval may time out. The polling makes it less likely for the lock to be taken by background tiering work, for example if a FuncEval starts while rejitting a method. - The expression evaluation experience may be worse when it is detected that the lock is held, and may still happen from unfortunate timing - Low risk for the change itself Port of dotnet/runtime#2380 Fix for dotnet/runtime#1537
@noahfalk could you please take a look? This is a 3.1 port of the similar change to master: dotnet/runtime#2380 |
@jkotas reverted this change in master. Please investigate why. |
The change had conflicts in master with the other big tiered compilations changes that had to be reverted to fix the stability problems. I was not sure whether these two changes are independent, so I have reverted both. |
The change in master would be slightly different before and after the reverted TC change, I'll probably make a separate PR in master after investigating and fixing those issues. The equivalent of this change in master wasn't the cause of the issues, and this port still applies in 3.1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @kouvel, this did not cause the issues in master, and believe it will address the funceval problems in 3.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved. I will seek tactics approval for March
@jeffschwMSFT @kouvel is this for 3.1.3 (march)? If so please merge by EOD today if it gets approval |
Tactics approved for March 3.1.3 |
Thanks all! |
Customer impact
Regression?
Yes, regression from 2.x, starting in 3.0
Testing
Risks
Port of dotnet/runtime#2380
Fix for dotnet/runtime#1537