-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mismatch between domain reload handling in soft debugger client and agent. #57
Comments
Any thoughts on this? I will start work on this soon in our own branch and would like to coordinate the effort on getting it into master. |
I didn't investigate much into this... But if we get all info needed over SDB protocol to implement this for example to which AppDomain Assembly Load/Unload belongs and to which AppDomain/Assembly TypeMirrors belong... I see no reason for not accepting this... |
Sounds good. If possible, I will open smaller PRs with incremental changes to fix this issue. |
I had a look into this issue and found out that in order to fix this issue, the debugger protocol has to be updated to include the domain id for a type loads and possibly for assemblies as well. Also, other classes that call into SoftDebuggerSession.GetType do not take the domain into account. The issue I'm trying to fix is evaluation of enums, where you would sometimes get "ERR_UNLOADED" if you use domain reloads. This happens because evaluation of enums uses System.Int64 in ObjectValueAdaptor. What can happen the in debugger agent is that System.Int64 from mscorlib can be be associated with a "child" domain, which is not the default/"root" domain. The mscorlib assembly is always loaded by the root domain and is never unloaded. If the "child" domain is unloaded, then System.Int64 / child domain type pair is also unloaded in the debugger agent. But, because mscorlib is never unloaded and an assembly unload event is never sent, the SoftDebuggerSession never removes the System.Int64 from it's "types" cache. So when System.Int64 is requested in order to evaluate an enum, the TypeMirror id in the SoftDebuggerSession types cache has been unloaded in the debugger agent. When evaluating enums and trying to get "System.Int64", SoftDebuggerAdaptor falls back to trying to find the type using assemblies and other approaches if SoftDebuggerSession.GetType() fails. https://github.com/mono/debugger-libs/blob/master/Mono.Debugging.Soft/SoftDebuggerAdaptor.cs#L1339 So instead of fixing enum evaluation by updating the debugger protocol and changing the debugger-libs, I've made a workaround in SoftDebuggerSession.GetType by checking whether the requested type has been unloaded. This fix is backwards compatible with older versions of the debugger agent. What are your thoughts on this workaround? |
Looking at HandleAppDomainUnloadEvents I noticed AppDomainUnloadEvent which has Domain property which has GetAssemblies method... I'm assuming this returns different assemblies then root domain has?(or am I wrong?)... What I'm trying to say is... Maybe we can do something like Problem that I see with current hack... every time after 1st app domain is unloaded every call to GetType we execute sdb command(sometimes 2) which is pretty expensive on slow USB latency :( We are trying really hard to keep sdb communication to minimum... what could be option... is to add new dictionary which would be cleared on domain unload and filled every time when perform check(and when type is loaded) so we don't check every time...(unloaded types are removed from types dictionary but valid ones stay and are rechecked every time...) |
I understand your concern about additional sdb communication every time GetType is called and domain unloads are in use. The problem with So how about this: Add a What do you think? |
A bit weird that domain has assemblies unloaded when event happens since it can't unload assemblies 1 by 1(I'm just saying it would make more sense to always return assemblies since returning empty doesn't mean much... but w/e) Ye, imo making list of loaded assemblies into domain via assembly loaded event... and using that list to remove types looks very logical and promising... |
The debugger agent listens on MONO_PROFILE_END_UNLOAD events, so it gets the event after the assemblies have been unloaded. New PR #60 to fix this issue. |
My proposed fixes do not solve this issue without side-effects. I will work-around this issue in our own fork and revisit this issue with a proper fix in the future. |
Adds ppdb support
I'm submitting this issue as a starting point for a discussion on how to fix a mismatch between how the soft debugger client and agent handle domain reloads.
The debugger agent registers new types (mirrors) with a type + domain pair.
https://github.com/mono/mono/blob/master/mono/mini/debugger-agent.c#L2276
and the types are marked as unloaded when a domain is unloaded.
https://github.com/mono/mono/blob/master/mono/mini/debugger-agent.c#L2210
In the client there is a single cache for all the types across domains
https://github.com/mono/debugger-libs/blob/master/Mono.Debugging.Soft/SoftDebuggerSession.cs#L60
and the types are removed when an assembly is unloaded.
https://github.com/mono/debugger-libs/blob/master/Mono.Debugging.Soft/SoftDebuggerSession.cs#L1879
This causes issues where the agent can unload a type associated with a domain, but if the assembly for the type is not unloaded (due to it being used in another domain), then the client will have a reference to a type that is marked as unloaded in the agent.
Zoltan Varga told me the following when I asked whether this was problem in the debugger agent: "Conceptually, assemblies are loaded into each domain they are referenced, so a reference to a type is really a reference to a type+domain pair, this is how the runtime code works. I think the problem is in the debugger-libs code, which should handle this the same way as the runtime code does, i.e. keep things in domain specific caches, and invalidate those caches when a domain is unloaded, not when an assembly is unloaded."
Would you accept PRs that address this issue and if so, do you have any suggestions for how to go about fixing it?
The text was updated successfully, but these errors were encountered: