Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AccessViolationException in ILLink in .NET 7 #86807

Closed
jonathanpeppers opened this issue Aug 31, 2022 · 20 comments
Closed

AccessViolationException in ILLink in .NET 7 #86807

jonathanpeppers opened this issue Aug 31, 2022 · 20 comments
Milestone

Comments

@jonathanpeppers
Copy link
Member

We have a integration test that builds 100 class libraries for use in an Android app. This works most of the time, but sometimes on Windows we hit:

    Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
       at Mono.Cecil.MetadataReader.RangesSize(Mono.Cecil.Range[])
       at Mono.Cecil.MetadataReader.ReadCustomAttributes(Mono.Cecil.ICustomAttributeProvider)
       at Mono.Cecil.Mixin+<>c.<GetCustomAttributes>b__13_0(Mono.Cecil.ICustomAttributeProvider, Mono.Cecil.MetadataReader)
       at Mono.Cecil.ModuleDefinition.Read[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.__Canon ByRef, System.__Canon, System.Func`3<System.__Canon,Mono.Cecil.MetadataReader,System.__Canon>)
       at Mono.Cecil.Mixin.GetCustomAttributes(Mono.Cecil.ICustomAttributeProvider, Mono.Collections.Generic.Collection`1<Mono.Cecil.CustomAttribute> ByRef, Mono.Cecil.ModuleDefinition)
       at Mono.Cecil.MethodDefinition.get_CustomAttributes()
       at Microsoft.Android.Sdk.ILLink.ApplyPreserveAttribute.GetPreserveAttributes(Mono.Cecil.ICustomAttributeProvider)
       at Microsoft.Android.Sdk.ILLink.ApplyPreserveAttribute.MarkMethodIfPreserved(Mono.Cecil.MethodDefinition)
       at Microsoft.Android.Sdk.ILLink.ApplyPreserveAttribute.ProcessMethod(Mono.Cecil.MethodDefinition)
       at Mono.Linker.Steps.MarkSubStepsDispatcher.DispatchMethod(Mono.Cecil.MethodDefinition)
       at Mono.Linker.Steps.MarkSubStepsDispatcher.BrowseTypes(Mono.Collections.Generic.Collection`1<Mono.Cecil.TypeDefinition>)
       at Mono.Linker.Steps.MarkSubStepsDispatcher.BrowseTypes(Mono.Collections.Generic.Collection`1<Mono.Cecil.TypeDefinition>)
       at Mono.Linker.Steps.MarkSubStepsDispatcher.BrowseAssembly(Mono.Cecil.AssemblyDefinition)
       at Mono.Linker.Steps.MarkSubStepsDispatcher.<Initialize>b__5_0(Mono.Cecil.AssemblyDefinition)
       at Mono.Linker.Steps.MarkStep.MarkAssembly(Mono.Cecil.AssemblyDefinition, Mono.Linker.DependencyInfo)
       at Mono.Linker.Steps.MarkStep.MarkModule(Mono.Cecil.ModuleDefinition, Mono.Linker.DependencyInfo)
       at Mono.Linker.Steps.MarkStep.MarkType(Mono.Cecil.TypeReference, Mono.Linker.DependencyInfo, System.Nullable`1<Mono.Linker.MessageOrigin>)
       at Mono.Linker.Steps.MarkStep.ProcessMarkedPending()
       at Mono.Linker.Steps.MarkStep.Initialize()
       at Mono.Linker.Steps.MarkStep.Process(Mono.Linker.LinkContext)
       at Mono.Linker.Pipeline.ProcessStep(Mono.Linker.LinkContext, Mono.Linker.Steps.IStep)
       at Mono.Linker.Pipeline.Process(Mono.Linker.LinkContext)
       at Mono.Linker.Driver.Run(Mono.Linker.ILogger)
       at Mono.Linker.Driver.Main(System.String[])
    Errors
        C:\a\_work\1\s\bin\Release\dotnet\sdk\7.0.100-rc.1.22425.9\Sdks\Microsoft.NET.ILLink.Tasks\build\Microsoft.NET.ILLink.targets(133,5): error MSB6006: "dotnet.exe" exited with code 57005. [C:\a\_work\1\a\TestRelease\08-31_14.34.17\temp\BuildAMassiveApp\App1\App1.csproj]
    OutputProperties
        _ILLinkExitCode = 57005

I believe this started happening here:

This Maestro bump, I believe, was the first time we ever saw something like this.

Logs: accessviolation.zip

Let me know if you need the entire project, I believe the solution with 100+ class libraries is too large to attach here.

/fyi @sbomer

@sbomer
Copy link
Member

sbomer commented Aug 31, 2022

Unfortunately I can't tell much from the stacktrace or binlog - I think we'll need a repro. Would you be able to share the project?

@vitek-karas do you think your repro tool would be able to extract a repro from this binlog? If so, could you please share instructions with @jonathanpeppers?

@jonathanpeppers
Copy link
Member Author

We have this build artifact:

image

I'll see if I can put it somewhere to share in Teams.

@vitek-karas
Copy link
Member

My repro tool would only work on the machine where the failure happens, since it would copy out the files from the local paths mentioned in binlog.

@marek-safar
Copy link
Contributor

I recall seeing this problem before and the workaround was to disable server GC.

@jonathanpeppers
Copy link
Member Author

@marek-safar I think that MSBuild / dotnet build would be using the workstation GC by default:

https://docs.microsoft.com/en-us/dotnet/core/runtime-config/garbage-collector#flavors-of-garbage-collection

I don't think we do anything to enable server GC, does the .NET SDK or MSBuild do it?

(note we also are only seeing this issue on Windows)

@sbomer
Copy link
Member

sbomer commented Sep 7, 2022

ILLink is configured to use server GC by default: https://github.com/dotnet/linker/blob/fdcff4c6e14b48f40011d5c8538053f84ab1d0e0/src/linker/Mono.Linker.csproj#L5 (it runs as a separate process, not hosted by MSBuild). I think you can override it with an environment variable: https://docs.microsoft.com/en-us/dotnet/core/runtime-config/garbage-collector#workstation-vs-server

@jonathanpeppers
Copy link
Member Author

Ok, yeah if I look on disk then illink.runtimeconfig.json has:

    "configProperties": {
      "System.GC.Server": true,

I'll see if setting %DOTNET_gcServer% to 0 does anything. I'm not sure if the env var overwrites the runtimeconfig.json.

jonathanpeppers referenced this issue in jonathanpeppers/xamarin-android Sep 8, 2022
Context: https://github.com/dotnet/linker/issues/3012#issuecomment-1239958027

See if disabling the "server GC" in favor of the "workstation GC" helps a failing test.
jonpryor referenced this issue in dotnet/android Sep 12, 2022
Context: https://github.com/dotnet/linker/issues/3012
Context: https://github.com/dotnet/linker/issues/3012#issuecomment-1239958027

Occasionally, `AotTests.BuildAMassiveApp()` will fail, because the
linker crashed:

	Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
	   at Mono.Cecil.MetadataReader.RangesSize(Mono.Cecil.Range[])
	   at Mono.Cecil.MetadataReader.ReadCustomAttributes(Mono.Cecil.ICustomAttributeProvider)
	   at Mono.Cecil.Mixin+<>c.<GetCustomAttributes>b__13_0(Mono.Cecil.ICustomAttributeProvider, Mono.Cecil.MetadataReader)
	   at Mono.Cecil.ModuleDefinition.Read[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.__Canon ByRef, System.__Canon, System.Func`3<System.__Canon,Mono.Cecil.MetadataReader,System.__Canon>)
	   at Mono.Cecil.Mixin.GetCustomAttributes(Mono.Cecil.ICustomAttributeProvider, Mono.Collections.Generic.Collection`1<Mono.Cecil.CustomAttribute> ByRef, Mono.Cecil.ModuleDefinition)
	   at Mono.Cecil.MethodDefinition.get_CustomAttributes()
	   at Microsoft.Android.Sdk.ILLink.ApplyPreserveAttribute.GetPreserveAttributes(Mono.Cecil.ICustomAttributeProvider)
	   …

The ILLink utility is run in a separate process, and is configured to
use the Server GC by default.

We believe that the `AccessViolationException` is a bug in the
Server GC, and the [suggested workaround][0] was to disable the
server GC.

Export `DOTNET_gcServer=0` so that the Server GC is disabled, in an
effort to prevent the `AccessViolationException`.

[0]: https://github.com/dotnet/linker/issues/3012#issuecomment-1237823998
@jonathanpeppers
Copy link
Member Author

@sbomer FYI, a PR build passed 4 times with the change above. ^^ So we merged it.

So exporting $DOTNET_gcServer as 0, does seem to fix the problem?

@MichalStrehovsky
Copy link
Member

Is there an issue tracking this in the runtime repo? I don't think we would want to work around apparent managed heap corruptions without understanding the root cause.

Cc @mangod9

@mangod9
Copy link
Member

mangod9 commented Sep 14, 2022

yeah if there is a dump available for this we could take a look. fyi @AntonLapounov

@jonathanpeppers
Copy link
Member Author

jonathanpeppers commented Sep 14, 2022

It appears to only happen on Azure DevOps 1 in 3 tries or so during an integration test that runs dotnet build out of process. I'm not sure how we can easily get a dump? We'd have to somehow keep dotnet build (or illink) alive and save one?

@AntonLapounov
Copy link
Member

You can enable collecting crash dumps by setting the registry, but I don't know how easy it would be to obtain the collected dumps from the build machines.

@marek-safar
Copy link
Contributor

Do we really support getting crash dumps by hacking the registry with admin rights only?

@jonathanpeppers
Copy link
Member Author

@sbomer @marek-safar we started seeing this in dotnet/maui (still in .NET 7):

dotnet/maui#15275 (review)

Should we move this issue to dotnet/runtime? Worth investigating for a fix in .NET 8, etc.?

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label May 26, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label May 26, 2023
@sbomer sbomer transferred this issue from dotnet/linker May 26, 2023
@danmoseley danmoseley added os-android area-Tools-ILLink .NET linker development as well as trimming analyzers and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels May 26, 2023
@ghost
Copy link

ghost commented May 26, 2023

Tagging subscribers to 'arch-android': @steveisok, @akoeplinger
See info in area-owners.md if you want to be subscribed.

Issue Details

We have a integration test that builds 100 class libraries for use in an Android app. This works most of the time, but sometimes on Windows we hit:

    Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
       at Mono.Cecil.MetadataReader.RangesSize(Mono.Cecil.Range[])
       at Mono.Cecil.MetadataReader.ReadCustomAttributes(Mono.Cecil.ICustomAttributeProvider)
       at Mono.Cecil.Mixin+<>c.<GetCustomAttributes>b__13_0(Mono.Cecil.ICustomAttributeProvider, Mono.Cecil.MetadataReader)
       at Mono.Cecil.ModuleDefinition.Read[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.__Canon ByRef, System.__Canon, System.Func`3<System.__Canon,Mono.Cecil.MetadataReader,System.__Canon>)
       at Mono.Cecil.Mixin.GetCustomAttributes(Mono.Cecil.ICustomAttributeProvider, Mono.Collections.Generic.Collection`1<Mono.Cecil.CustomAttribute> ByRef, Mono.Cecil.ModuleDefinition)
       at Mono.Cecil.MethodDefinition.get_CustomAttributes()
       at Microsoft.Android.Sdk.ILLink.ApplyPreserveAttribute.GetPreserveAttributes(Mono.Cecil.ICustomAttributeProvider)
       at Microsoft.Android.Sdk.ILLink.ApplyPreserveAttribute.MarkMethodIfPreserved(Mono.Cecil.MethodDefinition)
       at Microsoft.Android.Sdk.ILLink.ApplyPreserveAttribute.ProcessMethod(Mono.Cecil.MethodDefinition)
       at Mono.Linker.Steps.MarkSubStepsDispatcher.DispatchMethod(Mono.Cecil.MethodDefinition)
       at Mono.Linker.Steps.MarkSubStepsDispatcher.BrowseTypes(Mono.Collections.Generic.Collection`1<Mono.Cecil.TypeDefinition>)
       at Mono.Linker.Steps.MarkSubStepsDispatcher.BrowseTypes(Mono.Collections.Generic.Collection`1<Mono.Cecil.TypeDefinition>)
       at Mono.Linker.Steps.MarkSubStepsDispatcher.BrowseAssembly(Mono.Cecil.AssemblyDefinition)
       at Mono.Linker.Steps.MarkSubStepsDispatcher.<Initialize>b__5_0(Mono.Cecil.AssemblyDefinition)
       at Mono.Linker.Steps.MarkStep.MarkAssembly(Mono.Cecil.AssemblyDefinition, Mono.Linker.DependencyInfo)
       at Mono.Linker.Steps.MarkStep.MarkModule(Mono.Cecil.ModuleDefinition, Mono.Linker.DependencyInfo)
       at Mono.Linker.Steps.MarkStep.MarkType(Mono.Cecil.TypeReference, Mono.Linker.DependencyInfo, System.Nullable`1<Mono.Linker.MessageOrigin>)
       at Mono.Linker.Steps.MarkStep.ProcessMarkedPending()
       at Mono.Linker.Steps.MarkStep.Initialize()
       at Mono.Linker.Steps.MarkStep.Process(Mono.Linker.LinkContext)
       at Mono.Linker.Pipeline.ProcessStep(Mono.Linker.LinkContext, Mono.Linker.Steps.IStep)
       at Mono.Linker.Pipeline.Process(Mono.Linker.LinkContext)
       at Mono.Linker.Driver.Run(Mono.Linker.ILogger)
       at Mono.Linker.Driver.Main(System.String[])
    Errors
        C:\a\_work\1\s\bin\Release\dotnet\sdk\7.0.100-rc.1.22425.9\Sdks\Microsoft.NET.ILLink.Tasks\build\Microsoft.NET.ILLink.targets(133,5): error MSB6006: "dotnet.exe" exited with code 57005. [C:\a\_work\1\a\TestRelease\08-31_14.34.17\temp\BuildAMassiveApp\App1\App1.csproj]
    OutputProperties
        _ILLinkExitCode = 57005

I believe this started happening here:

This Maestro bump, I believe, was the first time we ever saw something like this.

Logs: accessviolation.zip

Let me know if you need the entire project, I believe the solution with 100+ class libraries is too large to attach here.

/fyi @sbomer

Author: jonathanpeppers
Assignees: -
Labels:

os-android, untriaged, area-Tools-ILLink

Milestone: -

@marek-safar marek-safar removed the untriaged New issue has not been triaged by the area owner label May 29, 2023
@marek-safar marek-safar added this to the 8.0.0 milestone May 29, 2023
@marek-safar
Copy link
Contributor

@vitek-karas should we remove server forcing GC on desktop? If I recall the gain was not very significant and it seems to make illink unstable.

@MichalStrehovsky
Copy link
Member

@vitek-karas should we remove server forcing GC on desktop? If I recall the gain was not very significant and it seems to make illink unstable.

It was 20%: dotnet/linker#1493

We should move this to the VM area. Server GC is a supported configuration option and our flagship appmodels (ASP.NET) use it by default.

Is this seen only on 7.0? Or is the workaround also used in 8.0 and we don't have data?

@MichalStrehovsky MichalStrehovsky removed os-android area-Tools-ILLink .NET linker development as well as trimming analyzers labels May 29, 2023
@jonathanpeppers
Copy link
Member Author

jonathanpeppers commented May 30, 2023

I don't think we've noticed this issue in .NET 8 yet (only .NET 7), but I am testing here:

dotnet/android#8085

We have an integration test that builds an Android app with 100 class libraries -- I think it could reproduce this if I rerun it a couple times.

jonathanpeppers added a commit to dotnet/android that referenced this issue Jun 12, 2023
Context: https://github.com/dotnet/linker/issues/3012
Context: dotnet/runtime#86807

In 34986b9, we exported `DOTNET_gcServer=0` so that the Server GC is disabled, in an
effort to prevent an `AccessViolationException`.

It appears that this might be *fixed* in .NET 8. Let's remove the workaround, so we
can either close the issue or get further investigation on the issue.
@jonathanpeppers
Copy link
Member Author

jonathanpeppers commented Jun 12, 2023

It passed for us three times in a row using .NET 8 (main/preview 6 builds). I took out the workaround, and we'll see if this happens again.

@mangod9
Copy link
Member

mangod9 commented Jul 12, 2023

@jonathanpeppers hoping this is ok to close now?

@ghost ghost locked as resolved and limited conversation to collaborators Aug 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
Archived in project
Development

No branches or pull requests

8 participants