Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NativeAOT] ordering issue around ensuring static constructor execution #81151

Closed
VSadov opened this issue Jan 25, 2023 · 10 comments · Fixed by #106004
Closed

[NativeAOT] ordering issue around ensuring static constructor execution #81151

VSadov opened this issue Jan 25, 2023 · 10 comments · Fixed by #106004
Labels
area-NativeAOT-coreclr in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@VSadov
Copy link
Member

VSadov commented Jan 25, 2023

Re:

We check pContext->initialized with an ordinary read, before accessing the static fields. There is a possibility that we will read the fields before reading the flag on weak-ordered architectures.

The code orders the write to the flag after running constructors by doing full memory fence, but that is both too much and not sufficient.
Too much is because just doing Volatile.Write(ref pContext->initialized, 1) is enough to guarantee the order of writes.
Not sufficient because the read side can still read ahead of the flag check.

A simple fix would be to do volatile read of the flag when checking.
However the same flag check is emitted by the JIT. Not sure if we emit ldar there. I suspect we do not. If that is a case, emitting ldar would be a simple solution.

If ultimately, we do not want to emit ldar, an alternative could be using a slightly different pattern where initialized instead of being a condition flag, could be a pointer used to access the statics. I.E. if it is null the cctor has not run yet, and if it is not null, use it as a pointer (or an offset even if it is fixed, as long as it is used to compute the address) to access static state, then we could benefit from data dependency ordering the reads.

Replacing Interlocked.MemoryBarrier() with a process-wide barrier would work too, but seems too heavy.

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jan 25, 2023
@ghost
Copy link

ghost commented Jan 25, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

Re:

We check pContext->initialized with an ordinary read, before accessing the static fields. There is a possibility that we will read the fields before reading the flag on weak-ordered architectures.

The code orders the write to the flag after running constructors by doing full memory fence, but that is both too much and not sufficient.
Too much is because just doing Volatile.Write(ref pContext->initialized, 1) is enough to guarantee the order of writes.
Not sufficient because the read side can still read ahead of the flag check.

A simple fix would be to do volatile read of the flag when checking.
However the same flag check is emitted by the JIT. Not sure if we emit ldar there. I suspect we do not. If that is a case emitting ldar would be a simple solution.

If ultimately, we do not want to emit ldar, an alternative could be using slightly different pattern where initialized instead of being a condition flag, could be a pointer used to access the statics. I.E. if it is null the cctor has not run yet, and if it is not null, use it as a pointer (or an offset even if it is fixed, as long as it is used to compute the address) to access static state, then we could benefit from data dependency ordering the reads.

Replacing Interlocked.MemoryBarrier() witha process-wide barrier would work too, but seems too heavy.

Author: VSadov
Assignees: -
Labels:

untriaged, area-NativeAOT-coreclr

Milestone: -

@jkotas
Copy link
Member

jkotas commented Jan 25, 2023

Does standard CoreCLR have the same problem? I do not see any barriers there either.

@VSadov
Copy link
Member Author

VSadov commented Jan 25, 2023

There is even less synchronization in CoreCLR. Setting the flag (ClassInitFlags::INITIALIZED_FLAG) does not try to order it after running the cctor. If I am looking at the right place.

CoreCLR implementation is more complex too. Perhaps it is just timing between completion of cctor and setting the flag, that makes it unlikely that inconsistent state can be observed.

This is how the full fence could be helping in NativeAOT. The fence could be introducing enough delay that we get lucky most of the time.
We do not need that fence on x64 though, so I wonder if there is any history for adding it - did we see a race/failure on arm64 and adding the fence helped?

A race for a cctor completion would be a relatively rare thing, since for a given class you get just one chance per program execution.

@agocke agocke added this to AppModel Mar 6, 2023
@MichalStrehovsky MichalStrehovsky added this to the Future milestone Jul 11, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Jul 11, 2023
@jkotas jkotas modified the milestones: Future, 9.0.0 Aug 2, 2024
@jkotas
Copy link
Member

jkotas commented Aug 2, 2024

I think this should be fixed for 9.0 given that we have an evidence that these race conditions are a real problem and we are fixing them in the JIT.

@VSadov
Copy link
Member Author

VSadov commented Aug 3, 2024

Are we going to fix this for ARM32 as well?

@jkotas
Copy link
Member

jkotas commented Aug 3, 2024

I think so. The JIT fix #105832 is arch neural, it should work for arm32 as well assuming that we emit correct code for volatile on arm32.

@jkotas
Copy link
Member

jkotas commented Aug 3, 2024

I do not think there is anything to fix beyond the JIT fix (#105832). cctorMethodAddress that is used to indicate whether the static constructor run is marked volatile:

. It means that the non-inlined version of the cctor checks should have the required barriers.

@VSadov Do you agree that this issue can be closed?

jkotas added a commit to jkotas/runtime that referenced this issue Aug 3, 2024
Delete unnecessary MemoryBarrier

Fixes dotnet#81151
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Aug 3, 2024
@VSadov
Copy link
Member Author

VSadov commented Aug 3, 2024

I think there is also a piece of hand-emitted (not by JIT) assembly that checks cctorMethodAddress.

encoder.EmitLDR(encoder.TargetRegister.Arg2, encoder.TargetRegister.Arg3);

It would need to emit ldar/dmb for arm64/32 or whatever RISC-V uses for acq/volatile loads.
I was looking into this too. Ideally it would be nice to just move those stubs into C#, if possible. - just not to deal with assembly encoding.

@VSadov
Copy link
Member Author

VSadov commented Aug 3, 2024

I’ve also explored possibility to use data dependency instead of explicit load fence by adding(as in "+") the ctor pointer to the base reference before returning it back. Just to create fake dependency. It is possible, but then I realized that it is not sufficient. The way memory model specifies this scenario requires that all writes of cctor are observable before accessing members, not just those accessible via instance/type. So it basically demands an actual load fence.

## Static constructors
All side-effects of static constructor execution will become observable no later than effects of accessing any member of the type. Other member methods of the type, when invoked, will observe complete results of the type's static constructor execution.

jkotas added a commit that referenced this issue Aug 6, 2024
- Fix comments around static constructor memory model
- Delete unnecessary MemoryBarrier

Contributes to #81151
jkotas added a commit to jkotas/runtime that referenced this issue Aug 6, 2024
@jkotas jkotas closed this as completed in ec496fe Aug 10, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Sep 9, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-NativeAOT-coreclr in-pr There is an active PR which will close this issue when it is merged
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants