Fix NativeAOT ThunksPool thunk data block size handling. #110732

lateralusX · 2024-12-16T12:09:57Z

#88710 made a change in ThunkPool.cs, moving away from using a page size define to calling ThunkBlockSize to get to the end of the thunk data block, where a common stub address gets stored.

This change is not equivalent on platforms where the thunk blocks are laid out in pairs, where each stub thunk block is followed by a data thunk block and everything gets mapped from file. This is the scheme used on Windows platforms.

In that layout scheme the ThunkBlockSize is 2 * page size, meaning that the calculation intended to reach the end of the thunk data block instead moves to the end of the next thunk stub block, which is RX memory. Storing the common stub address at that location triggers an AV (access violation).
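The overshoot can be illustrated with plain address arithmetic. This is only a sketch: the 4 KiB page size, the helper names `last_pointer_slot` and `in_data_page`, and the even/odd page model of the paired layout are assumptions made for illustration, not code from ThunkPool.cs.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size, for illustration only. */
#define SKETCH_PAGE_SIZE ((uintptr_t)0x1000)

/* Hypothetical helper: address of the pointer-sized slot at the end of a
   region that starts at data_block_start and spans span_size bytes. */
uintptr_t last_pointer_slot(uintptr_t data_block_start, uintptr_t span_size)
{
    return data_block_start + span_size - sizeof(void *);
}

/* Model of the paired layout [stub][data][stub][data]...: even pages hold
   stubs (RX), odd pages hold data (RW). Returns 1 if addr is in a data page. */
int in_data_page(uintptr_t mapping_base, uintptr_t addr)
{
    return (int)(((addr - mapping_base) / SKETCH_PAGE_SIZE) % 2);
}
```

With a mapping base of 0x10000 and the first data block at 0x11000, a span of one page keeps the slot inside the data page, while a span of 2 * page size puts it at the end of the following stub page, which is the write that faults.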

This works on iOS since it reports its ThunkBlockSize as one page, but that is not totally correct since it uses 2 pages; they are just allocated in the same way as with FEATURE_RX_THUNKS, all thunk stub blocks followed by all thunk data blocks. The reason this works is that it only maps the thunk stubs from file, reporting a ThunkBlockSize that is in line with what gets mapped from file, but there is special handling in PalAllocateThunksFromTemplate on iOS that virtual allocs template size * 2, mapping the first template size bytes from the file while the rest is kept as its thunk data blocks.
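A rough sketch of the iOS-style sizing described above, assuming only what the paragraph states (allocate template size * 2, map the first half from file, keep the second half as data). The struct and function names are hypothetical and do not correspond to the real PalAllocateThunksFromTemplate signature.

```c
#include <stddef.h>

/* Hypothetical description of one thunk mapping. */
typedef struct {
    size_t reserve_size;   /* bytes of virtual memory to allocate      */
    size_t map_from_file;  /* leading bytes mapped from the image file */
    size_t data_offset;    /* offset where the thunk data blocks begin */
} thunk_mapping_plan;

/* Split layout: all stub blocks first, then all data blocks. Only the
   stub half comes from the file; the data half stays as allocated. */
thunk_mapping_plan plan_split_mapping(size_t template_size)
{
    thunk_mapping_plan plan;
    plan.reserve_size = template_size * 2;
    plan.map_from_file = template_size;
    plan.data_offset = template_size;
    return plan;
}
```

In the paired Windows layout, by contrast, the stub and data blocks are both part of what gets mapped from file, so the allocated size and the mapped size would be the same.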

This commit calculates the size of the code/data block based on code/data thunk size * number of thunks per block. Since this is guaranteed to fit into one block and the block size needs to be a power of 2, the correct full block size used in the arch specific implementation when laying out the stub and data blocks can be calculated directly in managed code.
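The calculation amounts to rounding the used bytes up to the next power of 2. The following is a minimal sketch of that idea in C rather than the managed code itself; the function names and the example numbers (16-byte thunks, 255 thunks per block) are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Round v up to the next power of 2 (returns v if it already is one).
   Valid for v up to 2^31. */
uint32_t round_up_pow2(uint32_t v)
{
    if (v <= 1)
        return 1;
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}

/* Hypothetical helper: block size derived from the per-thunk size and the
   number of thunks per block, both as reported by the arch implementation. */
uint32_t thunk_block_size(uint32_t thunk_size, uint32_t thunks_per_block)
{
    return round_up_pow2(thunk_size * thunks_per_block);
}
```

For example, 255 thunks of 16 bytes use 4080 bytes, which rounds up to a 4096-byte block. Because the result is a power of 2, `block_size - 1` can also serve as a mask for finding the start of a block from any address inside it.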

This commit adds a new function returning the size of the thunk data block and uses that when calculating the end of the data block instead of using ThunkBlockSize, since ThunkBlockSize reflects the size of each block mapped from file; on Windows platforms that is stub+data, 2 * page size.

dotnet-policy-service · 2024-12-16T12:10:45Z

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

src/coreclr/nativeaot/System.Private.CoreLib/src/System/Runtime/ThunkPool.cs

jkotas · 2024-12-16T23:18:32Z

> This works on iOS since it reports its ThunkBlockSize

It seems that ThunkBlockSize means code size in some situations and code + data size in other situations. It would be nice to clean it up.

lateralusX · 2024-12-17T09:11:12Z

> > This works on iOS since it reports its ThunkBlockSize
>
> It seems that ThunkBlockSize means code size in some situations and code + data size in other situations. It would be nice to clean it up.

Yes, I was tempted to rewrite it. The introduction of iOS changed the pattern a little: it uses the per-arch implementation but changed how it allocates the data blocks. Only the stub blocks are laid out in the asm implementation, and the rest is handled in the platform specific implementation in native code, so the original logic in the asm implementation that describes the layout of what gets re-mapped was altered. As a result, ThunkBlockSize for implementations used on iOS ended up as one page (just the stub), while it is 2 pages on Windows platforms (stub+data page). So Windows platforms are in line with how these functions were originally designed, while iOS ended up as a hybrid that only uses the arch specific implementation to lay out the thunk stubs.

Cleaning it up would mean expanding the concept so that the arch implementation might or might not handle the layout of data, and reshaping what the different sizes returned from the functions mean and how they are used when calculating what needs to be virtual allocated, how much should be re-mapped from file, etc. Since the arch asm implementation handles most of these implementation details, I believe it makes sense to get all that data from the arch asm implementation. In the end, the native code needs to know how much memory to virtual alloc for the complete mapping and how much of that memory it should map from the file. On Windows these will be the same, but on platforms like iOS only half of the virtual memory would be re-mapped (just the stubs) while the rest is kept as is (data pages).

In this PR, I decided to just implement what was needed to restore the previous behavior, which used page size directly in ThunkPool.cs, but to read the size from the arch implementation, which has the correct details about the size of individual blocks (in this case the thunk data block size). It was a simpler change that solves the problem that the dynamic ThunkPool is broken on Windows platforms, while also getting the size from the arch specific asm implementation, which has the correct size of the data block.

I can certainly try to clean this up if we think that it's feasible. The underlying issue is the definition and usage of a "block": depending on the implementation it is viewed as the stub thunk block or as a combination of stub+data block. The block size is used to figure out the total size of virtual allocated memory as well as the stride used to go from one stub block to the next, etc. Cleaning this up probably means changing the current perception that a block originally was a stub+data pair into something more aligned with the different implementations currently in use across the various thunk pool variations.
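To make the two meanings of "block" concrete, here is a sketch of how the stride and the data block location could differ between the two layouts discussed in this thread. All function names are hypothetical, and the split layout is modeled only from the description "all thunk stub blocks followed by all thunk data blocks".

```c
#include <stdint.h>

/* Paired layout (Windows-style): [stub 0][data 0][stub 1][data 1]...
   The stride between stub blocks is stub + data, i.e. 2 * page. */
uintptr_t paired_stub_block(uintptr_t base, uint32_t i, uint32_t page)
{
    return base + (uintptr_t)i * 2u * page;
}

uintptr_t paired_data_block(uintptr_t base, uint32_t i, uint32_t page)
{
    return paired_stub_block(base, i, page) + page;
}

/* Split layout (iOS-style): [stub 0]...[stub n-1][data 0]...[data n-1]
   The stride between stub blocks is one page; data starts after all stubs. */
uintptr_t split_stub_block(uintptr_t base, uint32_t i, uint32_t page)
{
    return base + (uintptr_t)i * page;
}

uintptr_t split_data_block(uintptr_t base, uint32_t i, uint32_t page, uint32_t n)
{
    return base + ((uintptr_t)n + i) * page;
}
```

Under this model a single "block size" value cannot describe both the stride and the data block size in both layouts, which is the ambiguity raised in the review.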

src/coreclr/nativeaot/System.Private.CoreLib/src/System/Runtime/ThunkPool.cs

Co-authored-by: Jan Kotas <jkotas@microsoft.com>

jkotas

Thanks!

Updated version of dotnet#110732 fixing issues on ARM32 and other platforms where the size difference between code and data thunks caused the thunk data block size calculation to be too small.

This commit calculates the page size of the code/data block based on the max of code/data thunk size * number of thunks per block. Since this is guaranteed to fit into one block and the block size needs to be a power of 2, the correct full block size used in the arch specific implementation when laying out the stub and data blocks can be calculated directly in managed code.

Review feedback. Switch to ThunkDataBlockSizeMask in one place. Fix build error. Include pointer size slot in data block size calculation. Calculate page size using NumThunksPerBlock.
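The difference between the two versions can be sketched as follows. This is an illustration under stated assumptions, not the code from either PR: the function names are hypothetical, and the thunk sizes used in the example (20-byte code thunks, 8-byte data thunks, 200 thunks per block, 4-byte pointers) are made-up values chosen so that code thunks are larger than data thunks.

```c
#include <stdint.h>

/* Round v up to the next power of 2 (returns v if it already is one). */
uint32_t round_up_pow2(uint32_t v)
{
    if (v <= 1)
        return 1;
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}

/* Size derived from the data thunks alone, plus the trailing
   pointer-sized slot that holds the common stub address. */
uint32_t data_only_block_size(uint32_t data_thunk_size, uint32_t num_thunks,
                              uint32_t pointer_size)
{
    return round_up_pow2(data_thunk_size * num_thunks + pointer_size);
}

/* Size derived from the larger of the code and data footprints, so the
   result matches the block size used for both kinds of block. */
uint32_t thunk_block_page_size(uint32_t code_thunk_size, uint32_t data_thunk_size,
                               uint32_t num_thunks, uint32_t pointer_size)
{
    uint32_t code_bytes = code_thunk_size * num_thunks;
    uint32_t data_bytes = data_thunk_size * num_thunks + pointer_size;
    uint32_t larger = code_bytes > data_bytes ? code_bytes : data_bytes;
    return round_up_pow2(larger);
}
```

With the made-up sizes above, the data-only calculation yields 2048 bytes while the code block needs 4096, so the data-only value would point into the middle of the real data block. Taking the max gives 4096 for both.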

lateralusX · 2025-01-07T11:15:22Z

Slightly modified version of this PR fixing ARM32, #111149.


lateralusX requested a review from MichalStrehovsky as a code owner December 16, 2024 12:09

dotnet-issue-labeler bot added the area-NativeAOT-coreclr label Dec 16, 2024

dotnet-policy-service bot assigned lateralusX Dec 16, 2024

build-analysis bot mentioned this pull request Dec 16, 2024

Test failure: baseservices/exceptions/stackoverflow/stackoverflowtester/stackoverflowtester.cmd #110173

Open

jkotas reviewed Dec 16, 2024

src/coreclr/nativeaot/System.Private.CoreLib/src/System/Runtime/ThunkPool.cs (outdated)

src/coreclr/nativeaot/System.Private.CoreLib/src/System/Runtime/ThunkPool.cs (outdated)

lateralusX force-pushed the lateralusX/fix-thunkpool-av branch from 958b9e7 to 0595470 Compare December 18, 2024 10:03

Review feedback.

258c248

lateralusX force-pushed the lateralusX/fix-thunkpool-av branch from 0595470 to 258c248 Compare December 18, 2024 10:05

lateralusX added 2 commits December 18, 2024 11:10

Switch to ThunkDataBlockSizeMask in one place.

5501c59

Fix build error.

eae5b8e

build-analysis bot mentioned this pull request Dec 18, 2024

MSBuild crashing in the build #92290

Open

jkotas reviewed Dec 18, 2024

src/coreclr/nativeaot/System.Private.CoreLib/src/System/Runtime/ThunkPool.cs (outdated)

Include pointer size slot in data block size calculation.

439d567

Co-authored-by: Jan Kotas <jkotas@microsoft.com>

jkotas approved these changes Dec 19, 2024

jkotas merged commit b91087f into dotnet:main Dec 19, 2024
88 checks passed

jkotas mentioned this pull request Dec 29, 2024

Revert "Remove uses of DECODE_RETURN_KIND part of GCInfo" #110852

Closed

jkotas added a commit that referenced this pull request Dec 29, 2024

Revert "Fix NativeAOT ThunksPool thunk data block size handling. (#11…

7321bc2

…0732)" This reverts commit b91087f.

jkotas mentioned this pull request Dec 29, 2024

Revert "Fix NativeAOT ThunksPool thunk data block size handling." #110983

Merged

MichalStrehovsky pushed a commit that referenced this pull request Jan 7, 2025

Revert "Fix NativeAOT ThunksPool thunk data block size handling. (#11…

a29d5f9

…0732)" (#110983) This reverts commit b91087f.

lateralusX mentioned this pull request Jan 7, 2025

Fix NativeAOT ThunksPool thunk data block size handling. #111149

Merged
