Add hot/cold splitting test job to jit-runtime-experimental #69922

amanasifkhalid · 2022-05-27T21:29:59Z

Adding a new rolling test job to JIT-runtime-experimental can help ensure the new fake- and stress-splitting modes in the JIT do not regress. The proposed test, titled jitosr_stress_splitting, runs a Checked build of the JIT on OS- and platform-specific test suites with the following environment:

COMPlus_GCgen0size=1000000
COMPlus_JitFakeProcedureSplitting=1
COMPlus_JitStressprocedureSplitting=1

This environment forces the JIT to fake-split every method after its first basic block, assuming the method consists of more than one block. COMPlus_GCgen0size=1000000 sets a large memory allocation threshold before the GC runs; recall that COMPlus_JitFakeProcedureSplitting currently doesn't generate unwind information for cold code, thus breaking the GC's stack walks. COMPlus_GCgen0size provides a patchwork fix for this limitation in cases where memory usage is not excessive (i.e. >= 0x1000000 B).

In my local testing, all of the x64 tests in src/tests/ pass with this environment.

ghost · 2022-05-27T21:30:06Z

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Adding a new rolling test job to JIT-runtime-experimental can help ensure the new fake- and stress-splitting modes in the JIT do not regress. The proposed test, titled jitosr_stress_splitting, runs a Checked build of the JIT on OS- and platform-specific test suites with the following environment:

COMPlus_GCgen0size=1000000
COMPlus_JitFakeProcedureSplitting=1
COMPlus_JitStressprocedureSplitting=1

This environment forces the JIT to fake-split every method after its first basic block, assuming the method consists of more than one block. COMPlus_GCgen0size=1000000 sets a large memory allocation threshold before the GC runs; recall that COMPlus_JitFakeProcedureSplitting currently doesn't generate unwind information for cold code, thus breaking the GC's stack walks. COMPlus_GCgen0size provides a patchwork fix for this limitation in cases where memory usage is not excessive (i.e. >= 0x1000000 B).

In my local testing, all of the x64 tests in src/tests/ pass with this environment.

Author:	amanasifkhalid
Assignees:	amanasifkhalid
Labels:	`area-CodeGen-coreclr`
Milestone:	-

BruceForstall · 2022-05-27T21:36:21Z

Looks good. However, this isn't related to OSR, so remove "osr" from the name. Use jit_stress_splitting or jit_stress_procedure_splitting.

BruceForstall

LGTM.

You should trigger jit-runtime-experimental on this PR to verify.

amanasifkhalid · 2022-05-27T22:11:03Z

/azp run runtime-jit-experimental

azure-pipelines · 2022-05-27T22:11:13Z

Azure Pipelines successfully started running 1 pipeline(s).

amanasifkhalid · 2022-05-31T02:33:12Z

I ran and re-ran runtime-jit-experimental and, like all the other failing checks, it fails to send tests to Helix due to a bad System.AccessToken value. Is the fact that runtime-jit-experimental makes it all the way to the "Send to Helix" step sufficient verification to merge the PR?

BruceForstall · 2022-05-31T18:11:58Z

I ran and re-ran runtime-jit-experimental and, like all the other failing checks, it fails to send tests to Helix due to a bad System.AccessToken value.

You hit #69854, which I believe is now fixed. Try triggering the job again.

Is the fact that runtime-jit-experimental makes it all the way to the "Send to Helix" step sufficient verification to merge the PR?

No -- It's the "Send to Helix" step that actually runs tests, and since it failed, no tests were run.

BruceForstall · 2022-05-31T18:16:58Z

You also hit the "remainingSize == 4" assert which was fixed with #69905

Also, some wasm failures which I don't see issues for, but which are obviously unrelated. You can always try "Re-run" on those.

amanasifkhalid · 2022-05-31T23:37:40Z

/azp run runtime-jit-experimental

azure-pipelines · 2022-05-31T23:37:55Z

Azure Pipelines successfully started running 1 pipeline(s).

amanasifkhalid · 2022-06-01T02:35:29Z

I pulled from main and re-ran runtime-jit-experimental, but it's still failing with 401 responses from Azure DevOps, per #69854 .

BruceForstall · 2022-06-01T03:21:13Z

I pulled from main and re-ran runtime-jit-experimental, but it's still failing with 401 responses from Azure DevOps, per #69854 .

The scheduled run on Sunday did not have those issues (it had other issues, but not those: https://dev.azure.com/dnceng/public/_build/results?buildId=1795838&view=ms.vss-test-web.build-test-results-tab).

My suggestion:

rebase on top of main (don't merge main)
push the updated PR

It shouldn't be necessary, but if that doesn't work, try closing and re-opening the PR, or create an entirely new PR.

amanasifkhalid · 2022-06-01T03:55:51Z

/azp run runtime-jit-experimental

azure-pipelines · 2022-06-01T03:56:01Z

Azure Pipelines successfully started running 1 pipeline(s).

BruceForstall · 2022-06-01T20:02:41Z

/azp run runtime-jit-experimental

azure-pipelines · 2022-06-01T20:02:56Z

Azure Pipelines successfully started running 1 pipeline(s).

BruceForstall · 2022-06-01T20:03:47Z

/azp run runtime-jit-experimental

azure-pipelines · 2022-06-01T20:03:56Z

Azure Pipelines successfully started running 1 pipeline(s).

Fake-splitting currently breaks stack walks by not generating unwind info for cold code. This commit implements unwind info on x86/64 when fake-splitting by generating unwind info for the combined hot/cold section just once, rather than generating unwind info for the separate sections. For reasons to be investigated, this implementation does not work when the code sections are separated by an arbitrary buffer (such as the 4KB buffer previously used). Thus, the buffer has been removed from the fake-splitting implementation: Now, the hot and cold sections are placed contiguously in memory, but the JIT continues to behave as if they are arbitrarily far away (for example, by using long branches between sections). Following this fix, fake-splitting no longer requires the GC to be suppressed by setting `COMPlus_GCgen0size=1000000`. A test job has been added to `runtime-jit-experimental` to ensure fake/stress-splitting does not regress.

amanasifkhalid · 2022-06-07T01:15:54Z

/azp run runtime-jit-experimental

azure-pipelines · 2022-06-07T01:16:04Z

Azure Pipelines successfully started running 1 pipeline(s).

amanasifkhalid · 2022-06-07T01:23:10Z

I have pushed a commit with a fix for unwind info generation when fake-splitting: When calling eeAllocUnwindInfo, we manually set the start offset to point at the beginning of the hot section, and the end offset to point at the end of the cold section, effectively treating them as a single "hot" section. However, for reasons under investigation, this implementation breaks the GC when there is a buffer between the sections; thus, I've adjusted the fake-splitting implementation to leave the hot/cold sections adjacent in memory, while still behaving as if they're arbitrarily far apart (for example, the JIT still generates long branches between sections).

I've tested the x64 implementation locally, and will adjust x86 implementation/add ARM implementation as needed.

BruceForstall

LGTM. If the tests pass, you're good to merge.

amanasifkhalid · 2022-06-07T17:46:56Z

Other than jit_osr_stress_random (see issue), runtime-jit-experimental is passing. Build windows arm64 Release NativeAOT is failing due to a Helix job timing out, but the build otherwise succeeds.

ghost assigned amanasifkhalid May 27, 2022

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 27, 2022

amanasifkhalid requested a review from BruceForstall May 27, 2022 21:30

amanasifkhalid force-pushed the jit-pipeline-tests branch from 671c81a to c2765b9 Compare May 27, 2022 21:45

amanasifkhalid marked this pull request as ready for review May 27, 2022 21:45

BruceForstall approved these changes May 27, 2022

View reviewed changes

JulieLeeMSFT added this to the 7.0.0 milestone May 27, 2022

amanasifkhalid force-pushed the jit-pipeline-tests branch from 9569921 to 00ba968 Compare June 1, 2022 03:54

amanasifkhalid closed this Jun 1, 2022

amanasifkhalid reopened this Jun 1, 2022

amanasifkhalid force-pushed the jit-pipeline-tests branch from 00ba968 to e727fba Compare June 7, 2022 01:15

amanasifkhalid requested a review from BruceForstall June 7, 2022 01:24

BruceForstall approved these changes Jun 7, 2022

View reviewed changes

amanasifkhalid merged commit 08b3170 into dotnet:main Jun 7, 2022

amanasifkhalid deleted the jit-pipeline-tests branch June 7, 2022 17:47

AndyAyersMS mentioned this pull request Jun 29, 2022

Dynamic PGO #43618

Closed

54 tasks

ghost locked as resolved and limited conversation to collaborators Jul 7, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add hot/cold splitting test job to jit-runtime-experimental #69922

Add hot/cold splitting test job to jit-runtime-experimental #69922

amanasifkhalid commented May 27, 2022

ghost commented May 27, 2022

BruceForstall commented May 27, 2022

BruceForstall left a comment

amanasifkhalid commented May 27, 2022

azure-pipelines bot commented May 27, 2022

amanasifkhalid commented May 31, 2022

BruceForstall commented May 31, 2022

BruceForstall commented May 31, 2022

amanasifkhalid commented May 31, 2022

azure-pipelines bot commented May 31, 2022

amanasifkhalid commented Jun 1, 2022

BruceForstall commented Jun 1, 2022

amanasifkhalid commented Jun 1, 2022

azure-pipelines bot commented Jun 1, 2022

BruceForstall commented Jun 1, 2022

azure-pipelines bot commented Jun 1, 2022

BruceForstall commented Jun 1, 2022

azure-pipelines bot commented Jun 1, 2022

amanasifkhalid commented Jun 7, 2022

azure-pipelines bot commented Jun 7, 2022

amanasifkhalid commented Jun 7, 2022

BruceForstall left a comment

amanasifkhalid commented Jun 7, 2022

Add hot/cold splitting test job to jit-runtime-experimental #69922

Add hot/cold splitting test job to jit-runtime-experimental #69922

Conversation

amanasifkhalid commented May 27, 2022

ghost commented May 27, 2022

BruceForstall commented May 27, 2022

BruceForstall left a comment

Choose a reason for hiding this comment

amanasifkhalid commented May 27, 2022

azure-pipelines bot commented May 27, 2022

amanasifkhalid commented May 31, 2022

BruceForstall commented May 31, 2022

BruceForstall commented May 31, 2022

amanasifkhalid commented May 31, 2022

azure-pipelines bot commented May 31, 2022

amanasifkhalid commented Jun 1, 2022

BruceForstall commented Jun 1, 2022

amanasifkhalid commented Jun 1, 2022

azure-pipelines bot commented Jun 1, 2022

BruceForstall commented Jun 1, 2022

azure-pipelines bot commented Jun 1, 2022

BruceForstall commented Jun 1, 2022

azure-pipelines bot commented Jun 1, 2022

amanasifkhalid commented Jun 7, 2022

azure-pipelines bot commented Jun 7, 2022

amanasifkhalid commented Jun 7, 2022

BruceForstall left a comment

Choose a reason for hiding this comment

amanasifkhalid commented Jun 7, 2022