-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[*64] Allow more fastTailCalls involving structs #3
Conversation
Before this change structs on Arm64 and Amd64 Unix could pessimize when we could fastTailCall if they were engregisterable and took more than one register. This updates the canFastTailCall function to allow this pattern to be fastTailCalled.
LclVarDsc* varDsc = varDscInfo.varDsc; | ||
CORINFO_ARG_LIST_HANDLE localsSig = info.compMethodInfo->locals.args; | ||
unsigned argRegNum = varDscInfo.intRegArgNum; | ||
unsigned otherArgRegNum = varDscInfo.floatRegArgNum; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is confusing at best, and needs a comment. I gather that this handles the case where there may be a mix of int and float regs, so that this is the first of each that is used for this arg?
@@ -262,6 +264,9 @@ void Compiler::lvaInitTypeRef() | |||
} | |||
} | |||
|
|||
info.compArgRegCount = argRegNum; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't make sense to me. Why are you assigning a register number to a count?
unsigned compILargsCount; // Number of arguments (incl. implicit but not hidden) | ||
unsigned compArgsCount; // Number of arguments (incl. implicit and hidden) | ||
unsigned compArgRegCount; // Number of integer locals | ||
unsigned compOtherArgRegCount; // Number of multireg args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be ifdef'd under FEATURE_MULTIREG_ARGS
. Also, is it the number of integer locals or the number of integer arguments in registers?
# This is the 1st commit message: Remove CoreFX runtest dependency This change will start using build-test.sh generatelayoutonly to build the coreoverlay directory for use with runtest.sh. # This is the commit message #2: Missing ./ # This is the commit message #3: Fix untar location # This is the commit message #4: Use portable builds # This is the commit message #5: Get ci green # This is the commit message #6: Small changes # This is the commit message #7: Small changes # This is the commit message #8: Small changes # This is the commit message #9: Add generateonly builds.sh # This is the commit message #10: Ensure correct test location # This is the commit message #11: netci change # This is the commit message #12: netci change # This is the commit message #13: Working on setting up build-test # This is the commit message #14: Remove generatelayoutonly # This is the commit message #15: Undo some of the changes to use generatelayoutonly # This is the commit message #16: Correctly remove build-test.sh invocation # This is the commit message #17: Fix build-test generatelayoutonly # This is the commit message #18: Fix a few netci issues
1. Computing GC roots is a relatively slow operation, and doing it for every state machine object found in a large heap can be time consuming. Making it opt-in with -roots command-line flag. 2. Added -waiting command-line flag. DumpAsync will now retrieve the <>1__state field from the StateMachine, and if -waiting is specified, it'll filter down to state machines that have a state value >= 0, meaning the state machines are waiting at an await point. For example, given this program: ```C# using System.Threading.Tasks; class Program { static async Task Main() { await MethodA(0); await MethodA(int.MaxValue); } static async Task MethodA(int delay) => await MethodB(delay); static async Task MethodB(int delay) { await Task.Yield(); await Task.Delay(delay); } } ``` using `!DumpAsync` outputs: ``` Address MT Size Name #0 0000026848693438 00007ff88ea35e58 120 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodB>d__2, test]] StateMachine: Program+<MethodB>d__2 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000008 0 System.Int32 1 instance -2 <>1__state 00007ff8e9bd82f8 4000009 8 ...TaskMethodBuilder 1 instance 0000026848693490 <>t__builder 00007ff8e9bc4bc0 400000a 4 System.Int32 1 instance 0 delay 00007ff8e9bee4d0 400000b 10 ...able+YieldAwaiter 1 instance 0000026848693498 <>u__1 00007ff8e9bcead0 400000c 18 ...vices.TaskAwaiter 1 instance 00000268486934a0 <>u__2 Continuation: 00000268486934b0 (System.Object) #1 0000026848693e68 00007ff88ea36cc8 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]] StateMachine: Program+<MethodA>d__1 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000004 0 System.Int32 1 instance -2 <>1__state 00007ff8e9bd82f8 4000005 8 ...TaskMethodBuilder 1 instance 0000026848693ec0 <>t__builder 00007ff8e9bc4bc0 4000006 4 System.Int32 1 instance 0 delay 00007ff8e9bcead0 4000007 10 ...vices.TaskAwaiter 1 instance 0000026848693ec8 <>u__1 Continuation: 00000268486934b0 (System.Object) #2 0000026848693ed8 00007ff88ea37188 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]] StateMachine: Program+<Main>d__0 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000001 0 System.Int32 1 instance 1 <>1__state 00007ff8e9bd82f8 4000002 8 ...TaskMethodBuilder 1 instance 0000026848693f30 <>t__builder 00007ff8e9bcead0 4000003 10 ...vices.TaskAwaiter 1 instance 0000026848693f38 <>u__1 Continuation: 0000026848693f48 (System.Threading.Tasks.Task+SetOnInvokeMres) #3 0000026848695d30 00007ff88ea35e58 120 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodB>d__2, test]] StateMachine: Program+<MethodB>d__2 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000008 0 System.Int32 1 instance 1 <>1__state 00007ff8e9bd82f8 4000009 8 ...TaskMethodBuilder 1 instance 0000026848695d88 <>t__builder 00007ff8e9bc4bc0 400000a 4 System.Int32 1 instance 2147483647 delay 00007ff8e9bee4d0 400000b 10 ...able+YieldAwaiter 1 instance 0000026848695d90 <>u__1 00007ff8e9bcead0 400000c 18 ...vices.TaskAwaiter 1 instance 0000026848695d98 <>u__2 Continuation: 0000026848695dd0 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]]) #4 0000026848695dd0 00007ff88ea36cc8 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]] StateMachine: Program+<MethodA>d__1 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000004 0 System.Int32 1 instance 0 <>1__state 00007ff8e9bd82f8 4000005 8 ...TaskMethodBuilder 1 instance 0000026848695e28 <>t__builder 00007ff8e9bc4bc0 4000006 4 System.Int32 1 instance 2147483647 delay 00007ff8e9bcead0 4000007 10 ...vices.TaskAwaiter 1 instance 0000026848695e30 <>u__1 Continuation: 0000026848693ed8 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]]) Found 5 state machines. ``` while using `!DumpAsync -waiting` outputs only: ``` Address MT Size Name #0 0000026848693ed8 00007ff88ea37188 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]] StateMachine: Program+<Main>d__0 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000001 0 System.Int32 1 instance 1 <>1__state 00007ff8e9bd82f8 4000002 8 ...TaskMethodBuilder 1 instance 0000026848693f30 <>t__builder 00007ff8e9bcead0 4000003 10 ...vices.TaskAwaiter 1 instance 0000026848693f38 <>u__1 Continuation: 0000026848693f48 (System.Threading.Tasks.Task+SetOnInvokeMres) #1 0000026848695d30 00007ff88ea35e58 120 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodB>d__2, test]] StateMachine: Program+<MethodB>d__2 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000008 0 System.Int32 1 instance 1 <>1__state 00007ff8e9bd82f8 4000009 8 ...TaskMethodBuilder 1 instance 0000026848695d88 <>t__builder 00007ff8e9bc4bc0 400000a 4 System.Int32 1 instance 2147483647 delay 00007ff8e9bee4d0 400000b 10 ...able+YieldAwaiter 1 instance 0000026848695d90 <>u__1 00007ff8e9bcead0 400000c 18 ...vices.TaskAwaiter 1 instance 0000026848695d98 <>u__2 Continuation: 0000026848695dd0 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]]) #2 0000026848695dd0 00007ff88ea36cc8 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]] StateMachine: Program+<MethodA>d__1 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000004 0 System.Int32 1 instance 0 <>1__state 00007ff8e9bd82f8 4000005 8 ...TaskMethodBuilder 1 instance 0000026848695e28 <>t__builder 00007ff8e9bc4bc0 4000006 4 System.Int32 1 instance 2147483647 delay 00007ff8e9bcead0 4000007 10 ...vices.TaskAwaiter 1 instance 0000026848695e30 <>u__1 Continuation: 0000026848693ed8 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]]) Found 3 state machines. ``` skipping the two state machines that have a `<>1__state` field value of -2 (meaning it's completed). Note that this change has the somewhat unfortunate impact of taking a dependency on what's effectively an implementation detail of Roslyn, but the value the filtering provides is deemed to be worth it. This design is unlikely to change in the future, and as with other diagnostic/debugging features that rely on such details, it can be updated if Roslyn ever changes its scheme. In the meantime, the code will output a warning message if it can't find the state field. 3. If a state machine is found to have 0 roots but also to have a <>1__state value >= 0, that suggests it was dropped without having been completed, which is likely a sign of an application bug. The command now prints out an information message to highlight that state. For example, this program: ```C# using System; using System.Threading.Tasks; class Program { static void Main() { Task.Run(async () => await new TaskCompletionSource<bool>().Task); Console.ReadLine(); } } ``` when processed with `!DumpAsync -roots` results in: ``` Address MT Size Name #0 0000020787fb5b30 00007ff88ea1afe8 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Boolean, System.Private.CoreLib],[Program+<>c+<<Main>b__0_0>d, test]] StateMachine: Program+<>c+<<Main>b__0_0>d (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000003 0 System.Int32 1 instance 0 <>1__state 00007ff8e9bd0b88 4000004 8 ...Private.CoreLib]] 1 instance 0000020787fb5b88 <>t__builder 00007ff8e9bffd58 4000005 10 ...Private.CoreLib]] 1 instance 0000020787fb5b90 <>u__1 Continuation: 0000020787fb3fc8 (System.Threading.Tasks.UnwrapPromise`1[[System.Boolean, System.Private.CoreLib]]) GC roots: Incomplete state machine (<>1__state == 0) with 0 roots. Found 1 state machines. ```
* find src/jit -type f -exec sed -i -e 's/gtLclVarCommon\./AsLclVarCommon()\./g' {} \; gtLclVarCommon * Format patch * Format patch * Format patch #3 * Remove redundant /
This change contains the remaining fixes I made while digging through the PR run in the runtime repo: 1) Some more repo-relative vs. coreclr-relative repo adjustments reflecting the migration process; 2) Make sure that XunitTestBinBase always ends with a directory separator, otherwise my recent fix for Common folder exclusion doesn't work; 3) Opportunistically shorten project names of two native interop test components - as the project name is repeated about 3-4 times in some intermediate paths, it quickly exceeds the standard Windows path length limit. Please note there's no functional change in the test. Thanks Tomas
Before this change structs on Arm64 and Amd64 Unix could
pessimize when we could fastTailCall if they were engregisterable
and took more than one register.
This updates the canFastTailCall function to allow this pattern
to be fastTailCalled.