-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Conversation
@dotnet-bot test Windows_NT x64 corefx_baseline |
{ | ||
ref TaskAwaiter ta = ref Unsafe.As<TAwaiter, TaskAwaiter>(ref awaiter); // relies on TaskAwaiter/TaskAwaiter<T> having the same layout | ||
ref TaskAwaiter ta = ref Unsafe.As<TAwaiter, TaskAwaiter>(ref awaiter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably still needs the comment ?
// relies on TaskAwaiter/TaskAwaiter<T> having the same layout
As warning/info on the Unsafe.As
cast?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restored.
typeof(TAwaiter) == typeof(TaskAwaiter<byte>) || | ||
typeof(TAwaiter) == typeof(TaskAwaiter<int>) || | ||
typeof(TAwaiter) == typeof(TaskAwaiter<long>)) | ||
if (awaiter is ITaskAwaiter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WOO HOO!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up wrapping these new is
tests in a value class guard, otherwise cases where the arguments were byref-to-ref classes still ran through these interface tests -- take a look and make sure the new logic is reasonable.
ref TaskAwaiter ta = ref Unsafe.As<TAwaiter, TaskAwaiter>(ref awaiter); | ||
TaskAwaiter.UnsafeOnCompletedInternal(ta.m_task, box, continueOnCapturedContext: true); | ||
} | ||
else if (InterfaceIsCheckWorkaround<TAwaiter>.IsIConfiguredTaskAwaiter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The InterfaceIsCheckWorkaround class could then also be deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's gone now.
TypeHandle fromHnd = (TypeHandle) fromClass; | ||
TypeHandle toHnd = (TypeHandle) toClass; | ||
|
||
#ifdef FEATURE_COMINTEROP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: => Don't try to optimize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix, thanks.
src/vm/jitinterface.cpp
Outdated
else | ||
#endif // FEATURE_COMINTEROP | ||
|
||
// If the types are not shared, we can check directly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the types are not shared => if we don't have shared generic types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also might want to introduce two bools here:
bool toTypeIsShared = toType.IsCanonicalSubtype()
bool fromTypeIsShared = ...
|
||
JIT_TO_EE_TRANSITION(); | ||
|
||
TypeHandle fromHnd = (TypeHandle) fromClass; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to rename:
fromClass => fromHnd
toClass => toHnd
And
fromHnd => fromType
toHnd => toType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As handle implies that we are treating the local as an opaque thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Runtime code doesn't seem super-consistent here; I'll see if Jan has a preference.
Preliminary jit-diff impact below. Unlike many changes this change alters what methods are prejitted, as a number of value type instantiations are no longer needed. Hence the "reconciling methods" impact is fairly large. Beyond that there is still another ~13K reduction in code size in 300 methods that are prejitted in both cases. Still looking at diffs and there are some that need more scrutiny.
|
Some examples of diffs (will edit/update as I catalog these) If we're boxing a byref, we can clean up most of the box, but we leave what is effectively a null check, which is often unnecessary as the byref is dereferenced shortly thereafter. So we end up with -; Total bytes of code 39, prolog size 5 for method System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[Boolean][System.Boolean]:AwaitUnsafeOnCompleted(byref,byref):this
+; Total bytes of code 43, prolog size 5 for method System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[Boolean][System.Boolean]:AwaitUnsafeOnCompleted(byref,byref):this
-; Lcl frame size = 32
+; Lcl frame size = 48
G_M10573_IG01:
push rsi
- sub rsp, 32
+ sub rsp, 48
mov rsi, rdx
G_M10573_IG02:
mov rdx, r8
call System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[Boolean][System.Boolean]:GetStateMachineBox(byref):ref:this
mov rdx, rax
+ movsx r8, byte ptr [rsi]
movzx r8, byte ptr [rsi+8]
mov rcx, gword ptr [rsi]
call System.Runtime.CompilerServices.TaskAwaiter:UnsafeOnCompletedInternal(ref,ref,bool)
nop
G_M10573_IG03:
- add rsp, 32
+ add rsp, 48
pop rsi
ret In the before case we were using type equality to specialize and so did not introduce a box with an early dereference of the byref. Now we have one and can't get rid of it. Think we'll have to live with this one for now. Odd though that the frame size is larger; nothing is touching the extra frame space. I'll see if I can track this part down. |
Here's a good diff example, after version is basically the same as the after version above (with the same extra null check and a new wayward store of rdx into the stack) but the frame is even larger: -; Total bytes of code 427, prolog size 19 for method System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[Boolean][System.Boolean]:AwaitUnsafeOnCompleted(byref,byref):this
+; Total bytes of code 48, prolog size 10 for method System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[Boolean][System.Boolean]:AwaitUnsafeOnCompleted(byref,byref):this
-; Lcl frame size = 168
-G_M10575_IG01:
- push rdi
+; Lcl frame size = 96
+G_M10573_IG01:
push rsi
- push rbp
- push rbx
- sub rsp, 168
- mov qword ptr [rsp+A0H], rdx
- mov rsi, rdx
- mov rdi, r8
-G_M10575_IG02:
+ sub rsp, 96
+ mov qword ptr [rsp+58H], rdx
+ mov rsi, r8
+G_M10573_IG02:
mov rdx, r9
call System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[Boolean][System.Boolean]:GetStateMachineBox(byref):ref:this
- mov rbx, rax
- mov rbp, qword ptr [rsi+56]
- mov r8, qword ptr [rbp]
- mov rcx, r8
- and ecx, 1
- mov rdx, r8
- test ecx, ecx
- je SHORT G_M10575_IG03
- mov rdx, qword ptr [rdx-1]
-G_M10575_IG03:
- lea rax, [(reloc)]
- cmp rdx, rax
- je G_M10575_IG10
- mov rdx, r8
- test ecx, ecx
- je SHORT G_M10575_IG04
- mov rdx, qword ptr [rdx-1]
-G_M10575_IG04:
- lea rax, [(reloc)]
- cmp rdx, rax
- je SHORT G_M10575_IG10
- mov rdx, r8
- test ecx, ecx
- je SHORT G_M10575_IG05
- mov rdx, qword ptr [rdx-1]
-G_M10575_IG05:
- lea rax, [(reloc)]
- cmp rdx, rax
- je SHORT G_M10575_IG10
- mov rdx, r8
- test ecx, ecx
- je SHORT G_M10575_IG06
- mov rdx, qword ptr [rdx-1]
-G_M10575_IG06:
- lea rax, [(reloc)]
- cmp rdx, rax
- je SHORT G_M10575_IG10
- mov rdx, r8
- test ecx, ecx
- je SHORT G_M10575_IG07
- mov rdx, qword ptr [rdx-1]
-G_M10575_IG07:
- lea rax, [(reloc)]
- cmp rdx, rax
- je SHORT G_M10575_IG10
- mov rdx, r8
- test ecx, ecx
- je SHORT G_M10575_IG08
- mov rdx, qword ptr [rdx-1]
-G_M10575_IG08:
- lea rax, [(reloc)]
- cmp rdx, rax
- je SHORT G_M10575_IG10
- test ecx, ecx
- je SHORT G_M10575_IG09
- mov r8, qword ptr [r8-1]
-G_M10575_IG09:
- lea rcx, [(reloc)]
- cmp r8, rcx
- jne SHORT G_M10575_IG12
-G_M10575_IG10:
- movzx r8, byte ptr [rdi+8]
- mov rcx, gword ptr [rdi]
- mov rdx, rbx
+ mov rdx, rax
+ movsx r8, byte ptr [rsi]
+ movzx r8, byte ptr [rsi+8]
+ mov rcx, gword ptr [rsi]
call System.Runtime.CompilerServices.TaskAwaiter:UnsafeOnCompletedInternal(ref,ref,bool)
nop
-G_M10575_IG11:
- add rsp, 168
- pop rbx
- pop rbp
+G_M10573_IG03:
+ add rsp, 96
pop rsi
- pop rdi
ret
-G_M10575_IG12:
- mov rcx, qword ptr [rbp+24]
- test rcx, rcx
- jne SHORT G_M10575_IG13
- mov rcx, rsi
- lea rdx, [(reloc)]
- call CORINFO_HELP_RUNTIMEHANDLE_METHOD
- mov rcx, rax
-G_M10575_IG13:
- call CORINFO_HELP_GETGENERICS_NONGCSTATIC_BASE
- cmp byte ptr [rax+8], 0
- je SHORT G_M10575_IG15
- mov rcx, gword ptr [rdi]
- mov rdx, rbx
- mov r8d, 1
- call System.Runtime.CompilerServices.TaskAwaiter:UnsafeOnCompletedInternal(ref,ref,bool)
- nop
-G_M10575_IG14:
- add rsp, 168
- pop rbx
- pop rbp
- pop rsi
- pop rdi
- ret
-G_M10575_IG15:
- mov rcx, qword ptr [rbp+24]
- test rcx, rcx
- jne SHORT G_M10575_IG16
- mov rcx, rsi
- lea rdx, [(reloc)]
- call CORINFO_HELP_RUNTIMEHANDLE_METHOD
- mov rcx, rax
-G_M10575_IG16:
- call CORINFO_HELP_GETGENERICS_NONGCSTATIC_BASE
- cmp byte ptr [rax+9], 0
- je SHORT G_M10575_IG18
- movzx r8, byte ptr [rdi+8]
- mov rcx, gword ptr [rdi]
- mov rdx, rbx
- call System.Runtime.CompilerServices.TaskAwaiter:UnsafeOnCompletedInternal(ref,ref,bool)
- nop
-G_M10575_IG17:
- add rsp, 168
- pop rbx
- pop rbp
- pop rsi
- pop rdi
- ret
-G_M10575_IG18:
- mov rcx, qword ptr [rbp+32]
- test rcx, rcx
- jne SHORT G_M10575_IG19
- mov rcx, rsi
- lea rdx, [(reloc)]
- call CORINFO_HELP_RUNTIMEHANDLE_METHOD
- mov rcx, rax
-G_M10575_IG19:
- mov rdx, rdi
- mov r8, rbx
- call System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[Boolean][System.Boolean]:AwaitArbitraryAwaiterUnsafeOnCompleted(byref,ref)
- nop
-G_M10575_IG20:
- add rsp, 168
- pop rbx
- pop rbp
- pop rsi
- pop rdi
- ret |
Calculates frame size on pre-optimized code? https://github.com/dotnet/coreclr/issues/9066 |
Probably a bigger issue is that the jit does not track types for byrefs to reference types, so we end up doing interface checks where the old code could do equality checks. -; Total bytes of code 226, prolog size 19 for method System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[VoidTaskResult][System.Threading.Tasks.VoidTaskResult]:AwaitUnsafeOnCompleted(byref,byref):this
-; Lcl frame size = 248
+; Total bytes of code 178, prolog size 13 for method System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[VoidTaskResult][System.Threading.Tasks.VoidTaskResult]:AwaitUnsafeOnCompleted(byref,byref):this
+; Lcl frame size = 120
G_M63121_IG01:
push rdi
push rsi
push rbp
push rbx
- sub rsp, 248
- mov qword ptr [rsp+F0H], rdx
+ sub rsp, 120
+ mov qword ptr [rsp+70H], rdx
mov rsi, rdx
mov rdi, r8
G_M63121_IG02:
mov rdx, r9
call System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[VoidTaskResult][System.Threading.Tasks.VoidTaskResult]:GetStateMachineBox(byref):ref:this
mov rbx, rax
- mov rbp, qword ptr [rsi+56]
- mov rcx, qword ptr [rbp+16]
- test rcx, rcx
- jne SHORT G_M63121_IG03
- mov rcx, rsi
- lea rdx, [(reloc)]
- call CORINFO_HELP_RUNTIMEHANDLE_METHOD
- mov rcx, rax
-G_M63121_IG03:
- call CORINFO_HELP_GETGENERICS_NONGCSTATIC_BASE
- cmp byte ptr [rax+8], 0
- je SHORT G_M63121_IG05
- mov rcx, gword ptr [rdi]
+ mov rbp, gword ptr [rdi]
+ mov rdx, rbp
+ lea rcx, [(reloc)]
+ call CORINFO_HELP_ISINSTANCEOFINTERFACE
+ test rax, rax
+ je SHORT G_M63121_IG04
+ mov rcx, rbp
mov rdx, rbx
mov r8d, 1
call System.Runtime.CompilerServices.TaskAwaiter:UnsafeOnCompletedInternal(ref,ref,bool)
nop
-G_M63121_IG04:
- add rsp, 248
+G_M63121_IG03:
+ add rsp, 120
pop rbx
pop rbp
pop rsi
pop rdi
ret
-G_M63121_IG05:
- mov rcx, qword ptr [rbp+16]
- test rcx, rcx
- jne SHORT G_M63121_IG06
- mov rcx, rsi
- lea rdx, [(reloc)]
- call CORINFO_HELP_RUNTIMEHANDLE_METHOD
- mov rcx, rax
-G_M63121_IG06:
- call CORINFO_HELP_GETGENERICS_NONGCSTATIC_BASE
- cmp byte ptr [rax+9], 0
- je SHORT G_M63121_IG08
+G_M63121_IG04:
+ mov rdx, rbp
+ lea rcx, [(reloc)]
+ call CORINFO_HELP_ISINSTANCEOFINTERFACE
+ test rax, rax
+ je SHORT G_M63121_IG06
movzx r8, byte ptr [rdi+8]
- mov rcx, gword ptr [rdi]
+ mov rcx, rbp
mov rdx, rbx
call System.Runtime.CompilerServices.TaskAwaiter:UnsafeOnCompletedInternal(ref,ref,bool)
nop
-G_M63121_IG07:
- add rsp, 248
+G_M63121_IG05:
+ add rsp, 120
pop rbx
pop rbp
pop rsi
pop rdi
ret
-G_M63121_IG08:
- mov rcx, qword ptr [rbp+24]
+G_M63121_IG06:
+ mov rcx, qword ptr [rsi+56]
+ mov rcx, qword ptr [rcx+16]
test rcx, rcx
- jne SHORT G_M63121_IG09
+ jne SHORT G_M63121_IG07
mov rcx, rsi
lea rdx, [(reloc)]
call CORINFO_HELP_RUNTIMEHANDLE_METHOD
mov rcx, rax
-G_M63121_IG09:
+G_M63121_IG07:
mov rdx, rdi
mov r8, rbx
call System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[VoidTaskResult][System.Threading.Tasks.VoidTaskResult]:AwaitArbitraryAwaiterUnsafeOnCompleted(byref,ref)
nop
-G_M63121_IG10:
- add rsp, 248
+G_M63121_IG08:
+ add rsp, 120
pop rbx
pop rbp
pop rsi
pop rdi
ret I think it is should not be that hard to extend type tracking to byrefs of reference types, at least for locals and args and temps, and hopefully end up in a better place. We already have a "byref" bit in the locals table and room there for a class handle. But that is perhaps best done as a prerequisite change. [Edit] in this case we have a byref to canon, so there's no value add in tracking that, as we will not be able to resolve the casts. Instead we can guard the interface checks so they only apply to value classes. |
Think this is getting close, the extra "value class" guard in AwaitUnsafeOnCompleted has removed the regressions from the cases where the arguments are byrefs to ref classes. Jit-diffs now shows:
The big regression in This points out a shortcoming in the logic used by the "never inline" classifier in prejitting -- it should be setting the bar high enough that no matter what order the methods are seen when prejitting, the classifier will not end up blocking any inlines -- it should just help the jit reject failing inlines faster. Clearly we're not there yet, but this is the first time I've run across this. Tracking it via #14441. |
Bad example from above fixed with latest changes: -; Total bytes of code 226, prolog size 19 for method System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[VoidTaskResult][System.Threading.Tasks.VoidTaskResult]:AwaitUnsafeOnCompleted(byref,byref):this
-; Lcl frame size = 248
+; Total bytes of code 80, prolog size 12 for method System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[VoidTaskResult][System.Threading.Tasks.VoidTaskResult]:AwaitUnsafeOnCompleted(byref,byref):this
+; Lcl frame size = 96
G_M63121_IG01:
push rdi
push rsi
- push rbp
push rbx
- sub rsp, 248
- mov qword ptr [rsp+F0H], rdx
+ sub rsp, 96
+ mov qword ptr [rsp+58H], rdx
mov rsi, rdx
mov rdi, r8
G_M63121_IG02:
mov rdx, r9
call System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[VoidTaskResult][System.Threading.Tasks.VoidTaskResult]:GetStateMachineBox(byref):ref:this
mov rbx, rax
- mov rbp, qword ptr [rsi+56]
- mov rcx, qword ptr [rbp+16]
+ mov rcx, qword ptr [rsi+56]
+ mov rcx, qword ptr [rcx+16]
test rcx, rcx
jne SHORT G_M63121_IG03
mov rcx, rsi
@@ -22,62 +21,14 @@ G_M63121_IG02:
call CORINFO_HELP_RUNTIMEHANDLE_METHOD
mov rcx, rax
G_M63121_IG03:
- call CORINFO_HELP_GETGENERICS_NONGCSTATIC_BASE
- cmp byte ptr [rax+8], 0
- je SHORT G_M63121_IG05
- mov rcx, gword ptr [rdi]
- mov rdx, rbx
- mov r8d, 1
- call System.Runtime.CompilerServices.TaskAwaiter:UnsafeOnCompletedInternal(ref,ref,bool)
- nop
-G_M63121_IG04:
- add rsp, 248
- pop rbx
- pop rbp
- pop rsi
- pop rdi
- ret
-G_M63121_IG05:
- mov rcx, qword ptr [rbp+16]
- test rcx, rcx
- jne SHORT G_M63121_IG06
- mov rcx, rsi
- lea rdx, [(reloc)]
- call CORINFO_HELP_RUNTIMEHANDLE_METHOD
- mov rcx, rax
-G_M63121_IG06:
- call CORINFO_HELP_GETGENERICS_NONGCSTATIC_BASE
- cmp byte ptr [rax+9], 0
- je SHORT G_M63121_IG08
- movzx r8, byte ptr [rdi+8]
- mov rcx, gword ptr [rdi]
- mov rdx, rbx
- call System.Runtime.CompilerServices.TaskAwaiter:UnsafeOnCompletedInternal(ref,ref,bool)
- nop
-G_M63121_IG07:
- add rsp, 248
- pop rbx
- pop rbp
- pop rsi
- pop rdi
- ret
-G_M63121_IG08:
- mov rcx, qword ptr [rbp+24]
- test rcx, rcx
- jne SHORT G_M63121_IG09
- mov rcx, rsi
- lea rdx, [(reloc)]
- call CORINFO_HELP_RUNTIMEHANDLE_METHOD
- mov rcx, rax
-G_M63121_IG09:
mov rdx, rdi
mov r8, rbx
call System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[VoidTaskResult][System.Threading.Tasks.VoidTaskResult]:AwaitArbitraryAwaiterUnsafeOnCompleted(byref,ref)
nop
-G_M63121_IG10:
- add rsp, 248
+G_M63121_IG04:
+ add rsp, 96
pop rbx
- pop rbp
pop rsi
pop rdi
ret |
Failures in CI seem unrelated -- arm timeouts and x86 perf post processing data format issues. May retry tomorrow or let it happen organically after PR feedback. Changes have also passed desktop DDRs (save for SPMI which is still on the old guid) Removing the do not merge and submitting for official review. @dotnet/jit-contrib PTAL |
Also @stephentoub PTAL at changes to the async method builder code. |
typeof(TAwaiter) == typeof(TaskAwaiter<byte>) || | ||
typeof(TAwaiter) == typeof(TaskAwaiter<int>) || | ||
typeof(TAwaiter) == typeof(TaskAwaiter<long>)) | ||
if (null != (object)default(TAwaiter)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any benefit in extending this guard to also cover the branches for ValueTaskAwaiter and ConfiguredValueTaskAwaiter? Those are both structs as well. Or maybe it doesn't matter because the JIT is already good at the explicit type comparisons used there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I don't think as written this is correct. This condition will effectively be true for any non-nullable struct, right? In which case, this change causes the below ValueTaskAwaiter optimizations to be skipped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think you're right. I should inline the value class checks instead, something like:
if ((null != (object)default(TAwaiter)) && (awaiter is ITaskAwaiter))
{
ref TaskAwaiter ta = ref Unsafe.As<TAwaiter, TaskAwaiter>(ref awaiter); // relies on TaskAwaiter/TaskAwaiter<T> having the same layout
TaskAwaiter.UnsafeOnCompletedInternal(ta.m_task, box, continueOnCapturedContext: true);
}
else if ((null != (object)default(TAwaiter)) && (awaiter is IConfiguredTaskAwaiter))
{
ref ConfiguredTaskAwaitable.ConfiguredTaskAwaiter ta = ref Unsafe.As<TAwaiter, ConfiguredTaskAwaitable.ConfiguredTaskAwaiter>(ref awaiter);
TaskAwaiter.UnsafeOnCompletedInternal(ta.m_task, box, ta.m_continueOnCapturedContext);
}
@dotnet-bot test Windows_NT x64 corefx_baseline |
typeof(TAwaiter) == typeof(TaskAwaiter<byte>) || | ||
typeof(TAwaiter) == typeof(TaskAwaiter<int>) || | ||
typeof(TAwaiter) == typeof(TaskAwaiter<long>)) | ||
if ((null != (object)default(TAwaiter)) && (awaiter is ITaskAwaiter)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth a comment indicating why the null check is necessary? Or you'll be following up with subsequent changes to remove that, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The jit needs this check or it will leave casts around in some instantations. I will add a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long the boxing is gone / the resulting asm looks good to you, the AsyncMethodBuilder.cs changes look good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas PTAL at runtime side
LGTM
@stephentoub here are method sizes for various instantations of The value class check is needed for the cases where the byref args are
|
@dotnet/jit-contrib anybody up for approving this? I will squash down when I merge. |
Implement the jit interface compareTypesForEquality method to handle casts from known types to known types, and from shared types to certain interface types. Call this method in the jit for castclass and isinst, using `gtGetClassHandle` to obtain the from type. Optimize sucessful casts and unsuccessful isinsts when the from type is known exactly. Undo part of the type-equality based optimization/workaround in the AsyncMethodBuilder code that was introduced in dotnet#14178 in favor of interface checks. There is more here that can be done here before this issue is entirely closed and I will look at this subsequently. This optimization allows the jit to remove boxes that are used solely to feed type casts, and so closes #12877.
129606c
to
7532f08
Compare
Squashed and updated commit text. @dotnet-bot test Windows_NT x64 corefx_baseline |
@sandreenko can you review? In particular the changes to the jit. |
|
||
op2 = impTokenToHandle(&resolvedToken, nullptr, FALSE); | ||
if (op2 == nullptr) | ||
{ // compDonotInline() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the comment compDonotInline() mean here?
Should it be deleted or tested?
It is tested later on on line 14400
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can change it to an assert that compilation failed. assert(compDonotInline());
|
||
op2 = impTokenToHandle(&resolvedToken, nullptr, FALSE); | ||
if (op2 == nullptr) | ||
{ // compDonotInline() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again we have a comment compDonotInline()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's address those in a subsequent PR, as there are a bunch more of them scattered about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Jit part looks good.
src/jit/importer.cpp
Outdated
// impOptimizeCastClassOrIsInst: attempt to resolve a cast when jitting | ||
// | ||
// Arguments: | ||
// op1 -- value to cast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Arguments:
// <argument1-name> - Description of argument 1
// <argument2-name> - Description of argument 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed (though I happen to like --
better).
src/jit/importer.cpp
Outdated
@@ -10203,6 +10296,7 @@ void Compiler::impImportBlockCode(BasicBlock* block) | |||
var_types lclTyp, ovflType = TYP_UNKNOWN; | |||
GenTreePtr op1 = DUMMY_INIT(NULL); | |||
GenTreePtr op2 = DUMMY_INIT(NULL); | |||
GenTree* optTree = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to create new scopes under CEE_ISINST
and CASTCLASS:
and declare optTree
there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do.
Updated to address PR feedback. Also added a few targeted test cases I'd worked up. |
🎉 |
Work in progress. PR to get some broader testing.