Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Mark replacements as dirty after setting GTF_VAR_DEATH #88669

Merged
merged 2 commits into from
Jul 13, 2023

Conversation

jakobbotsch
Copy link
Member

Physically promoted struct locals are marked as dying when their remainder dies since all other state is stored in other locals. However, when we do this we must also mark all replacements as stale; otherwise a future struct use could skip writebacks and effectively introduce new uses, invalidating the previously marked last use bit.

Fix #88616

Note that the regression test introduced here requires #88665 to actually fail.

Physically promoted struct locals are marked as dying when their
remainder dies since all other state is stored in other locals. However,
when we do this we must also mark all replacements as stale; otherwise
a future struct use could skip writebacks and effectively introduce new
uses, invalidating the previously marked last use bit.

Fix dotnet#88616
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 11, 2023
@ghost ghost assigned jakobbotsch Jul 11, 2023
@ghost
Copy link

ghost commented Jul 11, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Physically promoted struct locals are marked as dying when their remainder dies since all other state is stored in other locals. However, when we do this we must also mark all replacements as stale; otherwise a future struct use could skip writebacks and effectively introduce new uses, invalidating the previously marked last use bit.

Fix #88616

Note that the regression test introduced here requires #88665 to actually fail.

Author: jakobbotsch
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Jul 11, 2023

cc @dotnet/jit-contrib PTAL @AndyAyersMS

No diffs.

Looks like I'll need to rerun CI; license/cla seems to be stuck.

Diff in test case:

@@ -1,30 +1,30 @@
 ; Assembly listing for method Runtime_88616:TestEntryPoint():int (FullOpts)
 ; Emitting BLENDED_CODE for X64 with AVX - Windows
 ; FullOpts code
 ; optimized code
 ; rsp based frame
 ; partially interruptible
 ; No PGO data
 ; Final local variable assignments
 ;
-;  V00 loc0         [V00    ] (  7,  7   )  struct (40) [rsp+20H]   do-not-enreg[XSF] must-init addr-exposed ld-addr-op
+;  V00 loc0         [V00    ] ( 12, 12   )  struct (40) [rsp+20H]   do-not-enreg[XSF] must-init addr-exposed ld-addr-op
 ;* V01 loc1         [V01    ] (  0,  0   )    long  ->  zero-ref   
 ;  V02 OutArgs      [V02    ] (  1,  1   )  struct (32) [rsp+00H]   do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
 ;* V03 tmp1         [V03    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
 ;* V04 tmp2         [V04    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V05 tmp3         [V05    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V06 tmp4         [V06    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V07 tmp5         [V07    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
 ;* V08 tmp6         [V08    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
 ;* V09 tmp7         [V09    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
 ;* V10 tmp8         [V10    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V11 tmp9         [V11    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V12 tmp10        [V12    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V13 tmp11        [V13    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
 ;* V14 tmp12        [V14    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
 ;* V15 tmp13        [V15    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
 ;* V16 tmp14        [V16    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V17 tmp15        [V17    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V18 tmp16        [V18    ] (  0,  0   )    long  ->  zero-ref    "impAppendStmt"
 ;* V19 tmp17        [V19    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
 ;* V20 tmp18        [V20    ] (  0,  0   )    long  ->  zero-ref    "dup spill"
@@ -47,38 +47,37 @@
 ;* V37 tmp35        [V37    ] (  0,  0   )    long  ->  zero-ref    "V00.[032..040)"
 ;
 ; Lcl frame size = 72
 
 G_M45268_IG01:  ;; offset=0000H
        sub      rsp, 72
        vxorps   xmm4, xmm4, xmm4
        vmovdqa  xmmword ptr [rsp+20H], xmm4
        vmovdqa  xmmword ptr [rsp+30H], xmm4
        xor      eax, eax
        mov      qword ptr [rsp+40H], rax
 						;; size=27 bbWeight=1 PerfScore 5.83
 G_M45268_IG02:  ;; offset=001BH
        mov      qword ptr [rsp+20H], 50
        mov      qword ptr [rsp+28H], 100
        mov      qword ptr [rsp+30H], 150
        mov      qword ptr [rsp+38H], 200
        mov      qword ptr [rsp+40H], 250
        lea      rcx, [rsp+20H]
        call     [Runtime_88616:Mutate(Runtime_88616+S)]
+       mov      qword ptr [rsp+20H], 50
+       mov      qword ptr [rsp+28H], 100
+       mov      qword ptr [rsp+30H], 150
+       mov      qword ptr [rsp+38H], 200
+       mov      qword ptr [rsp+40H], 250
        lea      rcx, [rsp+20H]
        call     [Runtime_88616:Check(Runtime_88616+S):int]
        nop      
-						;; size=68 bbWeight=1 PerfScore 12.25
-G_M45268_IG03:  ;; offset=005FH
+						;; size=113 bbWeight=1 PerfScore 17.25
+G_M45268_IG03:  ;; offset=008CH
        add      rsp, 72
        ret      
 						;; size=5 bbWeight=1 PerfScore 1.25
 
-; Total bytes of code 100, prolog size 27, PerfScore 29.33, instruction count 18, allocated bytes for code 100 (MethodHash=35b94f2b) for method Runtime_88616:TestEntryPoint():int (FullOpts)
+; Total bytes of code 145, prolog size 27, PerfScore 38.83, instruction count 23, allocated bytes for code 145 (MethodHash=35b94f2b) for method Runtime_88616:TestEntryPoint():int (FullOpts)
 ; ============================================================
 
-FAIL: s.A == -42
-Xunit.Sdk.EqualException: Assert.Equal() Failure
-Expected: 100
-Actual:   -1
-   at Xunit.Assert.Equal[T](T expected, T actual, IEqualityComparer`1 comparer) in /_/src/xunit.assert/Asserts/EqualityAsserts.cs:line 96
-   at __GeneratedMainWrapper.Main()

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused about something here -- are we somehow avoiding making a copy of foo to pass to Mutate?

@AndyAyersMS
Copy link
Member

I am confused about something here -- are we somehow avoiding making a copy of foo to pass to Mutate?

Your repro case passes for me w/o your fix and makes a copy to pass to Mutate

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Jul 13, 2023

I am confused about something here -- are we somehow avoiding making a copy of foo to pass to Mutate?

Your repro case passes for me w/o your fix and makes a copy to pass to Mutate

Yes, we avoid the copy, but the particular test I have introduced requires #88665 to actually avoid the copy and expose the problem. Without that change we do not mark foo as a last use when passed to the first call because we think the second appearance uses the remainder (even though there is none).

The Fuzzlyn found test case is more complex but repros the bug without #88665. The found test case there has an actual unpromoted remainder and redefines it between the two calls which has the effect of both uses of vr37 being marked as last uses.

@jakobbotsch
Copy link
Member Author

/azp run runtime, runtime-coreclr superpmi-diffs

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@jakobbotsch
Copy link
Member Author

There's a couple of diffs now with this change after #88665 was merged. The diffs failure is some internal server error from Helix. The other failures are the known file check ones, it seems.

@jakobbotsch jakobbotsch merged commit ddbce91 into dotnet:main Jul 13, 2023
128 of 135 checks passed
@jakobbotsch jakobbotsch deleted the fix-88616 branch July 13, 2023 20:36
@ghost ghost locked as resolved and limited conversation to collaborators Aug 13, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fuzzlyn failure on linux arm64
2 participants