JIT: Propagate assertions into natural loops in global morph #110501

hez2010 · 2024-12-07T12:22:00Z

Inspired by #109190, propagate assertions into natural loops in global morph as well, but don't publish the state.
Reuse the m_loop computed before.

Also, track the vector count flag in assertions, because we can now propagate a vector count constant into a loop bound to enable unrolling.

hez2010 · 2024-12-07T14:46:14Z

@MihuBot

hez2010 · 2024-12-07T14:58:22Z

Example diffs:

var config = new Config { MaxSize = 4 };
var sum = 0;
for (var i = 0; i < config.MaxSize; i++) sum += 42;
return sum;

struct Config
{
    public int MaxSize;
}

Before:

G_M24375_IG01:  ;; offset=0x0000
						;; size=0 bbWeight=0.25 PerfScore 0.00
G_M24375_IG02:  ;; offset=0x0000
       xor      eax, eax
       mov      ecx, 4
						;; size=7 bbWeight=0.25 PerfScore 0.12
G_M24375_IG03:  ;; offset=0x0007
       add      eax, 42
       dec      ecx
       jne      SHORT G_M24375_IG03
						;; size=7 bbWeight=4 PerfScore 6.00
G_M24375_IG04:  ;; offset=0x000E
       ret      
						;; size=1 bbWeight=1 PerfScore 1.00

After:

G_M24375_IG01:  ;; offset=0x0000
                                                ;; size=0 bbWeight=1 PerfScore 0.00
G_M24375_IG02:  ;; offset=0x0000
       mov      eax, 168
                                                ;; size=5 bbWeight=1 PerfScore 0.25
G_M24375_IG03:  ;; offset=0x0005
       ret
                                                ;; size=1 bbWeight=1 PerfScore 1.00

This is because the integer constant 4 (icon 4) now gets propagated into the loop bound, which allows unrolling to kick in.
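To make the effect concrete, here is a minimal C++ sketch of the idea, not the JIT's actual implementation. It assumes a toy assertion set that maps a local number to a known constant; the names `Assertions` and `tryFoldLoop` are hypothetical. Once the bound is known to be 4, the loop `sum += 42` can be evaluated at compile time.

```cpp
#include <cassert>
#include <map>

// Hypothetical toy model, not the JIT's data structures: an "assertion" here
// just records that local #N is known to equal a constant.
typedef std::map<int, int> Assertions;

// Models `for (i = 0; i < bound; i++) sum += addend;` where `bound` is a local.
// Returns true and stores the folded sum when an assertion gives the bound a
// constant value; returns false when the bound is unknown and the loop must
// stay a loop.
bool tryFoldLoop(const Assertions& assertions, int boundLcl, int addend, int* result)
{
    assert(result != nullptr);

    Assertions::const_iterator it = assertions.find(boundLcl);
    if (it == assertions.end())
        return false; // no constant reached the loop bound

    int sum = 0;
    for (int i = 0; i < it->second; i++)
        sum += addend; // the "unrolled" iterations, evaluated ahead of time
    *result = sum;
    return true;
}
```

With an assertion that the bound local equals 4 and an addend of 42, the sketch folds the loop to 168, matching the `mov eax, 168` in the diff above.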

hez2010 · 2024-12-07T19:04:29Z

How can the library tests fail only on Linux-musl-arm while passing on all other targets? Does this ring any bells?

EgorBo · 2024-12-07T19:14:29Z

> How can the library tests fail only on Linux-musl-arm while passing on all other targets? Does this ring any bells?

AFAIR, that is the only target in our CI where we build everything on a 64-bit host (Linux-x64), so you need to be careful with (s)size_t etc. (the JIT is a 64-bit binary prejitting for a 32-bit target).
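As an illustration of the kind of pitfall meant here, the following C++ sketch contrasts folding a constant with host-width arithmetic against folding it at the target's 32-bit width. The type and function names are hypothetical and are not taken from the JIT sources; `uint64_t` stands in for a 64-bit host `size_t` so the example behaves the same on any machine.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name for illustration: a pointer-sized unsigned integer
// as seen by a 32-bit target.
typedef uint32_t Target32Size;

// Folds `a + b` the way a 32-bit target would compute it: the result
// wraps modulo 2^32.
uint64_t foldAddForTarget32(uint64_t a, uint64_t b)
{
    return static_cast<Target32Size>(a + b);
}

// Folds `a + b` using 64-bit host-width arithmetic. This keeps the carry
// that a 32-bit target would discard, so the folded constant is wrong
// for that target.
uint64_t foldAddWithHostWidth(uint64_t a, uint64_t b)
{
    return a + b;
}
```

For `0xFFFFFFFF + 1`, the target-width fold yields 0 while the host-width fold yields `0x100000000`, which is why a cross-compiling JIT can behave differently from a native 32-bit build.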

hez2010 · 2024-12-08T15:35:14Z

The GenTree created on x64 is:

               [000058] DACXG------                         *  STORE_LCL_VAR struct<System.Int128, 16>(P) V03 loc2
                                                            *    long   field V03._lower (fldOffset=0x0) -> V14 tmp9
                                                            *    long   field V03._upper (fldOffset=0x8) -> V15 tmp10
               [000055] --CXG------                         \--*  CALL r2r_ind struct System.Int128:op_Addition(System.Int128,System.Int128):System.Int128
               [000053] ----------- arg0                       +--*  LCL_VAR   struct<System.Int128, 16>(P) V07 tmp2
                                                               +--*    long   field V07._lower (fldOffset=0x0) -> V18 tmp13         (last use)
                                                               +--*    long   field V07._upper (fldOffset=0x8) -> V19 tmp14         (last use)
               [000054] -------N--- arg1                       \--*  LCL_VAR   struct<System.Int128, 16>(P) V06 tmp1
                                                                  *    long   field V06._lower (fldOffset=0x0) -> V16 tmp11         (last use)
                                                                  *    long   field V06._upper (fldOffset=0x8) -> V17 tmp12         (last use)

while on arm32 it's:

               [000053] S-C-G------                         *  CALL r2r_ind void   System.Int128:op_Addition(System.Int128,System.Int128):System.Int128
               [000058] D---------- retbuf                  +--*  LCL_ADDR  byref  V03 loc2         [+0]
               [000052] ----------- arg1                    +--*  LCL_VAR   struct<System.Int128, 16>(P) V06 tmp1
                                                            +--*    long   field V06._lower (fldOffset=0x0) -> V09 tmp4          (last use)
                                                            +--*    long   field V06._upper (fldOffset=0x8) -> V10 tmp5          (last use)
               [000056] ----------- arg2                    \--*  LCL_VAR   struct<System.Int128, 16>(RB) V07 tmp2          (last use)

So we end up not clearing assertions for V03, which leads to bad codegen.
It seems that, when doing cross-block assertion propagation, we also need to clear assertions for a LCL_ADDR in the pred if its user has any side effects.
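The fix described above can be sketched as follows. This is a toy C++ model under stated assumptions, not the JIT's code: `Assertions`, `Call`, and `killAssertionsForCall` are hypothetical names, and an assertion is reduced to "local #N equals a constant". The point is that a call which receives a local's address (for example as a return buffer) may write to that local, so any assertion about it has to be dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical toy model: maps a local number to a constant asserted for it.
typedef std::map<int, int> Assertions;

// Hypothetical description of a call node.
struct Call
{
    bool             hasSideEffects;
    std::vector<int> addressTakenLocals; // locals passed by address, e.g. as a return buffer
};

// Drops every assertion about a local that the call may write through an address.
void killAssertionsForCall(Assertions& assertions, const Call& call)
{
    if (!call.hasSideEffects)
        return; // a side-effect-free user cannot modify the local

    for (std::size_t i = 0; i < call.addressTakenLocals.size(); i++)
        assertions.erase(call.addressTakenLocals[i]);
}
```

In the arm32 tree above, V03 is passed as `retbuf` via `LCL_ADDR`, so in this model it would appear in `addressTakenLocals` and lose its assertions, while unrelated locals keep theirs.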

hez2010 · 2024-12-08T19:20:21Z

Diffs

jakobbotsch · 2024-12-09T10:12:31Z

This is the same as #101763. I would pick that PR up and work off that as it had correctness fixes.

It seems the same regressions still exist, so the real work here is to figure out why they happen and how to avoid them.

hez2010 · 2024-12-09T17:27:43Z

@MihuBot

hez2010 · 2024-12-09T19:34:37Z

New diffs after increasing the assertion limit to 2x

The regressions are now mostly caused by newly unrolled loops, although the diffs are mostly improvements.

So the regressions we had before were caused by more assertions being created and hitting the limit, so that some assertions were no longer propagated. We need to tune the limit if we want this to go in.
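A rough C++ sketch of why a fixed limit can turn extra assertions into regressions: once the table is full, later assertions are silently dropped and can no longer be propagated. The class and the capacities used below are hypothetical and do not reflect the JIT's actual limit or table layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model of a fixed-capacity assertion table.
class BoundedAssertionTable
{
public:
    explicit BoundedAssertionTable(std::size_t limit) : m_limit(limit) {}

    // Returns false when the table is full and the assertion is dropped.
    bool tryAdd(int lclNum)
    {
        if (m_entries.size() >= m_limit)
            return false;
        m_entries.push_back(lclNum);
        return true;
    }

private:
    std::size_t      m_limit;
    std::vector<int> m_entries;
};

// Tries to add `requested` assertions and reports how many were dropped.
std::size_t countDropped(std::size_t limit, std::size_t requested)
{
    BoundedAssertionTable table(limit);
    std::size_t dropped = 0;
    for (std::size_t i = 0; i < requested; i++)
    {
        if (!table.tryAdd(static_cast<int>(i)))
            dropped++;
    }
    return dropped;
}
```

With an illustrative limit of 64, a method that generates 80 assertions loses 16 of them; doubling the limit to 128 keeps all 80, at the cost of more work per method.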

EgorBo · 2024-12-09T21:30:40Z

> New diffs after I increase the assertions limit to 2x

How do the diffs look if you increase the limit without your changes? As expected, the increased limit comes with a noticeable TP regression (~0.5%).

JulieLeeMSFT · 2025-01-06T17:26:31Z

@hez2010, the new change hides the perf regression. Please make the same change on base and rerun the diff and share with us.

> New diffs after I increase the assertions limit to 2x

> how do diffs look like if you increase the limit without your changes? As expected, the increased limit comes with a noticeable TP regression (~0.5%)

hez2010 · 2025-01-06T17:34:37Z

I'm converting this to a draft while I find out why the regression happens.

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 7, 2024

hez2010 changed the title ~~Propagate assertions into natural loops in global morph~~ JIT: Propagate assertions into natural loops in global morph Dec 7, 2024

dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Dec 7, 2024

MihuBot mentioned this pull request Dec 7, 2024

[JitDiff X64] [hez2010] JIT: Propagate assertions into natural loops in global morph MihuBot/runtime-utils#805

Open

MihuBot mentioned this pull request Dec 7, 2024

[JitDiff X64] [hez2010] JIT: Propagate assertions into natural loops in global morph MihuBot/runtime-utils#806

Open

hez2010 force-pushed the global-morph-natural-loop branch from 00efebf to decda0b Compare December 7, 2024 17:06

hez2010 marked this pull request as ready for review December 7, 2024 19:03

This was referenced Dec 7, 2024

iOS test fails with "App is not signed" #110395

Closed

[OSX]: AMDeviceSecureInstallApplicationBundle returned: 0xe800801c #110403

Open

build-analysis bot mentioned this pull request Dec 8, 2024

System.Formats.Nrbf.Tests timeouts #110285

Closed

hez2010 added 7 commits December 9, 2024 15:08

Propagate assertions into natural loops in global morph

daefef8

Format

defa118

Clear pred assertions if necessary

ab6c6b8

Use a better name

218fbd4

Keep tracking SIMD flag in assertion

e12478e

Fix 32 bit codegen

feef30a

Make it less conservative

4298c14

hez2010 force-pushed the global-morph-natural-loop branch from a76b7fc to 4298c14 Compare December 9, 2024 06:09

build-analysis bot mentioned this pull request Dec 9, 2024

linux-armel checked CoreCLR_NonPortable build failing in CI #110517

Closed

Fix the assertion count

dc8e022

hez2010 force-pushed the global-morph-natural-loop branch from d84fc00 to dc8e022 Compare December 9, 2024 15:15

Experiment

5fdbbe1

MihuBot mentioned this pull request Dec 9, 2024

[JitDiff X64] [hez2010] JIT: Propagate assertions into natural loops in global morph MihuBot/runtime-utils#809

Open

This was referenced Dec 10, 2024

slow macOS - "##[error]The job running on agent Azure Pipelines 9 ran longer than the maximum time of 60 minutes." dotnet/dnceng#1883

Open

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

hez2010 mentioned this pull request Dec 19, 2024

JIT: Extend escape analysis to account for arrays with non-gcref elements #104906

Open

JulieLeeMSFT added the needs-author-action An issue or pull request that requires more info or actions from the author. label Jan 6, 2025

dotnet-policy-service bot removed the needs-author-action An issue or pull request that requires more info or actions from the author. label Jan 6, 2025

hez2010 marked this pull request as draft January 6, 2025 17:35
