AggressiveInlining not respected when Method grow too large #41692

HurricanKai · 2020-09-01T19:34:59Z

Description

JIT will refuse to inline AggressiveInlining methods when Methods grow too large (IL size / too many branches, I'm unsure which one, but I expect IL size)
If I understand AggressiveInlining correctly, this is not intended.

This happened to me in a method like the below, but I've encountered it before:

public IntPtr Load(int slot) {
    if (slot == SLOT_ONE)
    {
         return _slotOne;
    }
    // x 1000 or so

    ThrowHelper();
    return default;
}

This does not inline/constant fold, leading to quite the performance hit since now all those comparisons are actual comparisons.
Splitting this into a binary tree of sorts has led to JIT inlining it, and a good performance increase.

Analysis

I think this can be related to #38106
I don't understand JIT very well, but I think this can have multiple causes.

a deeper call tree, which stops the changes introduced in JIT: allow some aggressive inlines to go over budget #38163 from helping

It might be an idea to look for comparisons with constants that "end" the method and prioritize inlining? Looking at the inlinepolicy, which I think is where this kind of thing is decided, this would be difficult, but possible?

category:cq
theme:inlining
skill-level:intermediate
cost:medium

AndyAyersMS · 2020-09-01T19:55:01Z

Aggressive inline candidates are assumed to be profitable; they bypass the normal profitability checks. So in particular we won't analyze their IL for patterns that might be optimizable.

Can you provide a repro for the issues you're seeing? We are trying to be as aggressive as we can without getting into situations where the jit will create enormous methods or use huge amounts of memory while jitting.

HurricanKai · 2020-09-01T20:21:56Z

~~Sample.zip~~
the B struct contains a "simple" method with 4000 lookups, and while this is extreme, this is a real-world-like case. I'm writing a Generator that binds native methods, and am generating a table to cache the function pointers to those methods. I generate well over 4000 such cache slots.

EgorBo · 2020-09-01T20:34:36Z

Your repro doesn't have AggressiveInlining anywhere. While Lookup looks like it should be inlined to a single instruction when the arg is a constant I believe JIT doesn't want to spend time on analyzing big methods (because in 99% cases those won't be profitable to inline), e.g. in your case:

Inlines into 06000004 Test.Samples:AFive(Test.A):long:this
  [0 IL=0003 TR=000003 06000002] [FAILED: too many il bytes] Test.A:Lookup(int):long:this
Budget: initialTime=87, finalTime=87, initialBudget=870, currentBudget=870
Budget: initialSize=336, finalSize=33

if you put AggressiveInlining on top of Lookup it should work fine.

HurricanKai · 2020-09-01T21:16:29Z

Sorry. Not sure what I thought with that sample.

I've just reverted some changes, and it appears what is actually happening (I didn't know this was a thing when I first encountered this issue) is I'm hitting the m_CallsiteDepth == 1 limitation.
I think it works something like this: I'm generating a Load(int slot, string entrypoint) method, which just calls another method, which is the actual lookup table. This simple call is of course inlined, which "consumes" the top-level inline possibility? The actual lookup is way too big to be considered for lookup, and can't be considered for inlining anyways cause m_CallsiteDepth exceeds 1 already.

This is unfortunate, but I'm not too sure if JIT can do much about this (besides raising the limit, but I'm sure there are some reasons for the limit being this low). I'll definitely have to keep this in mind, since I often rely heavily on JIT eliminating branches like this at runtime...

Apologies again for the incorrect sample 😅
Also, here is the actual generated code + disassembly.
(Relevant method is GeneratedVTable:Load being inlined to GeneratedVTable:Load_2575_67097892)

JulieLeeMSFT · 2020-09-01T23:22:35Z

@AndyAyersMS Since you are looking into it, assigning this to you. Please help triage this.

AndyAyersMS · 2020-09-01T23:49:14Z

; Total bytes of code 119642, prolog size 10, PerfScore 43637.45, 
   (MethodHash=d89b1d10) for method GeneratedVTable:Load_2575_67097892(int,String):long:this

@HurricanKai thanks for the example. The jit should be able to tell after inlining such a method that it wasn't nearly as big as it seemed; I'm trying to think of how it could figure this out ahead of time, or along the way. Ahead of time seems hard; as @EgorBo says, we don't want to spend a lot of time analyzing large methods; and our analysis isn't all that powerful.

But checking along the way seems viable -- I think something like the following can work. When the jit sees an AggressiveInlining candidate that is projected to go over budget, it can go ahead and starts inlining anyways, keeping track in real time of how much of the method has been imported and continually checking that versus the budget. If in the middle of inline the jit goes over budget, wil then abandon the inline.

That way if the jit is really just touching a fraction of a huge method during importation then the inline can succeed. The only downside is that the jit may spend a bunch of cycles attempting inlines that later are abandoned, but that shouldn't happen too often, and the use of AggressiveInlining gives the jit some license to spend a bit more time jitting.

JulieLeeMSFT · 2020-09-02T17:28:33Z

Moving to .NET 6.

mikernet · 2020-12-02T15:29:02Z

@HurricanKai

Splitting this into a binary tree of sorts

Splitting methods with trial and error until they inline is somewhat annoying, but if I'm accurately picturing what you mean by a "binary tree of sorts" then breaking it up in a linear fashion like so may be easier and less tedious.

HurricanKai · 2020-12-02T15:31:56Z

Thank you, that is what I did for the functions were I did this manually, and only have few branches, but in some cases I have 1000+ cases and I believe a binary tree like that will yield less IL, so less binary size.

AndyAyersMS · 2021-04-21T02:27:45Z

@HurricanKai any way I can get a repro for this to test out some jit changes?

Is the one up above that is ~~crossed out~~ viable?

HurricanKai · 2021-04-21T07:10:45Z

I believe the https://github.com/dotnet/runtime/files/5158907/OpenGL.zip is the only one that actually reproduced this. I can look at making a simpler repro, if needed.

AndyAyersMS · 2021-04-21T15:40:12Z

It doesn't need to be simpler.

I'm not sure how to turn that into something I can compile.

… opportunities Look for IL patterns in an inlinee that are indicative of type folding. Also track when an inlinee has calls that are intrinsics. Use this to boost the inlinee profitability estimate. Also, allow an aggressive inline inline candidate method to go over the budget when it contains type folding or intrinsics (and so the method may be simpler than it appears). Remove the old criteria (top-level inline) which was harder to reason about. Closes dotnet#43761. Closes dotnet#41692. Closes dotnet#51587. Closes dotnet#47434.

AndyAyersMS · 2021-06-14T20:27:00Z

@EgorBo perhaps you can follow up here?

AndyAyersMS · 2021-07-12T19:49:00Z

Not sure we're going to get to this in 6.0, so moving to future.

HurricanKai added the tenet-performance Performance related issue label Sep 1, 2020

Dotnet-GitSync-Bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Sep 1, 2020

JulieLeeMSFT assigned AndyAyersMS Sep 1, 2020

AndyAyersMS removed the untriaged New issue has not been triaged by the area owner label Sep 1, 2020

JulieLeeMSFT added this to the 6.0.0 milestone Sep 2, 2020

AndyAyersMS mentioned this issue Oct 23, 2020

System.Numerics.Vector: Recognize division by a constant and inline #43761

Closed

BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020

AndyAyersMS mentioned this issue Nov 4, 2020

JIT: minor inliner refactoring #44215

Merged

BruceForstall removed the JitUntriaged CLR JIT issues needing additional triage label Nov 10, 2020

SingleAccretion mentioned this issue Jan 26, 2021

Weird inlining of methods optimized by JIT #47434

Closed

tonyredondo mentioned this issue Mar 22, 2021

CallTargetState and CallTargetReturn micro-optimizations DataDog/dd-trace-dotnet#1292

Merged

AndyAyersMS mentioned this issue Apr 21, 2021

JIT code optimization inconsistency #51587

Closed

AndyAyersMS mentioned this issue Apr 21, 2021

JIT: revise criteria for inline candidates with intrinsics or folding… #51634

Closed

ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 21, 2021

ghost removed the in-pr There is an active PR which will close this issue when it is merged label May 10, 2021

EgorBo mentioned this issue Jun 29, 2021

[JIT] Improve inliner: new heuristics, rely on PGO data #52708

Merged

AndyAyersMS modified the milestones: 6.0.0, Future Jul 12, 2021

EgorBo mentioned this issue Sep 12, 2021

Inliner: next steps #59002

Open

AndyAyersMS mentioned this issue Sep 22, 2021

Potential performance regression in OrdinalIgnoreCase string comparision #59087

Closed

This was referenced Nov 21, 2022

Casting via generic math doesn't always inline as expected #78648

Closed

Allow inlining in some over-bugdet cases #78874

Merged

ghost added the in-pr There is an active PR which will close this issue when it is merged label Nov 26, 2022

EgorBo modified the milestones: Future, 8.0.0 Nov 26, 2022

EgorBo assigned EgorBo and unassigned AndyAyersMS Nov 26, 2022

EgorBo closed this as completed in #78874 Dec 1, 2022

ghost removed the in-pr There is an active PR which will close this issue when it is merged label Dec 1, 2022

ghost locked as resolved and limited conversation to collaborators Dec 31, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AggressiveInlining not respected when Method grow too large #41692

AggressiveInlining not respected when Method grow too large #41692

HurricanKai commented Sep 1, 2020 •

edited by AndyAyersMS

Loading

AndyAyersMS commented Sep 1, 2020

HurricanKai commented Sep 1, 2020 •

edited

Loading

EgorBo commented Sep 1, 2020 •

edited

Loading

HurricanKai commented Sep 1, 2020

JulieLeeMSFT commented Sep 1, 2020

AndyAyersMS commented Sep 1, 2020

JulieLeeMSFT commented Sep 2, 2020

mikernet commented Dec 2, 2020 •

edited

Loading

HurricanKai commented Dec 2, 2020

AndyAyersMS commented Apr 21, 2021

HurricanKai commented Apr 21, 2021

AndyAyersMS commented Apr 21, 2021

AndyAyersMS commented Jun 14, 2021

AndyAyersMS commented Jul 12, 2021

AggressiveInlining not respected when Method grow too large #41692

AggressiveInlining not respected when Method grow too large #41692

Comments

HurricanKai commented Sep 1, 2020 • edited by AndyAyersMS Loading

Description

Analysis

AndyAyersMS commented Sep 1, 2020

HurricanKai commented Sep 1, 2020 • edited Loading

EgorBo commented Sep 1, 2020 • edited Loading

HurricanKai commented Sep 1, 2020

JulieLeeMSFT commented Sep 1, 2020

AndyAyersMS commented Sep 1, 2020

JulieLeeMSFT commented Sep 2, 2020

mikernet commented Dec 2, 2020 • edited Loading

HurricanKai commented Dec 2, 2020

AndyAyersMS commented Apr 21, 2021

HurricanKai commented Apr 21, 2021

AndyAyersMS commented Apr 21, 2021

AndyAyersMS commented Jun 14, 2021

AndyAyersMS commented Jul 12, 2021

HurricanKai commented Sep 1, 2020 •

edited by AndyAyersMS

Loading

HurricanKai commented Sep 1, 2020 •

edited

Loading

EgorBo commented Sep 1, 2020 •

edited

Loading

mikernet commented Dec 2, 2020 •

edited

Loading