optimizer: supports callsite annotations of inlining, fixes #18773 #41328

aviatesk · 2021-06-23T11:26:20Z

Enable @inline/@noinline annotations on function callsites.
From #40754.

Now @inline and @noinline can be applied to a code block and then
the compiler will try to (or try not to) inline calls within the block:

@inline f(...) # The compiler will try to inline `f`

@inline f(...) + g(...) # The compiler will try to inline `f`, `g` and `+`

@inline f(args...) = ... # Of course, annotations on a definition are still allowed

Here are a couple of notes on how these callsite annotations work:

- A callsite annotation always takes precedence over the annotation
  applied to the definition of the called method, whichever of
  @inline/@noinline is used:

@inline function explicit_inline(args...)
    # body
end

let
    @noinline explicit_inline(args...) # this call will not be inlined
end

- When callsite annotations are nested, the innermost annotation takes
  precedence:

@noinline let a0, b0 = ...
    a = @inline f(a0)  # the compiler will try to inline this call
    b = notinlined(b0) # the compiler will NOT try to inline this call
    return a, b
end

Both behaviors are tested and documented.

Co-authored-by: Joseph Tan <jdtan638@gmail.com>
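As a rough, self-contained illustration of the precedence rule described above, the sketch below mixes definition-level and callsite annotations. The names `slow_square`, `fast_double`, and `demo` are hypothetical and not part of this PR, and the snippet assumes a Julia version that ships callsite annotations (1.8 or later).

```julia
# Sketch only: assumes Julia 1.8+ where callsite annotations are available.
# `slow_square`, `fast_double`, and `demo` are hypothetical names.

@noinline slow_square(x) = x * x   # definition-level request: do not inline
@inline fast_double(x) = 2x        # definition-level request: inline

function demo(x)
    a = @inline slow_square(x)     # callsite annotation asks to inline despite the definition's @noinline
    b = @noinline fast_double(x)   # callsite annotation asks NOT to inline despite the definition's @inline
    return a + b
end

demo(3)
```

The annotations only affect how the call is compiled, not what it computes. To see what the optimizer actually did, one can inspect `code_typed(demo, (Int,))`; the exact IR depends on the Julia version.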

aviatesk · 2021-06-26T07:01:26Z

base/compiler/optimize.jl

+        # when the source isn't available at this moment, try to re-infer and inline it
+        # NOTE we can make inference try to keep the source if the call is going to be inlined,
+        # but then inlining will depend on local state of inference and so the first entry
+        # and the succeeding ones may generate different code; rather we always re-infer
+        # the source to avoid the problem while it's obviously not most efficient
+        # HACK disable inlining for the re-inference to avoid cycles by making sure the following inference never comes here again
+        interp = NativeInterpreter(get_world_counter(interp); opt_params = OptimizationParams(; inlining = false))
+        src, rt = typeinf_code(interp, match.method, match.spec_types, match.sparams, true)
+        return src


I tried using the uncached source when the callsite is forced to be inlined (a1b1435), but it turns out that inlining then depends on the local state of inference, so the first entry and the succeeding ones may generate different code, e.g. in a test case I added:

function multilimited(a)
    if @noinline(isType(a))
        return @inline(multilimited(a.parameters[1]))
    else
        return rand(Bool) ? rand(a) : @inline(multilimited(a))
    end
end

# output might be different if we depend on inference local state
code_typed(m.multilimited, (Any,))
code_typed(m.multilimited, (Any,))

We might be able to keep the source so that it doesn't depend on the local state, but that could be tricky.
Instead, I'd like to always re-infer when a source is unavailable, at least in this first iteration.

/cc @vtjnash

vtjnash · 2021-08-19

I think this needs to happen during inference. It seems quite wrong to make a new NativeInterpreter here, and it can probably cause many other problems. We used to do some amount of recursion here, and we have gotten rid of it.

aviatesk · 2021-08-21

What do you think about fea41f2? We still need the "fresh optimization without inlining" hack because those uncached sources are not optimized.

aviatesk · 2021-08-24

Well, now I'm wondering if it's really useful. I benchmarked some functions that hit this pass, and I couldn't find any case where this "single-level inlining" helps performance. Honestly, I don't think it's worth the complexity that comes with this support.


tisztamo · 2021-06-28T09:35:36Z

Does callsite @inline force inlining completely, ignoring the cost model? (Also close #40292?) (re-commenting after accidental delete)

aviatesk · 2021-06-28T09:41:09Z

> Does callsite @inline force inlining completely, ignoring the cost model?

Callsite inlining will happen regardless of the cost model, but it does not "force inlining completely", since there are cases where the method body itself is unavailable (for example, recursive calls that are annotated with @inline).

I would appreciate it if you could test this PR on your target, though.
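The recursive case mentioned above can be sketched as follows. `countdown_sum` is a hypothetical function, not taken from this PR, and the snippet assumes Julia 1.8 or later. The callsite annotation is accepted, but since the callee is the function currently being compiled, the compiler may not be able to inline it; the result is the same either way.

```julia
# Sketch only: assumes Julia 1.8+; `countdown_sum` is a hypothetical name.
function countdown_sum(n)
    n <= 0 && return 0
    # The annotation is a request: for a recursive call the method body may be
    # unavailable at this point, so the call can remain a regular call.
    return n + @inline(countdown_sum(n - 1))
end

countdown_sum(4)
```

The parenthesized form `@inline(...)` is used here so that the annotation clearly covers only the recursive call.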

tisztamo · 2021-06-28T10:37:28Z

Just tested and it works, so this also closes #40292. Thank you very much!


After #41328, inference can observe statement flags and try to re-infer a discarded source if it's going to be inlined. The re-inferred source will only be cached into the inference-local cache, and won't be cached globally.

aviatesk · 2021-09-04T07:55:01Z

#42082 should fix the "serious" flaw with this PR, and now callsite inlining should work more reliably.


chriselrod · 2021-09-23T22:40:07Z

@oscardssmith If you haven't tried this out yet:

julia> function vexp!(y,x)
           @inbounds for i ∈ eachindex(y,x)
               y[i] = exp(x[i])
           end
       end
vexp! (generic function with 1 method)

julia> function vexp_inline!(y,x)
           @inbounds for i ∈ eachindex(y,x)
               y[i] = @inline exp(x[i])
           end
       end
vexp_inline! (generic function with 1 method)

julia> x = randn(256); y = similar(x);

julia> @benchmark vexp!($y, $x)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.234 μs …  2.336 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.329 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.334 μs ± 49.344 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                       █▄                                    ▁
  ▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅██▁▇▃▁▆▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▃▁▁▁▁▁▆▃▄▄▇▆▅▆ █
  1.23 μs      Histogram: log(frequency) by time     1.49 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vexp_inline!($y, $x)
BenchmarkTools.Trial: 10000 samples with 540 evaluations.
 Range (min … max):  212.893 ns … 316.361 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     213.315 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   214.200 ns ±   6.480 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▂▇█▇▃                   ▁▂▃▂▂▂▂                              ▂
  ▅█████▃▁▃▃▁▁▁▁▁▁▄▁▁▁▁▁▃▄▆████████▇██▇▇▇▆▅▄▃▅▄▅▃▁▄▄▃▄▄▄▄▄▅▃▄▄▅ █
  213 ns        Histogram: log(frequency) by time        220 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
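Before comparing timings like these, it can be worth confirming that both variants compute the same values. The sketch below restates the two loops with an added `return y` (an assumption made here so the results can be compared directly; the originals return `nothing`) and uses a deterministic input instead of `randn`. It assumes Julia 1.8 or later.

```julia
# Sketch only: assumes Julia 1.8+. The `return y` lines and the fixed input
# are additions for this check and are not part of the benchmark above.
function vexp!(y, x)
    @inbounds for i in eachindex(y, x)
        y[i] = exp(x[i])
    end
    return y
end

function vexp_inline!(y, x)
    @inbounds for i in eachindex(y, x)
        y[i] = @inline exp(x[i])   # callsite annotation on the `exp` call only
    end
    return y
end

x = collect(range(-2.0, 2.0; length = 256))
y_plain = vexp!(similar(x), x)
y_inline = vexp_inline!(similar(x), x)

# Compare approximately: the inlined loop may be compiled differently
# (for example vectorized), so bitwise equality is not assumed here.
y_plain ≈ y_inline
```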



aviatesk requested review from vtjnash and Keno June 23, 2021 11:26

aviatesk mentioned this pull request Jun 23, 2021

optimizer: support callsite annotations of @inline and @noinline #40754

Closed

aviatesk requested a review from JeffBezanson June 23, 2021 11:30

aviatesk force-pushed the avi/callsiteinline branch 4 times, most recently from 9a3673d to fe3e8c2 Compare June 26, 2021 06:51


aviatesk force-pushed the avi/callsiteinline branch from fe3e8c2 to 830cd9f Compare June 26, 2021 07:19

tisztamo mentioned this pull request Jun 28, 2021

More control over inlining #40292

Closed

tisztamo mentioned this pull request Jun 28, 2021

Improve performance JuliaFolds/FoldsCatwalk.jl#1

Open

aviatesk force-pushed the avi/callsiteinline branch 2 times, most recently from 0bb8e7d to ec1d492 Compare July 3, 2021 14:20

aviatesk force-pushed the avi/callsiteinline branch from ec1d492 to 147c1ea Compare July 7, 2021 05:14

aviatesk force-pushed the avi/callsiteinline branch from 147c1ea to 83b77e9 Compare July 14, 2021 16:44

aviatesk force-pushed the avi/callsiteinline branch from 83b77e9 to d6942b1 Compare July 25, 2021 04:09

timholy mentioned this pull request Aug 10, 2021

precompile interacts badly with Const-specialization #38983

Closed

aviatesk force-pushed the avi/callsiteinline branch from d6942b1 to 0c338b7 Compare August 13, 2021 15:22

vtjnash reviewed Aug 19, 2021: base/compiler/optimize.jl (resolved)

aviatesk force-pushed the avi/callsiteinline branch from 0c338b7 to 6406503 Compare August 20, 2021 16:37

vtjnash reviewed Aug 20, 2021: src/method.c (resolved)

vtjnash reviewed Aug 20, 2021: src/method.c (outdated, resolved)

aviatesk force-pushed the avi/callsiteinline branch 2 times, most recently from c251d3d to fea41f2 Compare August 21, 2021 16:47

aviatesk mentioned this pull request Aug 21, 2021

introduce @nospecializeinfer macro to tell the compiler to avoid excess inference #41931

Merged

aviatesk force-pushed the avi/callsiteinline branch from fea41f2 to 5557c2f Compare August 21, 2021 17:27

aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 1, 2021

update to JuliaLang/julia#41328

62b82e2

aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 1, 2021

update to JuliaLang/julia#41328 (#245)

195a403

aviatesk added a commit to JuliaDebug/Cthulhu.jl that referenced this pull request Sep 1, 2021

update to JuliaLang/julia#41328 (#223)

c72feb1


ranocha mentioned this pull request Sep 1, 2021

Think about callsite inlining when it's shipped officially trixi-framework/Trixi.jl#836

Open


aviatesk mentioned this pull request Sep 1, 2021

fix #42078, improve the idempotency of callsite inlining #42082

Merged

aviatesk mentioned this pull request Sep 4, 2021

supports @inline/@noinline annotations within a function body #41312

Merged

aviatesk mentioned this pull request Sep 4, 2021

update to https://github.com/JuliaLang/julia/pull/42082 JuliaDebug/Cthulhu.jl#224

Merged

timholy added the Add to Compat.jl label Sep 4, 2021

DilumAluthge removed the status:merge me PR is reviewed. Merge when all tests are passing label Sep 6, 2021

aviatesk added a commit to JuliaLang/Compat.jl that referenced this pull request Sep 11, 2021

add supports for new @inline and @noinline features

cfe0078

xrefs: - <JuliaLang/julia#41312> - <JuliaLang/julia#41328> Built on top of <#752>.

aviatesk mentioned this pull request Sep 11, 2021

add supports for new @inline and @noinline features JuliaLang/Compat.jl#753

Merged

aviatesk added a commit to JuliaLang/Compat.jl that referenced this pull request Sep 11, 2021

add supports for new @inline and @noinline features

a572195

xrefs: - <JuliaLang/julia#41312> - <JuliaLang/julia#41328> Built on top of <#752>.

aviatesk added a commit to JuliaLang/Compat.jl that referenced this pull request Sep 11, 2021

add supports for new @inline and @noinline features (#753)

cf3426b

xrefs: - <JuliaLang/julia#41312> - <JuliaLang/julia#41328>

JeffreySarnoff mentioned this pull request Mar 4, 2022

copypaste error in HISTORY.md #44458

Closed

JeffBezanson removed the needs news A NEWS entry is required for this change label Nov 10, 2022

aviatesk mentioned this pull request Feb 7, 2023

Callsite @constprop support #48570

Open

xlxs4 mentioned this pull request Mar 22, 2023

[doc] Outdated devdocs for :meta #49104

Open
