
perform inference using optimizer-derived type information #56687

Open

wants to merge 1 commit into base: master
Conversation

aviatesk
Member

In certain cases, the optimizer can introduce new type information. This is particularly evident in SROA, where load forwarding can reveal type information that was not visible during abstract interpretation. In such cases, re-running abstract interpretation with this new type information can be highly valuable; currently, however, this only occurs when semi-concrete interpretation happens to be triggered.

This commit introduces a new "post-optimization inference" phase at the end of the optimizer pipeline. When the optimizer derives new type information, this phase performs IR abstract interpretation to further optimize the IR.

Such cases are especially common in patterns involving captured
variables, as discussed in #15276. By combining the
"post-optimization inference" implemented in this commit with EA-based
load forwarding, it is anticipated that the issue described in
#15276 can largely be resolved.
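For context, the captured-variable pattern from #15276 that this phase targets can be sketched with the classic `abmult` example from the performance tips (illustrative only; not code from this PR):

```julia
# `r` is reassigned after being captured, so lowering stores it in a
# `Core.Box` whose `contents` field is typed `Any`. Abstract
# interpretation therefore infers `Any` for `x * r`, even though
# load forwarding can later prove the load yields an `Int`.
function abmult(r::Int)
    if r < 0
        r = -r          # reassignment of a captured variable forces the Box
    end
    f = x -> x * r      # the closure reads the boxed `r` as `Any`
    return f(2)
end
```

With the post-optimization inference phase combined with EA-based load forwarding, the boxed load of `r` could be re-typed as `Int` and the call re-inferred accordingly.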


@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier
Collaborator

Your job failed.

@oscardssmith
Member

does this ensure that we don't end up inferring non-IPO-safe things in an IPO context?

Base automatically changed from avi/fix-cfg_simplify! to master November 27, 2024 05:45
@aviatesk
Member Author

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier
Collaborator

Your job failed.

@aviatesk
Member Author

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier
Collaborator

Your job failed.

@aviatesk
Member Author

@nanosoldier runbenchmarks(["inference", "allinference", "Base.init_stdio(::Ptr{Cvoid})"], vs=":master")

@aviatesk
Member Author

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk
Member Author

@nanosoldier runbenchmarks(!"scalar", vs=":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk
Member Author

@nanosoldier runbenchmarks(!"scalar", vs=":master")

@aviatesk
Member Author

does this ensure that we don't end up inferring non-IPO-safe things in an IPO context?

In the current implementation of the optimizer, non-IPO-safe optimizations are not performed, so my understanding is that re-running inference on such IPO-safe IRCode does not lead to IPO-validity issues.

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk
Member Author

aviatesk commented Jan 1, 2025

@nanosoldier runbenchmarks("collection" || "simd" || "sort", vs=":master")

aviatesk added a commit that referenced this pull request Jan 1, 2025
Performance issues related to code containing closures are frequently
discussed (e.g., #15276, #56561). Several
approaches can be considered to address this problem, one of which
involves re-inferring code containing closures (#56687).
To implement this idea, it is necessary to determine whether a given
piece of code includes a closure. However, there is currently no
explicit mechanism for making this determination (there is some code
that checks whether the function name contains `"#"` for this purpose,
but that is an ad hoc solution).

To address this, this commit lays the foundation for future
optimizations targeting closures by making closure types subtypes of
the new type `Core.Closure <: Function`. This change allows the
optimizer to apply targeted optimizations to code containing calls to
functions whose types are subtypes of `Core.Closure`.
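As a rough illustration of the problem this commit addresses (`is_closure_like` below is a hypothetical stand-in for the existing ad hoc check, not an actual function in Base):

```julia
# Today's ad hoc closure detection: lowered closure types get
# mangled names like `var"#inner#8"`, so existing code checks for
# a '#' in the type name. With `Core.Closure`, the check would
# instead become `f isa Core.Closure`.
is_closure_like(f) = occursin('#', String(nameof(typeof(f))))

function make_counter()
    n = 0
    return () -> (n += 1)   # lowered to an anonymous closure type
end
```

A named callable struct would correctly fail this name-based check, while any anonymous closure passes it.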
@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk
Member Author

aviatesk commented Jan 2, 2025

I have confirmed that there are no test failures or performance regressions. This PR is ready to be merged.

ccall(:jl_fill_codeinst, Cvoid, (
        Any, Any, Any, Any, Int32, UInt, UInt,
        UInt32, Any, Any, Any),
    ci, rettype, exctype, rettype_const, const_flags, min_world, max_world,
    encode_effects(result.ipo_effects), result.analysis_results, di, edges)
if is_cached(me)
    cached_result = cache_result!(me.interp, result, ci)
Member
I think the main question for review is whether this PR makes codegen unsound. Right here, we publish the ABI that will be used by this CodeInstance, for use by the optimizer in creating invoke edges. Any future changes to that info can result in codegen becoming unsound (as well as thread-unsafe).

ccall(:jl_update_codeinst, Cvoid, (
        Any, Any, Any, Any, Any, Int32, UInt, UInt,
        UInt32, Any, UInt8, Any, Any),
    ci, inferred_result, rettype, exctype, rettype_const, const_flags, min_world, max_world,
@vtjnash
Member

vtjnash commented Jan 2, 2025

Changing any non-atomic fields here (after cache_result) causes codegen to become unsound. I am uncertain how we would preserve the expectation that the optimizer only uses CodeInstances found in the cache, while also delaying publishing them into the cache until after this point in the code. I don't believe it to be impossible to change that (since we are holding the "engine lock" on this until after the optimizer finishes, so we can just predict which CodeInstance will get added to the cache and which are only for local use), but just tricky to figure out how to best correctly adapt the cache to this PR.

@aviatesk
Member Author

aviatesk commented Jan 3, 2025

@vtjnash Thank you for the review.

If I understand correctly, this PR has two main issues:

  1. Updating rettype or other (non-atomic) information defining the CodeInstance's ABI after cache_result!.
  2. Performing re-inference during optimization breaks the assumption that the optimizer operates against a stabilized global cache.

Regarding (1), it seems that updating only the `inferred` field may be the viable solution? (Honestly, even if the ABI changes between abstract interpretation and the optimizer, I don’t think it would cause significant issues in most cases, but I understand your concern that introducing a fundamental error could create future risks.)

As for (2), I don’t understand why the optimizer must rely on a stabilized cache. Is the concern that optimization itself might no longer guarantee termination? Even so, it might be possible to resolve the issue by preventing re-inference from performing new inferences and restricting it to use only the results available in the global cache.

@vtjnash
Member

vtjnash commented Jan 3, 2025

For (2), the main issue is just that we want the edges to normally come from the global cache, so that we don't accidentally make multiple copies of it and greatly slow down the whole system by accident. The main way we avoid this is with locks (e.g. the engine_lock) so that it can wait for in-progress work to finish at the necessary time, and a local cache to store any in-progress work.

So for both issues, we mainly just need to move the `cache_result!` call until after optimization finishes, and then provide an additional intermediate "future global cache" that the optimizer can examine.
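The suggested restructuring could be sketched roughly like this (all names and types here are hypothetical illustrations, not the actual `Core.Compiler` API):

```julia
# Hypothetical sketch of an intermediate "future global cache":
# finished but not-yet-published CodeInstances are staged locally
# while the optimizer runs, and published to the global cache in one
# step after optimization, so invoke edges still resolve without
# publishing ABI information prematurely.
struct PendingCache
    pending::IdDict{Any,Any}    # MethodInstance => staged CodeInstance
end
PendingCache() = PendingCache(IdDict{Any,Any}())

stage!(c::PendingCache, mi, ci) = (c.pending[mi] = ci; ci)

# Lookup consults staged entries first, then the global cache.
lookup(c::PendingCache, global_cache::IdDict, mi) =
    get(c.pending, mi) do
        get(global_cache, mi, nothing)
    end

function publish!(global_cache::IdDict, c::PendingCache)
    merge!(global_cache, c.pending)   # publish only after the optimizer finishes
    empty!(c.pending)
    return global_cache
end
```

Because the engine lock is held until the optimizer finishes, staging and a single post-optimization publish step would keep edge lookups consistent without duplicating in-progress work.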

aviatesk added a commit that referenced this pull request Jan 6, 2025
Performance issues related to code containing closures are frequently
discussed (e.g., #15276, #56561). Several
approaches can be considered to address this problem, one of which
involves re-inferring code containing closures (#56687).
To implement this idea, it is necessary to determine whether a given
piece of code includes a closure. However, there is currently no
explicit mechanism for making this determination (there is some code
that checks whether the function name contains `"#"` for this purpose,
but that is an ad hoc solution).

To address this, this commit lays the foundation for future
optimizations targeting closures by making closure types subtypes of
the new type `Core.Closure <: Function`. This change allows the
optimizer to apply targeted optimizations to code containing calls to
functions whose types are subtypes of `Core.Closure`.
In certain cases, the optimizer can introduce new type information.
This is particularly evident in SROA, where load forwarding can reveal
type information that was not visible during abstract interpretation.
In such cases, re-running abstract interpretation with this new type
information can be highly valuable; currently, however, this only occurs
when semi-concrete interpretation happens to be triggered.

This commit introduces a new "post-optimization inference" phase at the
end of the optimizer pipeline. When the optimizer derives new type
information, this phase performs IR abstract interpretation to further
optimize the IR.
@MasonProtter
Contributor

Interestingly, this doesn't seem to be able to catch cases where the boxed variable is the function itself (i.e. #53295)

julia> function outer()
           function inner()
               false && inner()
           end
           inner()
       end;

julia> @btime outer()
  14.758 ns (2 allocations: 32 bytes)
false

julia> @code_warntype outer()
MethodInstance for outer()
  from outer() @ Main REPL[13]:1
Arguments
  #self#::Core.Const(Main.outer)
Locals
  inner@_2::Core.Box
  inner@_3::Union{}
  inner@_4::Union{}
Body::Any
1 ─       (inner@_2 = Core.Box())
│   %2  = Main.:(var"#inner#8")::Core.Const(var"#inner#8")
│   %3  = inner@_2::Core.Box
│   %4  = %new(%2, %3)::var"#inner#8"
│   %5  = inner@_2::Core.Box
│         Core.setfield!(%5, :contents, %4)
│   %7  = inner@_2::Core.Box
│   %8  = Core.isdefined(%7, :contents)::Bool
└──       goto #3 if not %8
2 ─       goto #4
3 ─       Core.NewvarNode(:(inner@_4))
└──       inner@_4
4 ┄ %13 = inner@_2::Core.Box
│   %14 = Core.getfield(%13, :contents)::Any
│   %15 = (%14)()::Any
└──       return %15

even though this seems like it should be closely related to the equivalent cases where non-callable variables are boxed.
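For what it's worth, one workaround sketch for this self-referential case (illustrative only, not part of this PR): giving the inner function a named callable type avoids the Box entirely, since the type is known before the instance exists.

```julia
# A named callable struct has a concrete type, so the self-reference
# `i()` needs no Core.Box and the call infers concretely.
struct Inner end
(i::Inner)() = false && i()   # self-call through the concrete type

function outer2()
    inner = Inner()
    return inner()
end
```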
