Extend invoke to accept CodeInstance #56660

Merged: 1 commit merged into master, Dec 3, 2024
Conversation

Keno (Member) commented Nov 23, 2024

This is an alternative mechanism to #56650 that largely achieves the same result, but by hooking into invoke rather than a generated function. They are orthogonal mechanisms, and it's possible we want both. However, in #56650, both Jameson and Valentin were skeptical of the generated function signature bottleneck. This PR is sort of a hybrid of the mechanism in #52964 and what I proposed in #56650 (comment).

In particular, this PR:

  1. Extends invoke to support a CodeInstance in place of its usual types argument (see the sketch after this list).

  2. Adds a new typeinf optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion typeinf_edge function, allowing a mid-inference interpreter switch (like Support execution of code from external abstract interpreters #52964), without being forced through a concrete signature bottleneck. However, if calling typeinf_edge does not work (e.g. because the compiler version is mismatched), this still has well defined semantics, you just don't get inference support.
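
A rough sketch of the extended call shape (the CodeInstance lookup below is a hypothetical placeholder; in practice a compiler plugin would pull one from its own cache):

```julia
# Hedged sketch: `invoke` accepting a CodeInstance in place of the usual `types` argument.
# `lookup_codeinstance` is a hypothetical helper, not part of this PR.
ci = lookup_codeinstance(sin, Tuple{Float64})   # returns a Core.CodeInstance from a plugin cache
invoke(sin, ci, 1.0)                            # dispatches directly through `ci`
```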

The additional benefit of the typeinf optimized generic is that it lets custom cache owners tell the runtime how to "cure" code instances that have lost their native code. Currently the runtime only knows how to do that for owner == nothing CodeInstances (by re-running inference). This extension is not implemented, but the idea is that the runtime would be permitted to call the typeinf optimized generic on the dead CodeInstance's owner and def fields to obtain a cured CodeInstance (or a user-actionable error from the plugin).
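
A hypothetical sketch of that curing hook (the extension is unimplemented; only the typeinf optimized generic and the owner/def fields named above are from this PR):

```julia
# Hypothetical curing path, illustrative only. `typeinf` is the optimized generic
# introduced by this PR; the wrapper itself is not part of the PR.
function cure_dead_codeinst(dead_ci::Core.CodeInstance, source_mode::UInt8)
    # Dispatch on the plugin's owner token to re-infer the same definition and
    # obtain a fresh CodeInstance (or a user-actionable error from the plugin).
    return typeinf(dead_ci.owner, dead_ci.def, source_mode)
end
```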

This PR includes an implementation of with_new_compiler from #56650, along with just enough compiler support to make the compiler optimize it to the same code that #56650 produced:

```
julia> @code_typed with_new_compiler(sin, 1.0)
CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_get_tls_world_age), UInt64, svec(), 0, :(:ccall)))::UInt64
│   %2 =   builtin Core.getfield(args, 1)::Float64
│   %3 =    invoke sin(%2::Float64)::Float64
└──      return %3
) => Float64
```

However, the implementation here is extremely incomplete. I'm putting it up only as a directional sketch to see if people prefer it over #56650. If so, I would prepare a cleaned up version of this PR that has the optimized generics as well as the curing support, but not the full inference integration (which needs a fair bit more work).

Keno marked this pull request as draft November 23, 2024 09:12
vchuravy (Member) left a comment:

Yeah I like this direction more.

Comment on lines 15 to 17:

```julia
let typeinf_world_age = Base.tls_world_age()
    @eval @noinline typeinf(::SplitCacheOwner, mi::MethodInstance, source_mode::UInt8) =
        Base.invoke_in_world($typeinf_world_age, Compiler.typeinf_ext, SplitCacheInterp(; world=Base.tls_world_age()), mi, source_mode)
```
Member:

This is why you can't precompile it right?

I was doing something like:

```julia
const COMPILER_WORLD = Ref{UInt}(0)
function __init__()
    COMPILER_WORLD[] = Base.tls_world_age()
end
```

To set the world age post loading.
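
For example (a sketch, assuming the SplitCache definitions from this PR), the entry point could then read the world from that Ref instead of baking in a literal:

```julia
# Sketch (assumption, not code from this PR): same definition as above, but the
# compiler world comes from the Ref populated in __init__, so no world-age literal
# is embedded in the method body and the pattern stays precompile-safe.
@noinline typeinf(::SplitCacheOwner, mi::MethodInstance, source_mode::UInt8) =
    Base.invoke_in_world(COMPILER_WORLD[], Compiler.typeinf_ext,
        SplitCacheInterp(; world=Base.tls_world_age()), mi, source_mode)
```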

Keno (Member, Author):

I was hoping to tie it to the definition world of the method, but that needs some codegen work to make it fast.

Keno (Member, Author) commented Nov 26, 2024

Alright, both @vtjnash and @vchuravy prefer this version, so the game plan is:

  1. Rebase this on top of 9d81f63
  2. Take out the special case inference support for now (to be added back in a later PR)
  3. Clean it up, and get it merged as soon as possible.

aviatesk (Member):

> 2. Adds a new typeinf optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion typeinf_edge function, allowing a mid-inference interpreter switch (like Support execution of code from external abstract interpreters #52964), without being forced through a concrete signature bottleneck.

Does this mean that propagating the compiler plugin context for a specific scope would involve performing a "Cassette"-like transformation, but by making the transformed source call CompilerPlugin.typeinf(interp_for_ctx, ...), we can achieve sufficient inference support while also propagating the compiler plugin context?

Keno (Member, Author) commented Nov 29, 2024

> Does this mean that propagating the compiler plugin context for a specific scope would involve performing a "Cassette"-like transformation, but by making the transformed source call CompilerPlugin.typeinf(interp_for_ctx, ...), we can achieve sufficient inference support while also propagating the compiler plugin context?

That's the idea, yes.
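
Roughly, a self-contained toy sketch of that shape (every name below is a hypothetical stand-in, not this PR's API):

```julia
# Hypothetical stand-ins for a compiler plugin; none of these names are from the PR.
struct MyPluginOwner end

# Stub for the plugin's `typeinf` optimized generic: a real plugin would run its own
# abstract interpreter here and return a CodeInstance from its cache.
plugin_typeinf(::MyPluginOwner, f, argtypes) = nothing

# A Cassette-like pass would rewrite calls inside the plugin's scope into this shape,
# so the callee is compiled through the plugin and then entered via `invoke` with the
# resulting CodeInstance, keeping the plugin context in the loop.
function overdub(owner::MyPluginOwner, f, args...)
    ci = plugin_typeinf(owner, f, typeof.(args))
    return ci === nothing ? f(args...) : invoke(f, ci, args...)
end

overdub(MyPluginOwner(), sin, 1.0)  # with the stub above this just falls back to sin(1.0)
```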

Keno changed the title from "Extremely WIP/RFC: Extend invoke to accept CodeInstance" to "Extend invoke to accept CodeInstance" Nov 29, 2024
Keno marked this pull request as ready for review November 29, 2024 03:01
Commit message:

This is an alternative mechanism to #56650 that largely achieves
the same result, but by hooking into `invoke` rather than a generated
function. They are orthogonal mechanisms, and it's possible we want both.
However, in #56650, both Jameson and Valentin were skeptical of the
generated function signature bottleneck. This PR is sort of a hybrid
of mechanism in #52964 and what I proposed in #56650 (comment).

In particular, this PR:

1. Extends `invoke` to support a CodeInstance in place of its usual
   `types` argument.

2. Adds a new `typeinf` optimized generic. The semantics of this optimized
   generic allow the compiler to instead call a companion `typeinf_edge`
   function, allowing a mid-inference interpreter switch (like #52964),
   without being forced through a concrete signature bottleneck. However,
   if calling `typeinf_edge` does not work (e.g. because the compiler
   version is mismatched), this still has well defined semantics, you
   just don't get inference support.

The additional benefit of the `typeinf` optimized generic is that it lets
custom cache owners tell the runtime how to "cure" code instances that
have lost their native code. Currently the runtime only knows how to
do that for `owner == nothing` CodeInstances (by re-running inference).
This extension is not implemented, but the idea is that the runtime would
be permitted to call the `typeinf` optimized generic on the dead
CodeInstance's `owner` and `def` fields to obtain a cured CodeInstance (or
a user-actionable error from the plugin).

This PR includes an implementation of `with_new_compiler` from #56650.

That said, this PR does not yet include the compiler optimization that
implements the semantics of the optimized generic, which will be in a
follow up PR.
Keno (Member, Author) commented Nov 29, 2024

> Alright, both @vtjnash and @vchuravy prefer this version, so the game plan is:

Updated as described.

Keno merged commit efa917e into master Dec 3, 2024
7 checks passed
Keno deleted the kf/ciinvoke branch December 3, 2024 01:28
```julia
    edge::CodeInstance
end
add_edges_impl(edges::Vector{Any}, info::InvokeCICallInfo) =
    add_one_edge!(edges, info.edge)
```
Member:

This adds a dispatch edge, which is wrong for invoke calls

Keno (Member, Author):

Hmm, it does appear to do what I want, which is to invalidate the call if the target CodeInstance gets its bounds capped, because we're reading that bound information in inference. What edge type are you suggesting?

Member:

Theoretically that is a recursion-only edge, which we do not have the ability to represent precisely, but the function to add them is called add_inlining_edge!

src/builtins.c (resolved review thread)
```c
    jl_error("invoke: CodeInstance not valid for this world");
}
if (!invoke) {
    jl_compile_codeinst(codeinst);
```
Member:

I don't think this always compiles code correctly, since normally the runtime expects that jl_compile_method_internal pre- and post-sanitizes the input

Keno (Member, Author):

Could you be more specific? Are you concerned that there are CodeInstances on which this isn't legal to call? If so, I think we should flag them separately.

Member:

I assume it normally expects the jl_normalize_to_compilable_mi heuristics, as well as respecting the user-commanded compile_option, and finally handles fallbacks (which I think are still implemented here, just later), and then hopefully also populates the cache so that it doesn't end up repeating those costly steps every time if compilation didn't directly succeed. Probably nothing immediately crucial here, just a later opportunity to make this more robust and consistent for the user.

src/builtins.c (resolved review thread)
```julia
        Base.invoke_in_world(which(typeinf, Tuple{SplitCacheOwner, MethodInstance, UInt8}).primary_world, Compiler.typeinf_ext, SplitCacheInterp(; world=Base.tls_world_age()), mi, source_mode)

@eval @noinline function typeinf_edge(::SplitCacheOwner, mi::MethodInstance, parent_frame::Compiler.InferenceState, world::UInt, source_mode::UInt8)
    # TODO: This isn't quite right, we're just sketching things for now
```
Member:

This comment could use some expanded explanation. Why are we merging something that is wrong: is it merely incomplete, or are we saying this would be unsound to use?

Keno (Member, Author):

I've removed the pieces of this PR that do anything with the optimized generics, but I've kept the definitions, because they need to be in Core. That way an upgradable compiler stdlib can start making them do something.

```julia
function typeinf end

"""
    typeinf_edge(owner, mi, parent_frame, world, abi_mode)::CodeInstance
```
Member:

This seems like it could use a different name, since it doesn't appear to be compatible with the behavior of the existing Compiler.typeinf_edge function.

Keno (Member, Author):

Happy to change the name. typeinf_tfunc, compiletime_typeinf?

Member:

typeinf_tfunc sounds more reasonable to me.

Comment on lines +28 to +29:

```julia
@eval @noinline typeinf(::SplitCacheOwner, mi::MethodInstance, source_mode::UInt8) =
    Base.invoke_in_world(which(typeinf, Tuple{SplitCacheOwner, MethodInstance, UInt8}).primary_world, Compiler.typeinf_ext, SplitCacheInterp(; world=Base.tls_world_age()), mi, source_mode)
```
Member:

I don't see how this invoke_in_world could ever be useful, and only how it could be harmful. Either this function got successfully called, in which case the world is apparently already correct for calling it, or the caller failed to run this in the correct world, in which case the world error should have happened before the call could reach here.

Keno (Member, Author):

The objective is to switch to a fixed world for invalidation avoidance, while also being compatible with precompilation. Happy to take alternative suggestions for how to accomplish this.

```julia
function activate_codegen!()
    ccall(:jl_set_typeinf_func, Cvoid, (Any,), typeinf_ext_toplevel)
    Core.eval(Compiler, quote
        let typeinf_world_age = Base.tls_world_age()
```
Member:

The value returned from this function is not a legal constant to embed into IR (it is only a runtime value and can change meaning over time and precompile), and so this can lead to incorrect behavior

Keno (Member, Author):

This definition is not meant to survive through package precompilation - only sysimg. It's equivalent to the current mechanism for switching to inference world age. That said, if we can come up with a better precompile safe pattern (see above), I'm happy to use it here as well.

Member:

Responding to all 3 comments at once here: I think the main question to answer is just what the API demo looks like, which may then directly provide the answer as to what to call it, which world is relevant to it, and so on. I think we either need to try to completely merge that within the next 2 weeks, or delete the small stub code in Core (just for the release) so that we aren't committing to the API before it does anything useful and can be tested. Hopefully we can do the former, though.

Keno added commits that referenced this pull request on Dec 3–6, 2024:

Some more questions still to be answered, but this should address the
immediate actionable review items.
stevengj pushed commits that referenced this pull request on Jan 2, 2025, with the same commit messages as above.