Extend invoke to accept CodeInstance #56660

Merged: 1 commit merged into master, Dec 3, 2024
Conversation

Keno (Member) commented Nov 23, 2024

This is an alternative mechanism to #56650 that largely achieves the same result, but by hooking into invoke rather than a generated function. They are orthogonal mechanisms, and it's possible we want both. However, in #56650, both Jameson and Valentin were skeptical of the generated function signature bottleneck. This PR is sort of a hybrid of the mechanism in #52964 and what I proposed in #56650 (comment).

In particular, this PR:

  1. Extends invoke to support a CodeInstance in place of its usual types argument (see the sketch after this list).

  2. Adds a new typeinf optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion typeinf_edge function, allowing a mid-inference interpreter switch (like Support execution of code from external abstract interpreters #52964), without being forced through a concrete signature bottleneck. However, if calling typeinf_edge does not work (e.g. because the compiler version is mismatched), this still has well defined semantics, you just don't get inference support.
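
A rough sketch of the extended call shape (the CodeInstance lookup below is a hypothetical placeholder; in practice a compiler plugin would pull one from its own cache):

```julia
# Hedged sketch: `invoke` accepting a CodeInstance in place of the usual `types` argument.
# `lookup_codeinstance` is a hypothetical helper, not part of this PR.
ci = lookup_codeinstance(sin, Tuple{Float64})   # returns a Core.CodeInstance from a plugin cache
invoke(sin, ci, 1.0)                            # dispatches directly through `ci`
```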

The additional benefit of the typeinf optimized generic is that it lets custom cache owners tell the runtime how to "cure" code instances that have lost their native code. Currently the runtime only knows how to do that for owner == nothing CodeInstances (by re-running inference). This extension is not implemented, but the idea is that the runtime would be permitted to call the typeinf optimized generic on the dead CodeInstance's owner and def fields to obtain a cured CodeInstance (or a user-actionable error from the plugin).
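
A hypothetical sketch of that curing hook (the extension is unimplemented; only the typeinf optimized generic and the owner/def fields named above are from this PR):

```julia
# Hypothetical curing path, illustrative only. `typeinf` is the optimized generic
# introduced by this PR; the wrapper itself is not part of the PR.
function cure_dead_codeinst(dead_ci::Core.CodeInstance, source_mode::UInt8)
    # Dispatch on the plugin's owner token to re-infer the same definition and
    # obtain a fresh CodeInstance (or a user-actionable error from the plugin).
    return typeinf(dead_ci.owner, dead_ci.def, source_mode)
end
```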

This PR includes an implementation of with_new_compiler from #56650, along with just enough compiler support to make the compiler optimize it to the same code that #56650 produced:

```
julia> @code_typed with_new_compiler(sin, 1.0)
CodeInfo(
1 ─      $(Expr(:foreigncall, :(:jl_get_tls_world_age), UInt64, svec(), 0, :(:ccall)))::UInt64
│   %2 =   builtin Core.getfield(args, 1)::Float64
│   %3 =    invoke sin(%2::Float64)::Float64
└──      return %3
) => Float64
```

However, the implementation here is extremely incomplete. I'm putting it up only as a directional sketch to see if people prefer it over #56650. If so, I would prepare a cleaned up version of this PR that has the optimized generics as well as the curing support, but not the full inference integration (which needs a fair bit more work).

Keno marked this pull request as draft November 23, 2024 09:12
vchuravy (Member) left a comment:

Yeah I like this direction more.

Comment on lines 15 to 17:

```julia
let typeinf_world_age = Base.tls_world_age()
    @eval @noinline typeinf(::SplitCacheOwner, mi::MethodInstance, source_mode::UInt8) =
        Base.invoke_in_world($typeinf_world_age, Compiler.typeinf_ext, SplitCacheInterp(; world=Base.tls_world_age()), mi, source_mode)
```
Member:

This is why you can't precompile it right?

I was doing something like:

```julia
const COMPILER_WORLD = Ref{UInt}(0)
function __init__()
    COMPILER_WORLD[] = Base.tls_world_age()
end
```

To set the world age post loading.
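
For example (a sketch, assuming the SplitCache definitions from this PR), the entry point could then read the world from that Ref instead of baking in a literal:

```julia
# Sketch (assumption, not code from this PR): same definition as above, but the
# compiler world comes from the Ref populated in __init__, so no world-age literal
# is embedded in the method body and the pattern stays precompile-safe.
@noinline typeinf(::SplitCacheOwner, mi::MethodInstance, source_mode::UInt8) =
    Base.invoke_in_world(COMPILER_WORLD[], Compiler.typeinf_ext,
        SplitCacheInterp(; world=Base.tls_world_age()), mi, source_mode)
```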

Keno (Member, Author):

I was hoping to tie it to the definition world of the method, but that needs some codegen work to make it fast.

Keno (Member, Author) commented Nov 26, 2024

Alright, both @vtjnash and @vchuravy prefer this version, so the game plan is:

  1. Rebase this on top of 9d81f63
  2. Take out the special case inference support for now (to be added back in a later PR)
  3. Clean it up, and get it merged as soon as possible.

aviatesk (Member):

> 2. Adds a new typeinf optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion typeinf_edge function, allowing a mid-inference interpreter switch (like Support execution of code from external abstract interpreters #52964), without being forced through a concrete signature bottleneck.

Does this mean that propagating the compiler plugin context for a specific scope would involve performing a "Cassette"-like transformation, but by making the transformed source call CompilerPlugin.typeinf(interp_for_ctx, ...), we can achieve sufficient inference support while also propagating the compiler plugin context?

Keno (Member, Author) commented Nov 29, 2024

> Does this mean that propagating the compiler plugin context for a specific scope would involve performing a "Cassette"-like transformation, but by making the transformed source call CompilerPlugin.typeinf(interp_for_ctx, ...), we can achieve sufficient inference support while also propagating the compiler plugin context?

That's the idea, yes.
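
Roughly, a self-contained toy sketch of that shape (every name below is a hypothetical stand-in, not this PR's API):

```julia
# Hypothetical stand-ins for a compiler plugin; none of these names are from the PR.
struct MyPluginOwner end

# Stub for the plugin's `typeinf` optimized generic: a real plugin would run its own
# abstract interpreter here and return a CodeInstance from its cache.
plugin_typeinf(::MyPluginOwner, f, argtypes) = nothing

# A Cassette-like pass would rewrite calls inside the plugin's scope into this shape,
# so the callee is compiled through the plugin and then entered via `invoke` with the
# resulting CodeInstance, keeping the plugin context in the loop.
function overdub(owner::MyPluginOwner, f, args...)
    ci = plugin_typeinf(owner, f, typeof.(args))
    return ci === nothing ? f(args...) : invoke(f, ci, args...)
end

overdub(MyPluginOwner(), sin, 1.0)  # with the stub above this just falls back to sin(1.0)
```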

Keno changed the title from "Extremely WIP/RFC: Extend invoke to accept CodeInstance" to "Extend invoke to accept CodeInstance" Nov 29, 2024
Keno marked this pull request as ready for review November 29, 2024 03:01
Commit message:

This is an alternative mechanism to #56650 that largely achieves
the same result, but by hooking into `invoke` rather than a generated
function. They are orthogonal mechanisms, and it's possible we want both.
However, in #56650, both Jameson and Valentin were skeptical of the
generated function signature bottleneck. This PR is sort of a hybrid
of mechanism in #52964 and what I proposed in #56650 (comment).

In particular, this PR:

1. Extends `invoke` to support a CodeInstance in place of its usual
   `types` argument.

2. Adds a new `typeinf` optimized generic. The semantics of this optimized
   generic allow the compiler to instead call a companion `typeinf_edge`
   function, allowing a mid-inference interpreter switch (like #52964),
   without being forced through a concrete signature bottleneck. However,
   if calling `typeinf_edge` does not work (e.g. because the compiler
   version is mismatched), this still has well defined semantics, you
   just don't get inference support.

The additional benefit of the `typeinf` optimized generic is that it lets
custom cache owners tell the runtime how to "cure" code instances that
have lost their native code. Currently the runtime only knows how to
do that for `owner == nothing` CodeInstances (by re-running inference).
This extension is not implemented, but the idea is that the runtime would
be permitted to call the `typeinf` optimized generic on the dead
CodeInstance's `owner` and `def` fields to obtain a cured CodeInstance (or
a user-actionable error from the plugin).

This PR includes an implementation of `with_new_compiler` from #56650.

That said, this PR does not yet include the compiler optimization that
implements the semantics of the optimized generic, which will be in a
follow up PR.
Keno (Member, Author) commented Nov 29, 2024

> Alright, both @vtjnash and @vchuravy prefer this version, so the game plan is:

Updated as described.

Keno merged commit efa917e into master Dec 3, 2024
7 checks passed
Keno deleted the kf/ciinvoke branch December 3, 2024 01:28
```julia
    edge::CodeInstance
end
add_edges_impl(edges::Vector{Any}, info::InvokeCICallInfo) =
    add_one_edge!(edges, info.edge)
```
Member:

This adds a dispatch edge, which is wrong for invoke calls

Keno (Member, Author):

Hmm, it does appear to do what I want, which is to invalidate the call if the target CodeInstance gets its bounds capped, because we're reading that bound information in inference. What edge type are you suggesting?

Member:

Theoretically that is a recursion-only edge, which we do not have the ability to represent precisely, but the function to add them is called add_inlining_edge!

src/builtins.c (resolved review thread)
```c
    jl_error("invoke: CodeInstance not valid for this world");
}
if (!invoke) {
    jl_compile_codeinst(codeinst);
```
Member:

I don't think this always compiles code correctly, since normally the runtime expects that jl_compile_method_internal pre- and post-sanitizes the input

Keno (Member, Author):

Could you be more specific? Are you concerned that there are CodeInstances on which this isn't legal to call? If so, I think we should flag them separately.

Member:

I assume it normally expects the jl_normalize_to_compilable_mi heuristics, as well as respecting the user-commanded compile_option, and finally handles fallbacks (which I think are still implemented here, just later), and then hopefully also populates the cache so that it doesn't end up repeating those costly steps every time if compilation didn't directly succeed. Probably nothing immediately crucial here, just a later opportunity to make this more robust and consistent for the user.

src/builtins.c (resolved review thread)
```julia
        Base.invoke_in_world(which(typeinf, Tuple{SplitCacheOwner, MethodInstance, UInt8}).primary_world, Compiler.typeinf_ext, SplitCacheInterp(; world=Base.tls_world_age()), mi, source_mode)

@eval @noinline function typeinf_edge(::SplitCacheOwner, mi::MethodInstance, parent_frame::Compiler.InferenceState, world::UInt, source_mode::UInt8)
    # TODO: This isn't quite right, we're just sketching things for now
```
Member:

This comment could use some expanded explanation. Why are we merging something that is wrong: is it merely incomplete, or are we saying this would be unsound to use?

Keno (Member, Author):

I've removed the pieces of this PR that do anything with the optimized generics, but I've kept the definitions, because they need to be in Core. That way an upgradable compiler stdlib can start making them do something.

```julia
function typeinf end

"""
    typeinf_edge(owner, mi, parent_frame, world, abi_mode)::CodeInstance
```
Member:

This seems like it could use a different name, since it doesn't appear to be compatible with the behavior of the existing Compiler.typeinf_edge function.

Keno (Member, Author):

Happy to change the name. typeinf_tfunc, compiletime_typeinf?

Member:

typeinf_tfunc sounds more reasonable to me.

Comment on lines +28 to +29:

```julia
@eval @noinline typeinf(::SplitCacheOwner, mi::MethodInstance, source_mode::UInt8) =
    Base.invoke_in_world(which(typeinf, Tuple{SplitCacheOwner, MethodInstance, UInt8}).primary_world, Compiler.typeinf_ext, SplitCacheInterp(; world=Base.tls_world_age()), mi, source_mode)
```
Member:

I don't see how this invoke_in_world could ever be useful, and only how it could be harmful. Either this function got successfully called, in which case the world is apparently already correct for calling it, or the caller failed to run this in the correct world, in which case the world error should have happened before the call could reach here.

Keno (Member, Author):

The objective is to switch to a fixed world for invalidation avoidance, while also being compatible with precompilation. Happy to take alternative suggestions for how to accomplish this.

```julia
function activate_codegen!()
    ccall(:jl_set_typeinf_func, Cvoid, (Any,), typeinf_ext_toplevel)
    Core.eval(Compiler, quote
        let typeinf_world_age = Base.tls_world_age()
```
Member:

The value returned from this function is not a legal constant to embed into IR (it is only a runtime value and can change meaning over time and precompile), and so this can lead to incorrect behavior

Keno (Member, Author):

This definition is not meant to survive through package precompilation - only sysimg. It's equivalent to the current mechanism for switching to inference world age. That said, if we can come up with a better precompile safe pattern (see above), I'm happy to use it here as well.

Member:

Responding to all 3 comments at once here: I think the main question to answer is just what the API demo looks like, which may then directly provide the answer as to what to call it, which world is relevant to it, and so on. I think we either need to try to completely merge that within the next 2 weeks, or delete the small stub code in Core (just for the release) so that we aren't committing to the API before it does anything useful and can be tested. Hopefully we can do the former, though.

Keno added commits that referenced this pull request on Dec 3–6, 2024:

Some more questions still to be answered, but this should address the
immediate actionable review items.
stevengj pushed commits that referenced this pull request on Jan 2, 2025, with the same commit messages as above.