Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify interface for LibGit2 abstract types #36452

Merged
merged 2 commits into from
Jul 13, 2020
Merged

Conversation

timholy
Copy link
Sponsor Member

@timholy timholy commented Jun 27, 2020

This improves inference results for a number LibGit2 operations, and reduces invalidations from methods (notably, peel) that operate on the values extracted from these fields.

@timholy timholy added compiler:latency Compiler latency libgit2 The libgit2 library or the LibGit2 stdlib module labels Jun 27, 2020
This improves inference results for a number LibGit2 operations,
and reduces invalidations from methods (notably, `peel`) that operate
on the values extracted from these fields.
Copy link
Member

@omus omus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure why this is necessary to help out inference. The change itself is compatible with all existing LibGit2 AbstractGitObject subtypes

else
return getfield(obj, name)
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a comment should be added explaining that this helps inference as on its own the function looks redundant

@timholy
Copy link
Sponsor Member Author

timholy commented Jul 2, 2020

Good point. This came up in the invalidation hunt after I'd been doing it a while, and I guess I got overly brief. Here's the issue: sometimes, the only thing inference knows is that x isa GitObject---this can come about by removing something from an abstact container, dereferencing a C pointer whose concrete type is not known in advance, or @nospecialize annotations deliberately causing inference to avoid specialization. In those circumstances, inference has no idea what kind of object x.ptr or x.owner return: as a conquence, the type gets inferred as Any, and then when you pass it as an argument to calls then you open up a possibility for invalidation by new method definitions. This contributes to compiler latency, because it forces Julia to recompile methods it's already previously compiled.

Specifically, in this case here's what happens: Pkg's update_registries gets built into our system image. This ultimately requires LibGit2.GitAnnotated(::GitRepo, ::String):

function GitAnnotated(repo::GitRepo, committish::AbstractString)
obj = GitObject(repo, committish)
cmt = peel(GitCommit, obj)
return GitAnnotated(repo, GitHash(cmt))
end

As you might imagine, inference doesn't know the precise type of obj, just that it's a GitObject. So now let's look at what happens in the call to peel:

function peel(::Type{T}, obj::GitObject) where T<:GitObject
ensure_initialized()
new_ptr_ptr = Ref{Ptr{Cvoid}}(C_NULL)
@check ccall((:git_object_peel, :libgit2), Cint,
(Ptr{Ptr{Cvoid}}, Ptr{Cvoid}, Cint), new_ptr_ptr, obj.ptr, Consts.OBJECT(T))
return T(obj.owner, new_ptr_ptr[])
end

In detail,

julia> code_warntype(LibGit2.peel, (Type{LibGit2.GitCommit}, LibGit2.GitObject))
Variables
  #self#::Core.Compiler.Const(LibGit2.peel, false)
  #unused#::Core.Compiler.Const(LibGit2.GitCommit, false)
  obj::LibGit2.GitObject
  new_ptr_ptr::Base.RefValue{Ptr{Nothing}}
  err::Int32

Body::LibGit2.GitCommit
1 ─       LibGit2.ensure_initialized()
│   %2  = Core.apply_type(LibGit2.Ptr, LibGit2.Cvoid)::Core.Compiler.Const(Ptr{Nothing}, false)
│   %3  = Core.apply_type(LibGit2.Ref, %2)::Core.Compiler.Const(Ref{Ptr{Nothing}}, false)
│         (new_ptr_ptr = (%3)(LibGit2.C_NULL))
│   %5  = Core.apply_type(LibGit2.Ptr, LibGit2.Cvoid)::Core.Compiler.Const(Ptr{Nothing}, false)
│   %6  = Core.apply_type(LibGit2.Ptr, %5)::Core.Compiler.Const(Ptr{Ptr{Nothing}}, false)
│   %7  = Base.cconvert(%6, new_ptr_ptr)::Base.RefValue{Ptr{Nothing}}%8  = Core.apply_type(LibGit2.Ptr, LibGit2.Cvoid)::Core.Compiler.Const(Ptr{Nothing}, false)
│   %9  = Base.getproperty(obj, :ptr)::Any%10 = Base.cconvert(%8, %9)::Any%11 = LibGit2.Consts.OBJECT::Core.Compiler.Const(LibGit2.Consts.OBJECT, false)
│   %12 = (%11)($(Expr(:static_parameter, 1)))::Core.Compiler.Const(LibGit2.Consts.OBJ_COMMIT, false)
│   %13 = Base.cconvert(LibGit2.Cint, %12)::Core.Compiler.Const(1, false)
│   %14 = Core.apply_type(LibGit2.Ptr, LibGit2.Cvoid)::Core.Compiler.Const(Ptr{Nothing}, false)
│   %15 = Core.apply_type(LibGit2.Ptr, %14)::Core.Compiler.Const(Ptr{Ptr{Nothing}}, false)
│   %16 = Base.unsafe_convert(%15, %7)::Ptr{Ptr{Nothing}}%17 = Core.apply_type(LibGit2.Ptr, LibGit2.Cvoid)::Core.Compiler.Const(Ptr{Nothing}, false)
│   %18 = Base.unsafe_convert(%17, %10)::Any%19 = Base.unsafe_convert(LibGit2.Cint, %13)::Core.Compiler.Const(1, false)
│   %20 = $(Expr(:foreigncall, (:git_object_peel, :libgit2), Int32, svec(Ptr{Ptr{Nothing}}, Ptr{Nothing}, Int32), 0, :(:ccall), :(%16), :(%18), :(%19), :(%13), :(%10), :(%7)))::Int32
│         (err = LibGit2.Cint(%20))
│   %22 = (err < 0)::Bool
└──       goto #3 if not %22
2%24 = LibGit2.Error.GitError::Core.Compiler.Const(LibGit2.Error.GitError, false)
│   %25 = (%24)(err)::LibGit2.Error.GitError
└──       LibGit2.throw(%25)
3 ┄       err
│   %28 = Base.getproperty(obj, :owner)::Any%29 = Base.getindex(new_ptr_ptr)::Ptr{Nothing}%30 = ($(Expr(:static_parameter, 1)))(%28, %29)::LibGit2.GitCommit
└──       return %30

You can see that the call (line 9) Base.getproperty(obj, :ptr)::Any is of unknown return type. The next thing it does call cconvert on it, which introduces a backedge from LibGit2.peel to cconvert(::Type{Ptr{Cvoid}}, ::Any). This convert is "useless" because it already has that type, but the compiler doesn't know that so it inserts it defensively.

Why is that a problem? Because StaticArrays defines this method: https://github.com/JuliaArrays/StaticArrays.jl/blob/f972d162dc3a8810d2a188feda2a737201fe8790/src/FieldArray.jl#L123

(which it's well within its rights to do). Julia notices that this method would, if applicable, supersede cconvert(::Type{Ptr{Cvoid}}, ::Any) and thus it invalidates the version of peel that it compiled, so that if anything calls it again later then it will do what is now considered "the right thing." But this, in turn, invalidates some of Pkg's code, and so the next time it has to do something like update its registry, it has to recompile the whole call chain.

We can fix that if inference knows that Base.getproperty(obj::GitObject, :ptr) already returns a Ptr{Cvoid}, because then it knows exactly which method of cconvert to call and hence never adds that backedge to cconvert(::Type{Ptr{Cvoid}}, ::Any).

Does that answer your question 😄? Perhaps the length of this explanation excuses some of my brevity at the beginning, but I should have said more. What fraction of this would you like in a comment? Or we could just add a comment referring to this PR?

@codecov-commenter
Copy link

Codecov Report

Merging #36452 into master will increase coverage by 0.03%.
The diff coverage is 88.20%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #36452      +/-   ##
==========================================
+ Coverage   86.12%   86.16%   +0.03%     
==========================================
  Files         349      349              
  Lines       65408    65659     +251     
==========================================
+ Hits        56335    56574     +239     
- Misses       9073     9085      +12     
Impacted Files Coverage Δ
base/abstractarraymath.jl 94.81% <ø> (ø)
base/abstractdict.jl 91.85% <ø> (ø)
base/abstractset.jl 98.82% <ø> (ø)
base/accumulate.jl 95.37% <ø> (ø)
base/array.jl 95.55% <ø> (ø)
base/arraymath.jl 91.30% <ø> (ø)
base/bool.jl 100.00% <ø> (ø)
base/broadcast.jl 85.61% <ø> (ø)
base/combinatorics.jl 97.79% <ø> (ø)
base/compiler/types.jl 70.00% <0.00%> (-17.50%) ⬇️
... and 117 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9d71d37...b609aa4. Read the comment docs.

@timholy
Copy link
Sponsor Member Author

timholy commented Jul 8, 2020

Did this answer your question, @omus?

I'll add that this is essentially an implementation of #4935. See also another example in #36323.

@omus
Copy link
Member

omus commented Jul 8, 2020

@timholy that definitely answers my question. If you just a comment mentioning the getproperty methods exist to help inference and have a link to this PR I think we're in great shape

@timholy timholy merged commit 78bd857 into master Jul 13, 2020
@timholy timholy deleted the teh/libgit_abstract branch July 13, 2020 20:23
simeonschaub pushed a commit to simeonschaub/julia that referenced this pull request Aug 11, 2020
This improves inference results for a number LibGit2 operations,
and reduces invalidations from methods (notably, `peel`) that operate
on the values extracted from these fields.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:latency Compiler latency libgit2 The libgit2 library or the LibGit2 stdlib module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants