Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

invert linetable representation #52415

Merged
merged 1 commit into from
Mar 11, 2024
Merged

invert linetable representation #52415

merged 1 commit into from
Mar 11, 2024

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Dec 5, 2023

Previously, our linetables (similar to LLVM) represented line information as a linked list from callee via inlined_at up to the original information. This requires many copies of this information to be created. Instead we can take advantage of the necessary existence of the line table from the child to flip this chain of information and instead make each statement be a table describing (for each IR instruction):
(current line number, (index into edges, index into edges statements)) plus a table of all edges, plus a table with the original line numbers from the parser, plus the file name. This is all packed into the struct

struct DebugInfo
    def::Union{Method,MethodInstance,Symbol}
    linetable::Union{Nothing,DebugInfo}
    edges::SimpleVector{DebugInfo}
    codelocs::String
end

Which is described in doc/src/devdocs/ast.md for what each field means and look at stacktraces.jl or compiler/ssair/show.jl to look at how to decode and interpret this information.

For the sysimage, this saves several megabytes (about 113 MB -> 110 MB) and about 5% of the stdlib pkgimages (294 MB -> 279 MB).

It also now happens to have the full type information for the inlined functions. Now if you create an IRShow.DILineInfoPrinter with showtypes=true, it can print that information when printing IR.

Opening as draft, although it is hopefully ready for review, since it appears there will be an update needed to keep Revise working with the new format for this info.

julia> @eval Base.IRShow DILineInfoPrinter(debuginfo, def) = DILineInfoPrinter(debuginfo, def, true)
DILineInfoPrinter (generic function with 2 methods)

julia> (@code_typed 1 + 1.0)[1]
CodeInfo(
    @ promotion.jl:425 within `+`
   ┌ invoke MethodInstance for promote(::Int64, ::Float64)
   │ @ promotion.jl:396 within `promote`
   │┌ invoke MethodInstance for Base._promote(::Int64, ::Float64)
   ││ @ promotion.jl:373 within `_promote`
   ││┌ invoke MethodInstance for convert(::Type{Float64}, ::Int64)
   │││ @ number.jl:7 within `convert`
   │││┌ invoke MethodInstance for Float64(::Int64)
   ││││ @ float.jl:221 within `Float64`
1 ─││││ %1 = Base.sitofp(Float64, x)::Float64
│  └└└└
│  ┌ invoke MethodInstance for +(::Float64, ::Float64)
│  │ @ float.jl:460 within `+`
│  │ %2 = Base.add_float(%1, y)::Float64
│  └
└──      return %2
)

@vtjnash vtjnash requested a review from vchuravy December 5, 2023 22:23
@vchuravy vchuravy requested a review from timholy December 5, 2023 22:29
@vtjnash vtjnash force-pushed the jn/invert-linetable branch 2 times, most recently from 019dc8e to ad70b13 Compare December 12, 2023 21:48
@JeffBezanson
Copy link
Member

I like this; feels like a real improvement. Just need to figure out some details and how much we care about compatibility of code that handles IR.

@vtjnash
Copy link
Member Author

vtjnash commented Dec 21, 2023

compatibility of code that handles IR.

Thanks! As long as code has been using the Compiler APIs, they should not be much affected. It is rare for code to need to "invent" a novel line number. Inlining does it in some specific ways, but generally code should not do it otherwise, as lines should come from the user, which is what I wanted to better reflect here, compared to what I felt that inlined_at had done previously.

Copy link
Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't pretend to have read the whole thing carefully, but the overall concept seems great. Merely having the type info available for inlining is a huge win on its own. I'll certainly make any needed changes to the Revise stack when this gets merged.

My comments are mostly about the clarity of the documentation.

doc/src/devdocs/ast.md Show resolved Hide resolved
doc/src/devdocs/ast.md Outdated Show resolved Hide resolved
doc/src/devdocs/ast.md Show resolved Hide resolved
doc/src/devdocs/ast.md Outdated Show resolved Hide resolved
doc/src/devdocs/ast.md Outdated Show resolved Hide resolved
doc/src/devdocs/ast.md Outdated Show resolved Hide resolved
doc/src/devdocs/ast.md Show resolved Hide resolved
doc/src/devdocs/ast.md Outdated Show resolved Hide resolved
doc/src/devdocs/ast.md Outdated Show resolved Hide resolved
@vtjnash
Copy link
Member Author

vtjnash commented Jan 9, 2024

I was testing Revise on this branch earlier to see if I needed to make a PR there, and I think this PR already resolved a lot of the issues with it: JuliaDebug/JuliaInterpreter.jl#606. Looks like there are a few other failing test now (since it type-asserts the kind of something and doesn't know about the new Base.IRShow.LineInfoNode). They look minor, but let me know if something isn't clear.

@timholy
Copy link
Member

timholy commented Jan 9, 2024

When we get a better way of tracking "provenance" (e.g., #31162) through lowering, will this make it easier to extend it all the way through the IRCode representation? It seems plausible to me that it will. Someday we should write an IRCode interpreter, I expect that to be much faster than any interpreter we have now. But it's kinda useless if you can't track whatever you're evaluating back to the original source code.

@vtjnash
Copy link
Member Author

vtjnash commented Jan 9, 2024

This already does that in fact. Every statement refers back to the IR statement it was originally from, including the inlining info, instead of the line numbers (only the original statement then has the line number). It isn't one-to-one precise though (as various transforms may combine or delete statements), and doesn't alter lowering to track it back to the AST.

@timholy
Copy link
Member

timholy commented Jan 9, 2024

Right. All that sounds great. Looking forward to this and to getting similar things done for lowering too---it will expand the tooling options considerably.

fldarray = getfield(getfield(node, :data), fld)
fldidx = getfield(node, :idx)
if fld === :line
(fldarray[3fldidx-2], fldarray[3fldidx-1], fldarray[3fldidx-0]) = val::NTuple{3,Int32}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth giving these (line, edge, edge_line) triplets a name?

I had to do a fair bit of digging to figure out how to interpret them

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that covered in the documentation here already?

@vtjnash
Copy link
Member Author

vtjnash commented Mar 5, 2024

@nanosoldier runbenchmarks("inference", vs=":master")

aviatesk added a commit to JuliaDebug/Cthulhu.jl that referenced this pull request Mar 19, 2024
aviatesk pushed a commit to vtjnash/Cthulhu.jl that referenced this pull request Mar 20, 2024
aviatesk pushed a commit to vtjnash/JuliaInterpreter.jl that referenced this pull request Mar 21, 2024
aviatesk pushed a commit to vtjnash/JuliaInterpreter.jl that referenced this pull request Mar 21, 2024
aviatesk pushed a commit to vtjnash/JuliaInterpreter.jl that referenced this pull request Mar 21, 2024
aviatesk pushed a commit to vtjnash/Cthulhu.jl that referenced this pull request Mar 23, 2024
aviatesk pushed a commit to vtjnash/JuliaInterpreter.jl that referenced this pull request Mar 25, 2024
aviatesk pushed a commit to vtjnash/JuliaInterpreter.jl that referenced this pull request Mar 25, 2024
aviatesk pushed a commit to vtjnash/JuliaInterpreter.jl that referenced this pull request Mar 25, 2024
Keno added a commit that referenced this pull request Mar 26, 2024
This was origiginally supposed to be an issue, but I just started
writing out the whole code in the issue text to explain what I want
all the behavior to be, so instead, here's the actual implementation
of it, with the motativation in the commit message, and the details
of the actual behavior in the code change ;)

Sometimes packages rely on Julia internals. This is in general discouraged, but of course for some packages, there isn't really any other option. If your packages needs to hook the julia internals in a deep way or is specifically about introspecting the way that julia itself works, then some amount of reliance on internals is inevitable. In general, we're happy to let people touch the internals, as long as they (and their users) are aware that things will break and it's on them to fix things.

That said, I think we've been a little bit too *caveat emptor* on this entire business. There's a number of really key packages that rely on internals (I'm thinking in particular of Revise, Cthulhu and its dependency stacks) that if they're broken, it's really hard to even develop julia itself. In particular, these packages have been broken on Julia master for a more than a week now (following #52415) and there has been much frustration.

I think one of the biggest issues is that we're generally relying on `VERSION` checks for these kinds of things. This isn't really a problem when updating a package between released major versions, but for closely coupled packages like the above you run into two problems:

1. Since the VERSION number of a package is not known ahead of time, some breaking changes cannot be made atomically, i.e. we need to merge the base PR (which bumps people's nightly) in order to get the version number, which we then need to plug into the various PRs in all the various packages. If something goes wrong in this process (as it did this time), there can be extended breakage.

2. The VERSION based comparison can be wrong if you're on an older PR (that's a head of the base branch, but with different commits). As a result, when working on base PRs, you not infrequently run into a situation, where you get a VERSION false-positive from updating a package, introducing errors you didn't see before. This one isn't usually that terrible, because a rebase will fix it (and you'll probably need to rebase anyway), but it can be very confusing to get new and unexpected errors from random packages.

I would propose that going forward, we strongly discourage closely coupled packages from using `VERSION` comparisons and intead:

1. Where possible, probe for the feature or method signature that they're actually looking for, if it's something small (e.g. a rename of base type).
2. That we as julia base establish a mechanism for probing whether your current pre-release julia has a certain change. My sketch proposal is in this PR.
aviatesk pushed a commit to vtjnash/JuliaInterpreter.jl that referenced this pull request Mar 26, 2024
aviatesk pushed a commit to vtjnash/Cthulhu.jl that referenced this pull request Mar 26, 2024
aviatesk added a commit that referenced this pull request Apr 1, 2024
Before #52415 we could do multiple rounds of inlining on
IR that had already been inlined, but this was no longer possible. By
removing the assertion that was introduced in #52415,
this commit makes it possible to do multi-inlining once more. Note that
to fully solve this, though, we need to enhance `ir_inline_linetable!`
so it can add new linetables to an inner linetable that's already been
inlined. This commit notes that enhancement as something we need to do
later, leaving it with a TODO comment.
aviatesk added a commit that referenced this pull request Apr 1, 2024
Before #52415 we could do multiple rounds of inlining on
IR that had already been inlined, but this was no longer possible. By
removing the assertion that was introduced in #52415,
this commit makes it possible to do multi-inlining once more. Note that
to fully solve this, though, we need to enhance `ir_inline_linetable!`
so it can add new linetables to an inner linetable that's already been
inlined. This commit notes that enhancement as something we need to do
later, leaving it with a TODO comment.
aviatesk added a commit that referenced this pull request Apr 1, 2024
Before #52415 we could do multiple rounds of inlining on
IR that had already been inlined, but this was no longer possible. By
removing the assertion that was introduced in #52415,
this commit makes it possible to do multi-inlining once more. Note that
to fully solve this, though, we need to enhance `ir_inline_linetable!`
so it can add new linetables to an inner linetable that's already been
inlined. This commit notes that enhancement as something we need to do
later, leaving it with a TODO comment.
Keno pushed a commit that referenced this pull request Apr 1, 2024
Before #52415 we could do multiple rounds of inlining on
IR that had already been inlined, but this was no longer possible. By
removing the assertion that was introduced in #52415,
this commit makes it possible to do multi-inlining once more. Note that
to fully solve this, though, we need to enhance `ir_inline_linetable!`
so it can add new linetables to an inner linetable that's already been
inlined. This commit notes that enhancement as something we need to do
later, leaving it with a TODO comment.
aviatesk added a commit that referenced this pull request Apr 2, 2024
…#53781)

- fix up added documents: eb05b4f
- ~~set up a specific type to capture the 3-set data of `codelocs`:
6afde4b~~ (moved to a separate PR)
Pangoraw added a commit to FluxML/IRTools.jl that referenced this pull request Apr 3, 2024
`Core.Compiler.LineInfoNode` was removed in JuliaLang/julia#52415
but kept in `Core` for serializer compatibility reasons (see
https://github.com/JuliaLang/julia/blob/e07c0f1ddbfc89ad1ac4dda7246d8ed5d0d57c19/base/boot.jl#L491).
Pangoraw added a commit to FluxML/IRTools.jl that referenced this pull request Apr 6, 2024
`Core.Compiler.LineInfoNode` was removed in JuliaLang/julia#52415
but kept in `Core` for serializer compatibility reasons (see
https://github.com/JuliaLang/julia/blob/e07c0f1ddbfc89ad1ac4dda7246d8ed5d0d57c19/base/boot.jl#L491).

Linetable representation in Julia IR has been changed to be more compact.
This commit updates IRTools to decode and generate valid debug information according to these changes.

Adaptation for IRTools is relatively straightforward in the sense that
IRTools operates on untyped IR which does not contains inline
statements.

However, IRTools contains an inlining pass but it did not account for
linetable information previously. In this change, I have made the choice
to use the same line for all inlined statements. This can be further
improved in the future.
Pangoraw added a commit to FluxML/IRTools.jl that referenced this pull request Apr 10, 2024
`Core.Compiler.LineInfoNode` was removed in JuliaLang/julia#52415
but kept in `Core` for serializer compatibility reasons (see
https://github.com/JuliaLang/julia/blob/e07c0f1ddbfc89ad1ac4dda7246d8ed5d0d57c19/base/boot.jl#L491).

Linetable representation in Julia IR has been changed to be more compact.
This commit updates IRTools to decode and generate valid debug information according to these changes.

Adaptation for IRTools is relatively straightforward in the sense that
IRTools operates on untyped IR which does not contains inline
statements.

However, IRTools contains an inlining pass but it did not account for
linetable information previously. In this change, I have made the choice
to use the same line for all inlined statements. This can be further
improved in the future.
Keno added a commit that referenced this pull request May 4, 2024
This was origiginally supposed to be an issue, but I just started
writing out the whole code in the issue text to explain what I want
all the behavior to be, so instead, here's the actual implementation
of it, with the motativation in the commit message, and the details
of the actual behavior in the code change ;)

Sometimes packages rely on Julia internals. This is in general discouraged, but of course for some packages, there isn't really any other option. If your packages needs to hook the julia internals in a deep way or is specifically about introspecting the way that julia itself works, then some amount of reliance on internals is inevitable. In general, we're happy to let people touch the internals, as long as they (and their users) are aware that things will break and it's on them to fix things.

That said, I think we've been a little bit too *caveat emptor* on this entire business. There's a number of really key packages that rely on internals (I'm thinking in particular of Revise, Cthulhu and its dependency stacks) that if they're broken, it's really hard to even develop julia itself. In particular, these packages have been broken on Julia master for a more than a week now (following #52415) and there has been much frustration.

I think one of the biggest issues is that we're generally relying on `VERSION` checks for these kinds of things. This isn't really a problem when updating a package between released major versions, but for closely coupled packages like the above you run into two problems:

1. Since the VERSION number of a package is not known ahead of time, some breaking changes cannot be made atomically, i.e. we need to merge the base PR (which bumps people's nightly) in order to get the version number, which we then need to plug into the various PRs in all the various packages. If something goes wrong in this process (as it did this time), there can be extended breakage.

2. The VERSION based comparison can be wrong if you're on an older PR (that's a head of the base branch, but with different commits). As a result, when working on base PRs, you not infrequently run into a situation, where you get a VERSION false-positive from updating a package, introducing errors you didn't see before. This one isn't usually that terrible, because a rebase will fix it (and you'll probably need to rebase anyway), but it can be very confusing to get new and unexpected errors from random packages.

I would propose that going forward, we strongly discourage closely coupled packages from using `VERSION` comparisons and intead:

1. Where possible, probe for the feature or method signature that they're actually looking for, if it's something small (e.g. a rename of base type).
2. That we as julia base establish a mechanism for probing whether your current pre-release julia has a certain change. My sketch proposal is in this PR.
Keno added a commit that referenced this pull request May 5, 2024
This was origiginally supposed to be an issue, but I just started
writing out the whole code in the issue text to explain what I want all
the behavior to be, so instead, here's the actual implementation of it,
with the motativation in the commit message, and the details of the
actual behavior in the code change ;)

Sometimes packages rely on Julia internals. This is in general
discouraged, but of course for some packages, there isn't really any
other option. If your packages needs to hook the julia internals in a
deep way or is specifically about introspecting the way that julia
itself works, then some amount of reliance on internals is inevitable.
In general, we're happy to let people touch the internals, as long as
they (and their users) are aware that things will break and it's on them
to fix things.

That said, I think we've been a little bit too *caveat emptor* on this
entire business. There's a number of really key packages that rely on
internals (I'm thinking in particular of Revise, Cthulhu and its
dependency stacks) that if they're broken, it's really hard to even
develop julia itself. In particular, these packages have been broken on
Julia master for a more than a week now (following #52415) and there has
been much frustration.

I think one of the biggest issues is that we're generally relying on
`VERSION` checks for these kinds of things. This isn't really a problem
when updating a package between released major versions, but for closely
coupled packages like the above you run into two problems:

1. Since the VERSION number of a package is not known ahead of time,
some breaking changes cannot be made atomically, i.e. we need to merge
the base PR (which bumps people's nightly) in order to get the version
number, which we then need to plug into the various PRs in all the
various packages. If something goes wrong in this process (as it did
this time), there can be extended breakage.

2. The VERSION based comparison can be wrong if you're on an older PR
(that's a head of the base branch, but with different commits). As a
result, when working on base PRs, you not infrequently run into a
situation, where you get a VERSION false-positive from updating a
package, introducing errors you didn't see before. This one isn't
usually that terrible, because a rebase will fix it (and you'll probably
need to rebase anyway), but it can be very confusing to get new and
unexpected errors from random packages.

I would propose that going forward, we strongly discourage closely
coupled packages from using `VERSION` comparisons and intead:

1. Where possible, probe for the feature or method signature that
they're actually looking for, if it's something small (e.g. a rename of
base type).
2. That we as julia base establish a mechanism for probing whether your
current pre-release julia has a certain change. My sketch proposal is in
this PR.
lazarusA pushed a commit to lazarusA/julia that referenced this pull request Jul 12, 2024
…Lang#53849)

This was origiginally supposed to be an issue, but I just started
writing out the whole code in the issue text to explain what I want all
the behavior to be, so instead, here's the actual implementation of it,
with the motativation in the commit message, and the details of the
actual behavior in the code change ;)

Sometimes packages rely on Julia internals. This is in general
discouraged, but of course for some packages, there isn't really any
other option. If your packages needs to hook the julia internals in a
deep way or is specifically about introspecting the way that julia
itself works, then some amount of reliance on internals is inevitable.
In general, we're happy to let people touch the internals, as long as
they (and their users) are aware that things will break and it's on them
to fix things.

That said, I think we've been a little bit too *caveat emptor* on this
entire business. There's a number of really key packages that rely on
internals (I'm thinking in particular of Revise, Cthulhu and its
dependency stacks) that if they're broken, it's really hard to even
develop julia itself. In particular, these packages have been broken on
Julia master for a more than a week now (following JuliaLang#52415) and there has
been much frustration.

I think one of the biggest issues is that we're generally relying on
`VERSION` checks for these kinds of things. This isn't really a problem
when updating a package between released major versions, but for closely
coupled packages like the above you run into two problems:

1. Since the VERSION number of a package is not known ahead of time,
some breaking changes cannot be made atomically, i.e. we need to merge
the base PR (which bumps people's nightly) in order to get the version
number, which we then need to plug into the various PRs in all the
various packages. If something goes wrong in this process (as it did
this time), there can be extended breakage.

2. The VERSION based comparison can be wrong if you're on an older PR
(that's a head of the base branch, but with different commits). As a
result, when working on base PRs, you not infrequently run into a
situation, where you get a VERSION false-positive from updating a
package, introducing errors you didn't see before. This one isn't
usually that terrible, because a rebase will fix it (and you'll probably
need to rebase anyway), but it can be very confusing to get new and
unexpected errors from random packages.

I would propose that going forward, we strongly discourage closely
coupled packages from using `VERSION` comparisons and intead:

1. Where possible, probe for the feature or method signature that
they're actually looking for, if it's something small (e.g. a rename of
base type).
2. That we as julia base establish a mechanism for probing whether your
current pre-release julia has a certain change. My sketch proposal is in
this PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants