Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Precompile statement emitter sometimes emits invalid statements #28808

Open
KristofferC opened this issue Aug 21, 2018 · 29 comments
Open

Precompile statement emitter sometimes emits invalid statements #28808

KristofferC opened this issue Aug 21, 2018 · 29 comments
Labels
bug Indicates an unexpected problem or unintended behavior compiler:precompilation Precompilation of modules

Comments

@KristofferC
Copy link
Member

KristofferC commented Aug 21, 2018

This is OK:

julia> typeof(dateformat"YYYY-mm-dd\THH:MM:SS")
DateFormat{Symbol("YYYY-mm-dd\\THH:MM:SS"),

But running with trace-compile=stderr we can see the following precompile statement being generated

julia> Dates.format(now(), dateformat"YYYY-mm-dd\THH:MM:SS")
precompile(Tuple{typeof(Dates.format), Dates.DateTime, Dates.DateFormat{Symbol("YYYY-mm-dd\THH:MM:SS") ...

which contains an unescaped \ in the DateFormat type. Encountering such a type in precompilation gathering means it will fail.

@tehrengruber
Copy link

tehrengruber commented Jul 14, 2019

As already mentioned in #32574 I also stumbled upon this recently. As of now I am aware of three cases in which the output is not correct.

Case 1

Types with symbols as parameters which need to be escaped, e.g. contain a backslash (this is the case you mention)
Example:

foo(::T) where T = 1
struct A{B} end
foo(A{Symbol("a\\b")})

Output:

precompile(Tuple{typeof(Main.foo), Type{Main.A{Symbol("a\b")}}})

I have already fixed this case in tehrengruber@0fed122 (will open a pull request at some point)

Case 2

Types with symbols as parameters generated via gensym
Example:

macro add_parametric_type(expr)
  push!(expr.args[2].args, gensym())
  esc(expr)
end

@add_parametric_type struct A{} end

foo(::T) where T = 1

foo(A)

Output:

precompile(Tuple{typeof(Main.foo), Type{Main.A{363} where 363}})

Case 3

Types with a Char as a parameter, e.g. Char(0x56)
Example:

foo(::T) where T = 1
struct A{B} end
foo(A{Char(0x56)})

Output:

precompile(Tuple{typeof(Main.foo), Type{Main.A{Char(0x56000000)}}})

The origin of the problem is in jl_static_show_x_ here

for(int i = nb - 1; i >= 0; --i)
The bitstype Char (4 bytes) is traversed bytewise in reverse order, hence resulting in 0x56000000 instead of 0x00000056. I have no clue whats the reason for this ordering. @JeffBezanson you wrote this piece of code, any reason for the ordering that I'm overlooking here?

Edit

One thought further I recognized that the reason for the reverse byte ordering is endianess.

tehrengruber pushed a commit to tehrengruber/julia that referenced this issue Jul 17, 2019
tehrengruber pushed a commit to tehrengruber/julia that referenced this issue Jul 17, 2019
tehrengruber pushed a commit to tehrengruber/julia that referenced this issue Jul 17, 2019
tehrengruber pushed a commit to tehrengruber/julia that referenced this issue Jul 17, 2019
c42f added a commit that referenced this issue Jul 29, 2019
For implementing special syntax it's useful for typed comprehension to
lower to `collect(T, gen)`. This changes typed comprehensions to use the
same lowering pattern as normal comprehensions.

Needed to tweak the precompile workaround from #28808

Co-authored-by: Andy Ferris <ferris.andy@gmail.com>
KristofferC added a commit that referenced this issue Aug 21, 2019
- move the place where --trace-compile outputs precompile statement to a location that catches more cases
- tweak the REPL code to be more amenable to precompilation in light of
- instead of trying to encode all the rules where the precompile emitter
  fails (#28808) just try to precompile and do nothing if it fails.
KristofferC added a commit that referenced this issue Aug 26, 2019
- move the place where --trace-compile outputs precompile statement to a location that catches more cases
- tweak the REPL code to be more amenable to precompilation in light of
- instead of trying to encode all the rules where the precompile emitter
  fails (#28808) just try to precompile and do nothing if it fails.
KristofferC added a commit that referenced this issue Aug 26, 2019
- move the place where --trace-compile outputs precompile statement to a location that catches more cases
- tweak the REPL code to be more amenable to precompilation in light of
- instead of trying to encode all the rules where the precompile emitter
  fails (#28808) just try to precompile and do nothing if it fails.

(cherry picked from commit c0478d8)
@pfitzseb
Copy link
Member

We also sometimes generate code with invalid _s in it (in type parameters, IIRC).
See here for PackageCompilers workaround.

KristofferC added a commit that referenced this issue Aug 26, 2019
- move the place where --trace-compile outputs precompile statement to a location that catches more cases
- tweak the REPL code to be more amenable to precompilation in light of
- instead of trying to encode all the rules where the precompile emitter
  fails (#28808) just try to precompile and do nothing if it fails.

(cherry picked from commit c0478d8)
@ghost
Copy link

ghost commented Oct 15, 2020

Regarding Case 2, copying over this context from the original issue (#32574):

I would be willing to fix this myself, but I'm not sure how to handle this correctly. The generated symbol can not be used directly since the resulting code would be invalid just as well and replacing it by a different "regular" symbol, e.g. A{##363} where ##363 by A{B} where B, is problematic since B might collide with other parameters in general.

This is now addressable thanks to the cool var"" syntax that @c42f added in #32408.

So i think we just need to find whatever code is dumping the precompile statement (is it jl_static_show_x_?) and have it use the var"" syntax here instead! :)

This would be a very similar exercise to that taken in #34888.

@NHDaly
Copy link
Member

NHDaly commented Oct 15, 2020

I'm working on addressing Case 2, here: #38049

It's fixed on that branch, and now i'm just following up with a couple other small improvements to static_show() that I noticed along the way.

@NHDaly
Copy link
Member

NHDaly commented Oct 19, 2020

Via the logging added to PackageCompiler in JuliaLang/PackageCompiler.jl#457, I've identified two more classes of failures. One of them might just need to be fixed in PackageCompiler, but I think the other one needs to be fixed in julia:

Case 4 - Incorrectly emitting precompile() with abstract types

UPDATE: Opened issue for this: #38149

For some reason, out of the 15627 statements emitted in our last run, we noticed 21 statements that returned false when running them in the output-o process. From what I can tell, these all appear to be cases where for whatever reason Julia emitted the precompile statement with abstract types, even though the function is not marked @nospecialize. Sometimes they're ::Any, and sometimes they're other abstract types or type unions. This is not a valid precompile statement, since you cannot produce machine code for a function specialized to abstract types (at least as far as I understand).

Here are some examples we see:

┌ Debug: Precompilation failed: precompile(Tuple{typeof(Base.Broadcast.broadcasted), Function, Function, Array{MathOptInterface.ScalarAffineTerm{Float64}, 1}})
└ @ Main.anonymous none:33
┌ Debug: Precompilation failed: precompile(Tuple{typeof(Base.Docs.docstr), Any, Any})
└ @ Main.anonymous none:33
┌ Debug: Precompilation failed: precompile(Tuple{typeof(Base.deepcopy_internal), Any, Base.IdDict{Any, Any}})
└ @ Main.anonymous none:33
┌ Debug: Precompilation failed: precompile(Tuple{typeof(DelveSDK._convert_input_value_type), Type{T} where T})
└ @ Main.anonymous none:33
┌ Debug: Precompilation failed: precompile(Tuple{typeof(DelveSDK._convert_input_value_type), Type{T} where T})
└ @ Main.anonymous none:33
┌ Debug: Precompilation failed: precompile(Tuple{typeof(Blobs.self_size), Type{Blobs.Blob{T}}} where T)
└ @ Main.anonymous none:33

And indeed, the above statements don't precompile, for example:

julia> precompile(Tuple{typeof(Base.Docs.docstr), Any, Any})
false

I'm not sure yet why julia emits statements that will fail to precompile, but this is another case that we should try to fix.

Case 5 - UndefVarError caused by unknown Module names.

UPDATE: Opened PR for this: JuliaLang/PackageCompiler.jl#463

I think this one is probably just an issue in PackageCompiler where we're not re-loading all of the modules from the user's program correctly?

In our latest run producing 15627 statements, we had 292 UndefVarErrors that all appear to be related to module names not being available. Here are some examples:

┌ Debug: Error executing precompile(Tuple{Type{Base.Pair{A, B} where B where A}, Symbol, ColorTypes.RGBA{Float64}}):LoadError("string", 1, UndefVarError(:ColorTypes))
└ @ Main.anonymous none:38
┌ Debug: Error executing precompile(Tuple{Type{Base.Pair{A, B} where B where A}, Symbol, Plots.Plot{Plots.GRBackend}}):LoadError("string", 1, UndefVarError(:Plots))
└ @ Main.anonymous none:38
┌ Debug: Error executing precompile(Tuple{getfield(DelveBenchmarkRunner.ML, Symbol("#57#60"))}):LoadError("string", 1, UndefVarError(:DelveBenchmarkRunner))
└ @ Main.anonymous none:38
┌ Debug: Error executing precompile(Tuple{Type{Measures.Length{:pct, Float64}}, Float64}):LoadError("string", 1, UndefVarError(:Measures))
└ @ Main.anonymous none:38

I'm not sure why these ones failed when most of the statements include a module like this, and succeed. So there's something weird going on there. But probably this is something internal to PackageCompiler?

@timholy
Copy link
Member

timholy commented Oct 20, 2020

For your case 4, these are likely due to partial inference (e.g., from being called by a @nspecialize method, specialization heuristics, or outright inference failure) or module-level @nospecialize. For example, Tuple{typeof(Base.Docs.docstr), Any, Any} comes from this @nospecialize which applies to the entire Docs module.

But I do think Julia should not complain about precompiling that signature.

@NHDaly
Copy link
Member

NHDaly commented Oct 20, 2020

Oh interesting, thanks!

But I do think Julia should not complain about precompiling that signature.

Yeah, agreed! this seems like it might be a bug. I've checked, and these statements are emitted again when re-running my snoop file with the resulting sysimg, so I do not believe they are being statically compiled.

Thanks for the context, @timholy

@NHDaly
Copy link
Member

NHDaly commented Oct 23, 2020

Yeah, indeed I think Case 4 is a bug in julia. I've opened an issue for it here:
#38149

@NHDaly
Copy link
Member

NHDaly commented Oct 23, 2020

(okay and indeed Case 5 was related to PackageCompiler; i've opened a PR here: JuliaLang/PackageCompiler.jl#463)

@NHDaly
Copy link
Member

NHDaly commented Oct 23, 2020

Alright, and finally Class 6 -- this is the last of the errors we're currently seeing in our usage, I think, although it's also the biggest:

Class 6: precompile succeeds, but somehow isn't serialized during static compilation.

This last class of issue is for statements that appear to precompile() successfully, yet doing so during sysimg construction does not get cached as you would expect. Here is an example:

precompile(Tuple{typeof(Base.push!), Array{Function, 1}, Function})

This function shows up in the --trace-compile output both during snoop compiling and again when using the newly created sysimg:

julia> using PackageCompiler

julia> mktemp() do f, io
           write(io, "precompile(Tuple{typeof(Base.push!), Array{Function, 1}, Function})")
           close(io)
           PackageCompiler.create_sysimage(sysimage_path="./precompiled", precompile_statements_file=f)
       end
[ Info: PackageCompiler: creating system image object file, this might take a while...
julia>
$ julia --trace-compile=stderr -J./precompiled -e 'precompile(Tuple{typeof(Base.push!), Array{Function, 1}, Function})'
precompile(Tuple{typeof(Base.MainInclude.include), String})
precompile(Tuple{typeof(Base.push!), Array{Function, 1}, Function})

Compare that with another function that does cache its precompiled result correctly:

precompile(Tuple{typeof(Base.:(&)), Int64, Int64})

as seen here:

julia> mktemp() do f, io
           write(io, """
               precompile(Tuple{typeof(Base.:(&)), Int64, Int64})                   # works as expected
               precompile(Tuple{typeof(Base.push!), Array{Function, 1}, Function})  # doesn't cache
               """)
           close(io)
           PackageCompiler.create_sysimage(sysimage_path="./precompiled", precompile_statements_file=f)
       end
[ Info: PackageCompiler: creating system image object file, this might take a while...

julia>
julia -J./precompiled --trace-compile=stderr -e '
               precompile(Tuple{typeof(Base.:(&)), Int64, Int64})                   # works as expected
               precompile(Tuple{typeof(Base.push!), Array{Function, 1}, Function})  # doesnt cache
'
precompile(Tuple{typeof(Base.MainInclude.include), String})
precompile(Tuple{typeof(Base.push!), Array{Function, 1}, Function})

Out of the 15627 precompile statements emitted, we had ** 2035 cases** of this! In total, including all of the above cases, we have 2629 failures/errors, and in practice we find that only about 50-60% of the compilation time is recovered by using a statically compiled binary. I am assuming that this last class of failures is the bulk of the problem, but I don't yet know why.

@KristofferC
Copy link
Member Author

KristofferC commented Oct 23, 2020

Could Case 6 be from invalidations or type piracy?

Have you tried the "double sysimage" trick (JuliaLang/PackageCompiler.jl#429)? Basically, you compile a sysimage, and then using that sysimage as the base, you compile the sysimage again. At that point, all invalidations should have been resolved.

@NHDaly
Copy link
Member

NHDaly commented Oct 23, 2020

Oh, wow! Thanks for the link; I hadn't seen that. That's very fascinating.

Could it explain a very simple case like the one I show above, where we're literally just precompiling the one single function?

Is the idea that i should then run PackageCompiler starting from the sysimg and compile a new sysimg?

Keno added a commit that referenced this issue Feb 9, 2021
Our printing for precompile statements is not 100% reliable (#28808)
and can fail. I introduced the extra `parse` call in the Varargs
change, but because of #28808, it needs to go inside the try/catch.
Keno added a commit that referenced this issue Feb 10, 2021
Our printing for precompile statements is not 100% reliable (#28808)
and can fail. I introduced the extra `parse` call in the Varargs
change, but because of #28808, it needs to go inside the try/catch.
@mlubin
Copy link
Member

mlubin commented Apr 1, 2021

This is also an issue for printf statements. As @KristofferC helped me debug on slack:

julia> precompile(Tuple{typeof(Printf.format), Base.TTY, Printf.Format{Base.CodeUnits{UInt8, String}, Tuple{Printf.Spec{Base.Val{Char(0x67000000)}}, Printf.Spec{Base.Val{Char(0x67000000)}}, Printf.Spec{Base.Val{Char(0x67000000)}}, Printf.Spec{Base.Val{Char(0x67000000)}}}}, Float64, Float64, Vararg{Float64, N} where N})
ERROR: Base.CodePointError{UInt32}(0x67000000)
Stacktrace:
 [1] code_point_err(u::UInt32)
   @ Base ./char.jl:86
 [2] Char(u::UInt32)
   @ Base ./char.jl:160
 [3] top-level scope
   @ REPL[3]:1

@KristofferC
Copy link
Member Author

I think that is Case 3 in #28808 (comment).

ElOceanografo pushed a commit to ElOceanografo/julia that referenced this issue May 4, 2021
Our printing for precompile statements is not 100% reliable (JuliaLang#28808)
and can fail. I introduced the extra `parse` call in the Varargs
change, but because of JuliaLang#28808, it needs to go inside the try/catch.
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this issue May 9, 2021
Our printing for precompile statements is not 100% reliable (JuliaLang#28808)
and can fail. I introduced the extra `parse` call in the Varargs
change, but because of JuliaLang#28808, it needs to go inside the try/catch.
@timholy timholy added the compiler:precompilation Precompilation of modules label Aug 22, 2021
@NHDaly NHDaly assigned NHDaly and unassigned NHDaly Dec 20, 2021
@IanButterworth
Copy link
Member

I think this is happening on master currently?

Generating REPL precompile statements... 36/36
┌ Warning: Failed to precompile expression
│   form = Tuple{typeof(Base.trypoptask), Base.InvasiveLinkedListSynchronized{Task}})
│   exception = Base.Meta.ParseError("invalid character \"\" near column 1")
└ @ nothing nothing:0
┌ Warning: Failed to precompile expression
│   form = precompile(Tuple{typeof(Base.print), Base.GenericIOBuffer{Array{UInt8, 1}}, UInt32})
│   exception = Base.Meta.ParseError("invalid character \"\" near column 1")
└ @ nothing nothing:0
┌ Warning: Failed to precompile expression
│   form = precompile(Tuple{typeof(Base.task_done_hook), Task})
│   exception = Base.Meta.ParseError("invalid character \"\" near column 1")
└ @ nothing nothing:0
┌ Warning: Failed to precompile expression
│   form = precompile(Tuple{typeof(Base.trypoptask), Base.InvasiveLinkedListSynchronized{Task}})
│   exception = Base.Meta.ParseError("invalid character \"\" near column 1")
└ @ nothing nothing:0
┌ Warning: Failed to precompile expression
│   form =  Base.Invastypeof(Base.trypoptask), Base.InvasiveLinkedListSynchronized{Task}})
│   exception = Base.Meta.ParseError("extra token \"}\" after end of expression")
└ @ nothing nothing:0
Executing precompile statements... 1299/133

@vtjnash
Copy link
Member

vtjnash commented Jan 7, 2022

That looks like something is concurrently corrupting your tmpfiles while executing them

@IanButterworth
Copy link
Member

That happening on both my ubuntu and macos setup. Two separate machines. Is it not happening for other people?

@IanButterworth
Copy link
Member

Is it not happening for other people?

@ararslan reported on slack to also be seeing it on MacOS and Ubuntu. Different failures on each

@DilumAluthge
Copy link
Member

DilumAluthge commented Jan 16, 2022

Did it start recently? If so, a bisect could be useful.

@Moelf
Copy link
Contributor

Moelf commented Feb 3, 2022

Generating REPL precompile statements... 36/36
┌ Warning: Failed to precompile expression
│   form = precompile(Tuple{typeof(Base.print), Base.GenericIOBuffer{Array{UInt8, 1}}, String})
│   exception = Base.Meta.ParseError("invalid character \"\" near column 1")
└ @ nothing nothing:0
┌ Warning: Failed to precompile expression
│   form = .trypoptask), Base.InvasiveLinkedListSynchronized{Task}})
│   exception = Base.Meta.ParseError("invalid character \"\" near column 1")
└ @ nothing nothing:0

on Arch, with a80afe5

@IanButterworth
Copy link
Member

IanButterworth commented Feb 12, 2022

A bisect indicated the latest issue started at 3f6abcf which was part of #43415
cc. @JeffBezanson

edit: My impression is that commit just uncovered the underlying bug, it didn't introduce whatever's messing up the temp file

@DilumAluthge DilumAluthge added this to the 1.8 milestone Feb 12, 2022
@IanButterworth
Copy link
Member

A few things appear wrong here with the error message too.

exception = Base.Meta.ParseError("invalid character \"\" near column 1")

It seems the " escaping isn't working and it appears that it's trying to show an invisible char, but not hitting the invisible char error branch here that might give more helpful info

julia/src/julia-parser.scm

Lines 596 to 597 in 0a91e71

(error (string "invisible character \\u" (number->string (fixnum c) 16) " near column " (+ 1 cn)))
(error (string "invalid character \"" c "\" near column " (+ 1 cn))))))))))

@IanButterworth
Copy link
Member

IanButterworth commented Feb 20, 2022

The latest issue here is fixed by making the process that emits precompile statements before the sysimage generation single-threaded #44281

@Moelf
Copy link
Contributor

Moelf commented Feb 20, 2022

isn't sysimage generation always single-threaded which is why it's so slow? or am I misremembering.

@IanButterworth
Copy link
Member

True. Just clarified my comment

@KristofferC KristofferC removed this from the 1.8 milestone Feb 22, 2022
@senhalil
Copy link

Not sure if what I am observing is exactly due to this issue but I am having a similar behavior to what NHDaly describes here.

Basically I create a system image with a precompile_statements_file that includes the following statements but running with --trace-compile gives back the same statements:

precompile(Tuple{typeof(ProtoBuf.defaultval), Type})
precompile(Tuple{typeof(ProtoBuf.read_field), Base.IOStream, proto.v1beta1.RunRequest, ProtoBuf.ProtoMetaAttribs{Any}, UInt64, Type})
precompile(Tuple{typeof(ProtoBuf.read_field), Base.GenericIOBuffer{Array{UInt8, 1}}, proto.v1beta1.RunMetadata, ProtoBuf.ProtoMetaAttribs{Any}, UInt64, Type})
precompile(Tuple{typeof(ProtoBuf.read_field), Base.GenericIOBuffer{Array{UInt8, 1}}, proto.v1beta1.RunSpecification, ProtoBuf.ProtoMetaAttribs{Any}, UInt64, Type})

Tried with the double system image trick with no avail. It happens both on 1.8.5 and 1.9.0-rc1.

@KristofferC is there a way to check if this is due to invalidations or type piracy as you mentioned here

@NHDaly did you managed to find a solution to this issue?

@KristofferC
Copy link
Member Author

Doesn't look like those statements have invalid syntax so seems like a different issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior compiler:precompilation Precompilation of modules
Projects
None yet
Development

No branches or pull requests