Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PkgEval detected issues or regressions? #39641

Closed
6 of 7 tasks
vtjnash opened this issue Feb 12, 2021 · 7 comments
Closed
6 of 7 tasks

PkgEval detected issues or regressions? #39641

vtjnash opened this issue Feb 12, 2021 · 7 comments
Labels
regression Regression in behavior compared to a previous version

Comments

@vtjnash
Copy link
Member

vtjnash commented Feb 12, 2021

Looking through recent PkgEval run for segfaults, these seem worth investigating:

signal (4): Illegal instruction
in expression starting at /home/pkgeval/.julia/packages/RBNF/mMbsg/test/qasm.jl:105
Generator at ./generator.jl:32 [inlined]
Generator at ./generator.jl:32 [inlined]
pprint_for_seq at /home/pkgeval/.julia/packages/PrettyPrint/z2Fty/src/PrettyPrint.jl:60

Fails due to:

julia> Vector{Union{Nothing, T} where T} == Vector{Any}
true
julia> Vector{Union{Nothing, T} where T} === Vector{Any}
false

Which is #35130 (comment)


signal (11): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/ChainRules/vQPoD/test/rulesets/LinearAlgebra/norm.jl:1
gc_try_setmark at /buildworker/worker/package_linux64/build/src/gc.c:1704 [inlined]
gc_mark_loop at /buildworker/worker/package_linux64/build/src/gc.c:2296
_jl_gc_collect at /buildworker/worker/package_linux64/build/src/gc.c:3034
jl_gc_collect at /buildworker/worker/package_linux64/build/src/gc.c:3241
maybe_collect at /buildworker/worker/package_linux64/build/src/gc.c:880 [inlined]
jl_gc_pool_alloc at /buildworker/worker/package_linux64/build/src/gc.c:1204
jl_gc_alloc_ at /buildworker/worker/package_linux64/build/src/julia_internal.h:285 [inlined]
jl_gc_alloc at /buildworker/worker/package_linux64/build/src/gc.c:3283
_new_array_ at /buildworker/worker/package_linux64/build/src/array.c:122 [inlined]
_new_array at /buildworker/worker/package_linux64/build/src/array.c:188 [inlined]
jl_alloc_array_1d at /buildworker/worker/package_linux64/build/src/array.c:459
Array at ./boot.jl:451 [inlined]
Array at ./boot.jl:460 [inlined]
similar at ./abstractarray.jl:785 [inlined]
similar at ./abstractarray.jl:784 [inlined]
_array_for at ./array.jl:671 [inlined]
compute_live_ins at ./compiler/ssair/slot2ssa.jl:526
construct_ssa! at ./compiler/ssair/slot2ssa.jl:638
slot2reg at ./compiler/ssair/driver.jl:117 [inlined]
run_passes at ./compiler/ssair/driver.jl:124
...

This is similar to ProximalAlgorithms, though in this case, it is a hoisted FCA load of a getfield:

julia> using ChainRulesTestUtils, LinearAlgebra
julia> f = ChainRulesTestUtils.var"#fnew#23"{ChainRulesTestUtils.var"#31#32"{NamedTuple{(), Tuple{}}, typeof(LinearAlgebra.norm)}, 
       Tuple{LinearAlgebra.UpperTriangular{Base.Complex{Float64}, Array{Base.Complex{Float64}, 2}}, Float64}, Tuple{Bool, Bool}}
julia> fx = ccall(:jl_new_struct_uninit, Any, (Any,), f); Core.println(fx)
julia> code_llvm(fx, Tuple{LinearAlgebra.UpperTriangular{Base.Complex{Float64}, Array{Base.Complex{Float64}, 2}}, Vararg{Any}}, raw=true, dump_module=true, optimize=true|false)
  %.fca.0.load = load {}*, {}** %137, align 8                                                                                                                                                           
  %.fca.0.insert = insertvalue [1 x {}*] undef, {}* %.fca.0.load, 0                                                                                                                                     
  %138 = getelementptr inbounds [7 x {}*], [7 x {}*]* %gcframe1182, i64 0, i64 6                                                                                                                        
  store {}* %.fca.0.load, {}** %138, align 16                                                       

[ Info: acquire() failed, error=TypeError(:typeassert, "", NLopt.Callback_Data, Core.Compiler.UseRef(:(Base.arraysize(_4, 2)), 4)), retry(9)

signal (11): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/EfficientGlobalOptimization/NlQ6o/test/test_ego.jl:1
jl_object_id_ at /buildworker/worker/package_linux64/build/src/builtins.c:385
type_hash at /buildworker/worker/package_linux64/build/src/jltypes.c:1040
typekey_hash at /buildworker/worker/package_linux64/build/src/jltypes.c:1052 [inlined]
lookup_type at /buildworker/worker/package_linux64/build/src/jltypes.c:635
inst_datatype_inner at /buildworker/worker/package_linux64/build/src/jltypes.c:1190
jl_inst_arg_tuple_type at /buildworker/worker/package_linux64/build/src/jltypes.c:1425
arg_type_tuple at /buildworker/worker/package_linux64/build/src/gf.c:1844 [inlined]
jl_lookup_generic_ at /buildworker/worker/package_linux64/build/src/gf.c:2371 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2423
_show_default at ./show.jl:412
show_default at ./show.jl:395 [inlined]
show at ./show.jl:390 [inlined]
print at ./strings/io.jl:35
unknown function (ip: 0x7f5799ec8b45)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2245 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2427
print_to_string at ./strings/io.jl:135
string at ./strings/io.jl:174 [inlined]
macro expansion at ./logging.jl:340 [inlined]
#acquire#13 at /home/pkgeval/.julia/packages/EfficientGlobalOptimization/NlQ6o/src/EGO.jl:55
acquire##kw at /home/pkgeval/.julia/packages/EfficientGlobalOptimization/NlQ6o/src/EGO.jl:51
#acquire#13 at /home/pkgeval/.julia/packages/EfficientGlobalOptimization/NlQ6o/src/EGO.jl:56
acquire##kw at /home/pkgeval/.julia/packages/EfficientGlobalOptimization/NlQ6o/src/EGO.jl:51

This is specifying [compat] for an antiquated version of https://github.com/JuliaOpt/NLopt.jl which was broken.


signal (11): Segmentation fault
in expression starting at none:0
PyObject_Call at /home/pkgeval/.julia/conda/3/lib/libpython3.8.so.1.0 (unknown line)
macro expansion at /home/pkgeval/.julia/packages/PyCall/tqyST/src/exception.jl:95 [inlined]
#109 at /home/pkgeval/.julia/packages/PyCall/tqyST/src/pyfncall.jl:43 [inlined]
disable_sigint at ./c.jl:458 [inlined]
__pycall! at /home/pkgeval/.julia/packages/PyCall/tqyST/src/pyfncall.jl:42 [inlined]
BoundsError(a=Array{Core.Compiler.BasicBlock, (1,)}[Core.Compiler.BasicBlock(stmts=Core.Compiler.StmtRange(start=1, stop=2), preds=Array{Int64, (0,)}[], succs=Array{Int64, (0,)}[])], i=(2,))
jl_bounds_error_ints at /buildworker/worker/package_linux64/build/src/rtutils.c:186
getindex at ./array.jl:802 [inlined]
compute_basic_blocks at ./compiler/ssair/ir.jl:105
...
signal (11): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/OptionalArgChecks/V6lE7/test/runtests.jl:45
_ZN4llvm23removeUnreachableBlocksERNS_8FunctionEPNS_14DomTreeUpdaterEPNS_16MemorySSAUpdaterE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZL19simplifyFunctionCFGRN4llvm8FunctionERKNS_19TargetTransformInfoERKNS_18SimplifyCFGOptionsE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)

signal (11): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/ProximalAlgorithms/IxN0R/test/problems/test_lasso_small.jl:1
_ZN4llvm11PointerType3getEPNS_4TypeEj at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZN4llvm12InstCombiner18visitAddrSpaceCastERNS_17AddrSpaceCastInstE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZN4llvm12InstCombiner3runEv at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZL31combineInstructionsOverFunctionRN4llvm8FunctionERNS_19InstCombineWorklistEPNS_9AAResultsERNS_15AssumptionCacheERNS_17TargetLibraryInfoERNS_13DominatorTreeERNS_25OptimizationRemarkEmitterEPNS_18BlockFrequencyInfoEPNS_18ProfileSummaryInfoEjPNS_8LoopInfoE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZN4llvm24InstructionCombiningPass13runOnFunctionERNS_8FunctionE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)

This appears to be a bug in PhiNode handling in llvm-alloc-opt.cpp. In particular, we have this node:

│     %1076 = φ (#394 => %1054, #396 => %1060)::Union{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(+), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(*), Tuple{Base.RefValue{Nothing}, Vector{Flo
at64}}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(*), Tuple{Float64, Vector{Float64}}}}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(+), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.Default
ArrayStyle{1}, Nothing, typeof(*), Tuple{Float64, Vector{Float64}}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(*), Tuple{Float64, Vector{Float64}}}}}}                                                                                

which turns into:

   %1896 = bitcast {}* %value_phi70 to { [2 x {}*] }*

and

   %1909 = bitcast {}* %value_phi70 to { { double, {}* } }*

after (splitting on isa). This ends up confusing llvm-alloc-opt.cpp, which tries to fold the PhiNode and then marks that first slot as containing a ref. This violates LLVM's typed-memory assumption that all loads through a particular pointer must share the same type. This pass either needs to further split the users (into ref and value slots) or inhibit the optimization when the types collide like this.

To see this in usage, try:
code_llvm(ProximalAlgorithms.iterate, (ProximalAlgorithms.DRLS_iterable{Float64, Float64, Vector{Float64}, ProximalAlgorithms.ProximalOperators.LeastSquaresDirect{1, Float64, Float64, Matrix{Float64}, Vector{Float64}, Cholesky{Float64, Matrix{Float64}}}, NormL1{Float64}, ProximalAlgorithms.LBFGS{Float64, Float64, Int64, Vector{Float64}, 5}}, ProximalAlgorithms.DRLS_state{Float64, Vector{Float64}, ProximalAlgorithms.LBFGS{Float64, Float64, Int64, Vector{Float64}, 5}}))


  %.sub2957 = phi {} addrspace(10)** [ %.sub2952, %L112.thread ], [ %.sub29, %L112 ]
julia: /buildworker/worker/package_linux64/build/src/llvm-late-gc-lowering.cpp:1478: State LateLowerGCFrame::LocalScan(llvm::Function&): Assertion `false && "Expected alloca"' failed.

signal (6): Aborted
in expression starting at /home/pkgeval/.julia/packages/Matcha/Bx1mi/test/runtests.jl:70
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7f3604364728)
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
LocalScan at /buildworker/worker/package_linux64/build/src/llvm-late-gc-lowering.cpp:1478
runOnFunction at /buildworker/worker/package_linux64/build/src/llvm-late-gc-lowering.cpp:2661
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /opt/julia/bin/../lib/julia/libLLVM-11jl.so (unknown line)

LLVM is normalizing our return_roots alloca from T, n to [T x n], which involves adding a GEP 0,0 before the user. Somehow this GEP gets sunk into a loop, and ends up generating a bunch of phi nodes, which all are the same GEP into the same original alloca.
From: code_llvm(Matcha.inner_matchat, (Vector{Int64}, Int64, Tuple{Greed{typeof(Matcha.alwaysmatch), UnitRange{Int64}}, Greed{Int64, UnitRange{Int64}}}, Matcha.History{Vector{Int64}, SubArray, Int64}))

@vtjnash vtjnash added the regression Regression in behavior compared to a previous version label Feb 12, 2021
@simeonschaub
Copy link
Member

The OptionalArgChecks one can be safely ignored, since it manipulates IR in some really bad and wrong ways. I just couldn't be bother to fix it yet. (Is there a good way to exempt a package from PkgEval? Or perhaps I will just add a check in the test suite that checks the version and just quits if it is too new.)

@vtjnash
Copy link
Member Author

vtjnash commented Feb 12, 2021

Yes, the package should be upper bounded on version compatibility

@maleadt
Copy link
Member

maleadt commented Feb 13, 2021

Off topic, but related: Should PkgEval daily evaluations always compare against the latest release? That way new failures wouldn't disappear after a single run.

@DilumAluthge
Copy link
Member

Off topic, but related: Should PkgEval daily evaluations always compare against the latest release? That way new failures wouldn't disappear after a single run.

I think that would be useful.

@vtjnash
Copy link
Member Author

vtjnash commented Feb 15, 2021

Any package that fails on a release build should be removed from General for that release, so that ability to see that comparison should already be present. Perhaps we could even add an Unstable channel, where package versions automatically get moved if they fail PkgEval too much, or don't have sufficient test coverage?

@KristofferC
Copy link
Member

Any package that fails on a release build should be removed from General for that release

What do you mean "removed from General for that release"? If you are meaning that the version of the package should no longer be installable on that Julia release because PkgEval marked that it has a test failure, that is just completely out of the question.

@vtjnash
Copy link
Member Author

vtjnash commented Feb 16, 2021

I've narrowed down ProximalAlgorithms to being an issue with the llvm-alloc-opt.cpp pass (@yuyichao, if interested), with some details added above.

vtjnash added a commit that referenced this issue Feb 19, 2021
LLVM will switch to this form, so it is preferable to start that way.

Refs Matcha in #39641
vtjnash added a commit that referenced this issue Feb 19, 2021
At runtime, it is prohibited to observe these values, but we need to
make sure they are not reading through undefined pointers (and
potentially trying to GC-root the memory there)

Refs ChainRules in #39641
vtjnash added a commit that referenced this issue Feb 21, 2021
LLVM will switch to this form, so it is preferable to start that way.

Refs Matcha in #39641
vtjnash added a commit that referenced this issue Feb 21, 2021
At runtime, it is prohibited to observe these values, but we need to
make sure they are not reading through undefined pointers (and
potentially trying to GC-root the memory there)

Refs ChainRules in #39641
KristofferC pushed a commit that referenced this issue Feb 22, 2021
LLVM will switch to this form, so it is preferable to start that way.

Refs Matcha in #39641

(cherry picked from commit f879cd1)
KristofferC pushed a commit that referenced this issue Feb 22, 2021
At runtime, it is prohibited to observe these values, but we need to
make sure they are not reading through undefined pointers (and
potentially trying to GC-root the memory there)

Refs ChainRules in #39641

(cherry picked from commit fdd2633)
@vtjnash vtjnash closed this as completed Feb 24, 2021
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this issue May 4, 2021
LLVM will switch to this form, so it is preferable to start that way.

Refs Matcha in JuliaLang#39641
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this issue May 4, 2021
At runtime, it is prohibited to observe these values, but we need to
make sure they are not reading through undefined pointers (and
potentially trying to GC-root the memory there)

Refs ChainRules in JuliaLang#39641
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this issue May 9, 2021
LLVM will switch to this form, so it is preferable to start that way.

Refs Matcha in JuliaLang#39641
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this issue May 9, 2021
At runtime, it is prohibited to observe these values, but we need to
make sure they are not reading through undefined pointers (and
potentially trying to GC-root the memory there)

Refs ChainRules in JuliaLang#39641
staticfloat pushed a commit that referenced this issue Dec 23, 2022
LLVM will switch to this form, so it is preferable to start that way.

Refs Matcha in #39641

(cherry picked from commit f879cd1)
staticfloat pushed a commit that referenced this issue Dec 23, 2022
At runtime, it is prohibited to observe these values, but we need to
make sure they are not reading through undefined pointers (and
potentially trying to GC-root the memory there)

Refs ChainRules in #39641

(cherry picked from commit fdd2633)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
regression Regression in behavior compared to a previous version
Projects
None yet
Development

No branches or pull requests

5 participants