Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Julia nightly throwing at jl_datatype_isinlinealloc #41503

Closed
mateuszbaran opened this issue Jul 7, 2021 · 16 comments · Fixed by #41516, #41935 or #42035
Closed

Julia nightly throwing at jl_datatype_isinlinealloc #41503

mateuszbaran opened this issue Jul 7, 2021 · 16 comments · Fixed by #41516, #41935 or #42035
Labels
kind:regression Regression in behavior compared to a previous version
Milestone

Comments

@mateuszbaran
Copy link
Contributor

A recent version of Julia nightly started throwing errors on tests in Manifolds.jl, see for example these CI runs: https://github.com/JuliaManifolds/Manifolds.jl/pull/381/checks?check_run_id=3010181089 . On Windows, for example, a part of the error reads:

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x1f9b127 -- jl_datatype_isinlinealloc at /cygdrive/c/buildbot/worker/package_win64/build/src\datatype.c:253 [inlined]
jl_datatype_isinlinealloc at /cygdrive/c/buildbot/worker/package_win64/build/src\datatype.c:250 [inlined]
union_isinlinable at /cygdrive/c/buildbot/worker/package_win64/build/src\datatype.c:277 [inlined]
union_isinlinable at /cygdrive/c/buildbot/worker/package_win64/build/src\datatype.c:266
in expression starting at D:\a\Manifolds.jl\Manifolds.jl\test\centered_matrices.jl:3

and on Ubuntu:

signal (11): Segmentation fault
in expression starting at /home/runner/work/Manifolds.jl/Manifolds.jl/test/centered_matrices.jl:3
jl_datatype_isinlinealloc at /buildworker/worker/package_linux64/build/src/datatype.c:253 [inlined]
union_isinlinable at /buildworker/worker/package_linux64/build/src/datatype.c:277 [inlined]
jl_islayout_inline at /buildworker/worker/package_linux64/build/src/datatype.c:300
jl_compute_field_offsets at /buildworker/worker/package_linux64/build/src/datatype.c:446
inst_datatype_inner at /buildworker/worker/package_linux64/build/src/jltypes.c:1520

I'll try making an MWE.

@JeffBezanson JeffBezanson added the kind:regression Regression in behavior compared to a previous version label Jul 7, 2021
@JeffBezanson JeffBezanson added this to the 1.7 milestone Jul 7, 2021
@JeffBezanson
Copy link
Sponsor Member

This has also made it into 1.7.

@mateuszbaran
Copy link
Contributor Author

So far I've managed to narrow it down to ReverseDiff:

using ReverseDiff
exp_f(t) = 0
ReverseDiff.gradient(exp_f, [0.1])

@maleadt
Copy link
Member

maleadt commented Jul 8, 2021

Further reduced:

struct a{d}
    e::d
end
struct f{j,k} <: AbstractArray{a{f{Any,k}},Any}
    l::j
end
f{Any,Any}

@JeffBezanson
Copy link
Sponsor Member

JeffBezanson commented Jul 8, 2021

This is 7ffc10b (according to bisect) @vtjnash

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jul 8, 2021

I think that is #34223 actually. The presence of the type as part of its own supertype causes the order in which we need to compute the layout of the types to diverge from the order in which we need to allocate the (super)types, which makes this complicated. We might end up needing to prohibit a type like this from being inlineable ever to avoid that ordering problem. I'll see if I can put that in a PR.

vtjnash added a commit that referenced this issue Jul 8, 2021
Similar to #35275, but through a supertype instead of through the parameters
Fixes #41503
Fixes #41349
@Sacha0
Copy link
Member

Sacha0 commented Jul 9, 2021

We hit what appears to be a very similar issue, but with a slightly different stacktrace:

signal (11): Segmentation fault
in expression starting at [...]
jl_datatype_isinlinealloc at /build/source/src/datatype.c:253 [inlined]
jl_datatype_isinlinealloc at /build/source/src/datatype.c:250 [inlined]
union_isinlinable at /build/source/src/datatype.c:277 [inlined]
union_isinlinable at /build/source/src/datatype.c:266 [inlined]
jl_islayout_inline at /build/source/src/datatype.c:300
jl_compute_field_offsets at /build/source/src/datatype.c:446
inst_datatype_inner at /build/source/src/jltypes.c:1520
jl_apply_tuple_type_v_ at /build/source/src/jltypes.c:1538 [inlined]
jl_apply_tuple_type_v at /build/source/src/jltypes.c:1548
jl_apply at /build/source/src/julia.h:1787 [inlined]
do_apply at /build/source/src/builtins.c:713
argtypes_to_type at ./compiler/typeutils.jl:53 [inlined]
abstract_call_known at ./compiler/abstractinterpretation.jl:1261
abstract_call at ./compiler/abstractinterpretation.jl:1316
abstract_call at ./compiler/abstractinterpretation.jl:1301
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1455
typeinf_local at ./compiler/abstractinterpretation.jl:1842
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1932
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:822 [inlined]
abstract_call_method at ./compiler/abstractinterpretation.jl:473
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:160
abstract_call_known at ./compiler/abstractinterpretation.jl:1262
abstract_call at ./compiler/abstractinterpretation.jl:1316
abstract_call at ./compiler/abstractinterpretation.jl:1301
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1455
...

Reverting 7ffc10b resolves the issue, but #41516 doesn't seem to do the trick.

@Sacha0
Copy link
Member

Sacha0 commented Jul 13, 2021

(Reopening to continue tracking the immediately preceding form of the issue.)

@Sacha0 Sacha0 reopened this Jul 13, 2021
@JeffBezanson
Copy link
Sponsor Member

According to #41516 (comment), can be triggered by testing DFTK.

@Sacha0
Copy link
Member

Sacha0 commented Jul 20, 2021

Confirmed, Pkg.test("DFTK") reproduced the issue just now on beta3 patched with #41516 and #41091. Leading section of the stacktrace:

signal (11): Segmentation fault: 11
in expression starting at /Users/sacha/pkg/temp-depot-seven/packages/DFTK/f3xkb/test/silicon_lda.jl:44
jl_datatype_isinlinealloc at /Users/sacha/pkg/julia7/src/datatype.c:253 [inlined]
union_isinlinable at /Users/sacha/pkg/julia7/src/datatype.c:277
jl_islayout_inline at /Users/sacha/pkg/julia7/src/datatype.c:300 [inlined]
jl_stored_inline at /Users/sacha/pkg/julia7/src/datatype.c:307
allocatedinline at ./array.jl:157 [inlined]
aligned_sizeof at ./reflection.jl:344
jfptr_aligned_sizeof_44554 at /Users/sacha/pkg/julia7/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/sacha/pkg/julia7/src/gf.c:0 [inlined]
jl_apply_generic at /Users/sacha/pkg/julia7/src/gf.c:2430
jl_apply at /Users/sacha/pkg/julia7/src/./julia.h:1787 [inlined]
do_apply at /Users/sacha/pkg/julia7/src/builtins.c:713
jl_f__apply_pure at /Users/sacha/pkg/julia7/src/builtins.c:739
pure_eval_call at ./compiler/abstractinterpretation.jl:962
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:115
abstract_call_known at ./compiler/abstractinterpretation.jl:1262
abstract_call at ./compiler/abstractinterpretation.jl:1316
abstract_call at ./compiler/abstractinterpretation.jl:1301
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1455
typeinf_local at ./compiler/abstractinterpretation.jl:1842
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1932
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
[...]

@Sacha0
Copy link
Member

Sacha0 commented Jul 20, 2021

In working towards an MRE or trace that I can hand off for the variant of this issue that our test suite hits, I managed to reproduce our variant locally and found that the stracktrace looks more similar to those posted above than seemed the case in our CI logs. In case it's useful while I continue working at this:

signal (11): Segmentation fault: 11
in expression starting at [...]
jl_datatype_isinlinealloc at /Users/sacha/pkg/julia7/src/datatype.c:253 [inlined]
union_isinlinable at /Users/sacha/pkg/julia7/src/datatype.c:277
jl_islayout_inline at /Users/sacha/pkg/julia7/src/datatype.c:300 [inlined]
jl_compute_field_offsets at /Users/sacha/pkg/julia7/src/datatype.c:446
inst_datatype_inner at /Users/sacha/pkg/julia7/src/jltypes.c:1520
jl_apply_tuple_type_v_ at /Users/sacha/pkg/julia7/src/jltypes.c:1538 [inlined]
jl_apply_tuple_type_v at /Users/sacha/pkg/julia7/src/jltypes.c:1548
_jl_invoke at /Users/sacha/pkg/julia7/src/gf.c:0 [inlined]
jl_apply_generic at /Users/sacha/pkg/julia7/src/gf.c:2430
jl_apply at /Users/sacha/pkg/julia7/src/./julia.h:1787 [inlined]
do_apply at /Users/sacha/pkg/julia7/src/builtins.c:713
argtypes_to_type at ./compiler/typeutils.jl:53 [inlined]
abstract_call_known at ./compiler/abstractinterpretation.jl:1261
abstract_call at ./compiler/abstractinterpretation.jl:1316
abstract_call at ./compiler/abstractinterpretation.jl:1301
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1455
typeinf_local at ./compiler/abstractinterpretation.jl:1842
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1932
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
[...]

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jul 21, 2021

Okay thanks, here's another MWE for this issue (we fail to realize that this object is a concrete data type, should be in the cache, and should be given a layout):

julia> Tuple{Tuple{Tuple{K, UInt128} where K<:Tuple{Int64}, Int64}}

signal (11): Segmentation fault
in expression starting at REPL[3]:1
jl_datatype_isinlinealloc at /data/vtjnash/julia1/src/datatype.c:253 [inlined]
jl_datatype_isinlinealloc at /data/vtjnash/julia1/src/datatype.c:250 [inlined]
union_isinlinable at /data/vtjnash/julia1/src/datatype.c:277 [inlined]
union_isinlinable at /data/vtjnash/julia1/src/datatype.c:266 [inlined]
jl_islayout_inline at /data/vtjnash/julia1/src/datatype.c:300
jl_compute_field_offsets at /data/vtjnash/julia1/src/datatype.c:446
inst_datatype_inner at /data/vtjnash/julia1/src/jltypes.c:1520
jl_apply_tuple_type_v_ at /data/vtjnash/julia1/src/jltypes.c:1538

Note that previously we were constructing a bad layout here (this is v1.6):

julia> S = Tuple{Tuple{Tuple{K, UInt128} where K<:Tuple{Int64}, Int64}}
Tuple{Tuple{Tuple{Tuple{Int64}, UInt128}, Int64}}

julia> T = Tuple{Tuple{Tuple{Tuple{Int64}, UInt128}, Int64}}
Tuple{Tuple{Tuple{Tuple{Int64}, UInt128}, Int64}}

julia> S === T
true

julia> S == T
true

julia> S
Tuple{Tuple{Tuple{Tuple{Int64}, UInt128}, Int64}}

julia> T
Tuple{Tuple{Tuple{Tuple{Int64}, UInt128}, Int64}}

julia> isbitstype(T)
true

julia> isbitstype(S)
false

@fingolfin
Copy link
Contributor

We are also hitting this in Oscar.jl, see oscar-system/Oscar.jl#587

vtjnash added a commit that referenced this issue Aug 19, 2021
Need to compute `cacheable` after normalization, since the purpose of
the normalization was to turn these into normal cacheable objects,
when applicable.

Brokenness exposed by #36211
Fixes #41503
vtjnash added a commit that referenced this issue Aug 23, 2021
Need to compute `cacheable` after normalization, since the purpose of
the normalization was to turn these into normal cacheable objects,
when applicable.

Brokenness exposed by #36211
Fixes #41503
@antoine-levitt
Copy link
Contributor

Unfortunately the fix doesn't fix the issue for us. See a recent CI run here : https://github.com/JuliaMolSim/DFTK.jl/pull/522/checks?check_run_id=3421567794#step:7:458 and can be reproduced just by testing the DFTK package (tested on a nightly from today)

@Sacha0
Copy link
Member

Sacha0 commented Aug 25, 2021

(Reopening in accord with Antoine's comment. Testing #41935 on our codebase now as well, will report back.)

@Sacha0 Sacha0 reopened this Aug 25, 2021
@martinholters
Copy link
Member

MWE:

struct Model{T}
    atoms::Vector{Pair{Any, Vector{T}}}
end

compute_forces() = Model[][1].atoms[1][2]
precompile(compute_forces, ())

Model{Float64}([])

KristofferC pushed a commit that referenced this issue Aug 27, 2021
Need to compute `cacheable` after normalization, since the purpose of
the normalization was to turn these into normal cacheable objects,
when applicable.

Brokenness exposed by #36211
Fixes #41503

(cherry picked from commit 292f1a9)
vtjnash added a commit that referenced this issue Aug 27, 2021
vchuravy pushed a commit that referenced this issue Aug 28, 2021
@antoine-levitt
Copy link
Contributor

Can confirm fixed. Thanks!

KristofferC pushed a commit that referenced this issue Aug 31, 2021
Fixes #41503

(cherry picked from commit 10755f7)
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Feb 22, 2022
Need to compute `cacheable` after normalization, since the purpose of
the normalization was to turn these into normal cacheable objects,
when applicable.

Brokenness exposed by JuliaLang#36211
Fixes JuliaLang#41503
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Feb 22, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Mar 8, 2022
Need to compute `cacheable` after normalization, since the purpose of
the normalization was to turn these into normal cacheable objects,
when applicable.

Brokenness exposed by JuliaLang#36211
Fixes JuliaLang#41503
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Mar 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:regression Regression in behavior compared to a previous version
Projects
None yet
8 participants