Segfault due to at-inbounds in _mapreducedim! #17328

Closed
chipkent opened this issue Jul 8, 2016 · 22 comments · Fixed by davidavdav/NamedArrays.jl#29

Comments


chipkent commented Jul 8, 2016

Hit this segfault. Figured you would want it. Need anything else to debug it?

signal (11): Segmentation fault
unknown function (ip: 0x7f78327b57af)
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
_mapreducedim! at reducedim.jl:206
jlcall__mapreducedim!_24373 at  (unknown line)
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
sumabs at reducedim.jl:277
jlcall_sumabs_24361 at  (unknown line)
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
show at /cc/home/chip/source/github/Cecropia/src/finance/OptimalPortfolio.jl:122
print at strings/io.jl:8
jlcall_print_24319 at  (unknown line)
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
print at strings/io.jl:18
println at strings/io.jl:25
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
println at strings/io.jl:28
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f783280f663)
unknown function (ip: 0x7f783280e9f9)
unknown function (ip: 0x7f7832823ebc)
unknown function (ip: 0x7f7832824b8c)
jl_load at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
include at ./boot.jl:261
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
include_from_node1 at ./loading.jl:320
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f783280f663)
unknown function (ip: 0x7f783280e9f9)
unknown function (ip: 0x7f7832823ebc)
unknown function (ip: 0x7f7832824b8c)
jl_load_file_string at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
include_string at loading.jl:288
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f783280f663)
unknown function (ip: 0x7f783280e9f9)
unknown function (ip: 0x7f7832823ebc)
jl_toplevel_eval_in at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
eval at /cc/home/chip/.julia/v0.4/Atom/src/Atom.jl:3
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
anonymous at /cc/home/chip/.julia/v0.4/Atom/src/eval.jl:39
withpath at /cc/home/chip/.julia/v0.4/Requires/src/require.jl:37
withpath at /cc/home/chip/.julia/v0.4/Atom/src/eval.jl:53
jlcall_withpath_21503 at  (unknown line)
jl_apply_generic at /cc/home/chip/software/julia-2ac304dfba/bin/../lib/julia/libjulia.so (unknown line)
anonymous at /cc/home/chip/.julia/v0.4/Atom/src/eval.jl:107
unknown function (ip: 0x7f78328158f4)
unknown function (ip: (nil))
Julia has stopped: null, SIGSEGV
Author

chipkent commented Jul 8, 2016

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6da43ec in jl_method_table_assoc_exact (mt=0x7ffdf64de270, args=0x7fffffffbce8, n=2) at gf.c:251
251 gf.c: No such file or directory.
(gdb) bt
#0  0x00007ffff6da43ec in jl_method_table_assoc_exact (mt=0x7ffdf64de270, args=0x7fffffffbce8, n=2) at gf.c:251
#1  0x00007ffff6da983c in jl_apply_generic (F=0x7ffdf5242190, args=0x7fffffffbce8, nargs=2) at gf.c:1663
#2  0x00007ffdd5530341 in ?? ()
#3  0x0000000000000000 in ?? ()

Author

chipkent commented Jul 8, 2016

Version 0.4.5

chipkent changed the title from "Segfault on 0.4.5" to "Segfault in jl_method_table_assoc_exact" Jul 8, 2016
Author

chipkent commented Jul 8, 2016

Possibly related to #13510.

yuyichao added the "needs more info" label (clarification or a reproducible example is required) Jul 8, 2016
Contributor

yuyichao commented Jul 8, 2016

Please quote the backtrace. It's impossible to tell what's happening without the code to reproduce it.

Member

kmsquire commented Jul 8, 2016

@chipkent, thanks for the report, although it isn't really actionable yet. See https://github.com/JuliaLang/julia/blob/master/CONTRIBUTING.md#how-to-file-a-bug-report for what you need to provide. Cheers!

Author

chipkent commented Jul 9, 2016

After a bit of hacking, I was able to reproduce the problem with:

using NamedArrays
a = NamedArray([1.0,2.0,3.0,4.0])
x = sumabs(a,1)

Contributor

tkelman commented Jul 9, 2016

Bisected to b77b026, so the issue is apparently not new? Unless the package is doing something fishy, this should probably give an error rather than a segfault, though.

Contributor

tkelman commented Jul 9, 2016

Ah, it's a bad @inbounds somewhere. With --check-bounds=yes you can see it's trying to index (Dict("1"=>1),)[2]
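For illustration, a minimal sketch of the access described above, assuming a current Julia 1.x release rather than the 0.4.5 from this report. With bounds checking in effect, indexing a one-element tuple at position 2 raises a BoundsError. Under @inbounds that check may be elided, and the out-of-range read becomes undefined behavior, which is how a segfault can show up instead of an error. The names `t` and `checked_second` are made up for this example.

```julia
# One-element tuple like the one in the failing access above.
t = (Dict("1" => 1),)

# Plain indexing keeps the bounds check, so position 2 is rejected.
# (Wrapping this access in @inbounds would permit skipping the check;
# that variant is deliberately not executed here.)
checked_second(x) = x[2]

err = try
    checked_second(t)
    nothing
catch e
    e
end

println(err isa BoundsError)
```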

Contributor

tkelman commented Jul 9, 2016

@timholy what length should CartesianRange( () ) be? 0 or 1?

Sponsor Member

timholy commented Jul 9, 2016

Just like

julia> a = reshape([3])
0-dimensional Array{Int64,0}:
3

julia> length(a)
1

it should be

julia> length(CartesianRange(()))
1

julia> for i in CartesianRange(())
           @show i
       end
i = CartesianIndex{0}(())

Contributor

tkelman commented Jul 9, 2016

Looks like the issue is the R[i1,IR] line here:

r = R[i1,IR]

This assumes you can add an extra trailing CartesianIndex{0}(()) index into the reduction result accumulator, which here is a size (1,) NamedArray that does not support being indexed by dimensions it does not have.
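To make the indexing pattern concrete, here is a rough sketch of a reduction along the first dimension of a vector, loosely modeled on the description above and not copied from reducedim.jl. It assumes current Julia 1.x, where CartesianRange has been replaced by CartesianIndices; the function name `reduce_first_dim_sketch` is hypothetical. The trailing dimensions of a vector are empty, so the outer loop runs once with a zero-dimensional CartesianIndex that is then passed as an extra trailing index.

```julia
# Hypothetical helper, written for illustration only.
function reduce_first_dim_sketch(f, op, R::AbstractVector, A::AbstractVector)
    # A vector has no trailing dimensions, so this loop runs exactly once
    # and IA is the zero-dimensional index CartesianIndex().
    for IA in CartesianIndices(())
        r = R[1, IA]              # trailing zero-dimensional index on the accumulator
        for i in axes(A, 1)
            r = op(r, f(A[i, IA]))
        end
        R[1, IA] = r
    end
    return R
end

R = reduce_first_dim_sketch(abs, +, zeros(1), [1.0, -2.0, 3.0, -4.0])
println(R)
```

A plain Array accepts the extra index because Base's generic indexing drops a zero-dimensional CartesianIndex. An array type that only defines getindex for its own index signatures may not, and with the bounds check suppressed the failure is no longer reported cleanly.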

Sponsor Member

timholy commented Jul 9, 2016

If NamedArray declares its getindex method like this:

@inline getindex{T,N}(A::NamedArray{T,N}, I::Vararg{Int,N}) = ...

then the generic fallbacks in base will handle this automatically. EDIT: the @inline will eliminate the splatting penalty.
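As a sketch of that suggestion in current Julia 1.x syntax (the line above uses the 0.4-era `getindex{T,N}` form), consider a hypothetical wrapper type `WrappedArray`. It is not the real NamedArray definition and ignores name-based indexing entirely. Only the method taking exactly N integer indices is defined; other index combinations, including a trailing zero-dimensional CartesianIndex, are left to Base's generic fallbacks.

```julia
# Hypothetical wrapper type, NOT the real NamedArray definition.
struct WrappedArray{T,N} <: AbstractArray{T,N}
    parent::Array{T,N}
end

Base.size(A::WrappedArray) = size(A.parent)

# Only the "exactly N integer indices" method is defined by hand.
@inline Base.getindex(A::WrappedArray{T,N}, I::Vararg{Int,N}) where {T,N} = A.parent[I...]

W = WrappedArray([1.0, -2.0, 3.0, -4.0])

# The extra trailing index is resolved by Base before reaching the method above.
x = W[2, CartesianIndex()]

# Reduction along dimension 1, the modern spelling of sumabs(a, 1).
y = sum(abs, W, dims=1)

println(x)
println(y)
```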

Sponsor Member

timholy commented Jul 9, 2016

(This was basically the whole point behind #11242.)

Sponsor Member

timholy commented Jul 9, 2016

CC @davidavdav

Contributor

tkelman commented Jul 9, 2016

It does seem like some of the @inbounds in the rather complicated Base reduction code are overly optimistic. NamedArrays isn't doing anything blatantly wrong that should be allowed to cause a segfault; it's just not up on the latest, most sophisticated way to do things on master.

Sponsor Member

timholy commented Jul 9, 2016

As long as we don't mind the performance hit, I'd be happy to get rid of @inbounds. It would also be lovely to eliminate the special cases for linear indexing, although that again will cause a hit due (at least) to #9080.

However, because of the advantage of @simd, getting rid of @inbounds will be a pretty big hit (2-10x, depending on problem and CPU).
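For readers unfamiliar with the annotations being weighed here, a small sketch of the two loop styles, assuming current Julia 1.x. The function names are made up, and no timing is claimed; any speed difference depends on the Julia version, the problem, and the CPU. The unchecked variant is restricted to Vector{Float64}, a built-in type whose indexing behavior is known, so the @inbounds promise is actually justified.

```julia
# Bounds-checked loop: safe for any indexing behavior.
function sumabs_checked(A::Vector{Float64})
    s = 0.0
    for i in eachindex(A)
        s += abs(A[i])
    end
    return s
end

# Same loop with bounds checks suppressed and SIMD reordering permitted.
# Only valid because eachindex(A) is known to be in range for a Vector.
function sumabs_unchecked(A::Vector{Float64})
    s = 0.0
    @inbounds @simd for i in eachindex(A)
        s += abs(A[i])
    end
    return s
end

A = [1.0, -2.0, 3.0, -4.0]
println(sumabs_checked(A) ≈ sumabs_unchecked(A))
```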

Contributor

tkelman commented Jul 9, 2016

We've been pretending that generality comes at less of a complexity/code duplication cost than it really does, sacrificing safety on user defined array types for performance on built-in types. When we have control over the implementation and know checks are accurate we can add unsafe performance annotations. It's not safe to do so otherwise, I think.

Sponsor Member

timholy commented Jul 9, 2016

It would be interesting to delete all @inbounds in Base (at least those that apply to abstract types) and see what nanosoldier thinks about it.

tkelman removed the "needs more info" label (clarification or a reproducible example is required) Jul 9, 2016
@davidavdav
Contributor

I am sorry NamedArray caused this, but I am a bit confused as to why this happens in NamedArray indexing and not in Array indexing. As I remember, the getindex() code in NamedArray was largely inspired by the code in array.jl. I am also not familiar with Vararg. In the example @timholy gives, the arguments are suggested to be Ints, but for NamedArray each index can have a different type, following the pattern Dict{T, Int}.

Contributor

tkelman commented Jul 11, 2016

I don't think NamedArray is entirely at fault here; it's the @inbounds that is responsible for the segfault.

Sponsor Member

timholy commented Jul 11, 2016

No apologies necessary; I'll check out NamedArray.jl and see whether I can figure out a good solution.

tkelman changed the title from "Segfault in jl_method_table_assoc_exact" to "Segfault due to at-inbounds in _mapreducedim!" Jul 12, 2016
mbauman added and then removed the "compiler:simd" label (instruction-level vectorization) Apr 24, 2018
Sponsor Member

mbauman commented Apr 24, 2018

This appears to have been resolved through PRs to packages.

@mbauman mbauman closed this as completed Apr 24, 2018