Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Couple of tuple speed-ups and fixes #20720

Merged
merged 2 commits into from
Apr 6, 2017
Merged

Conversation

pabloferz
Copy link
Contributor

Should fix #20718

base/tuple.jl Outdated
@@ -160,7 +160,8 @@ heads(t::Tuple, ts::Tuple...) = (t[1], heads(ts...)...)
tails() = ()
tails(t::Tuple, ts::Tuple...) = (tail(t), tails(ts...)...)
map(f, ::Tuple{}, ts::Tuple...) = ()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tangentially, this method seems incongruous with the spirit of map?

julia> map(+, (), (1,2))
()

julia> map(+, [], [1,2])
ERROR: DimensionMismatch("dimensions must match")
Stacktrace:
 [1] promote_shape(::Tuple{Base.OneTo{Int64}}, ::Tuple{Base.OneTo{Int64}}) at ./indices.jl:79
 [2] _array_for(::Type{Any}, ::Base.Iterators.Zip2{Array{Any,1},Array{Int64,1}}, ::Base.HasShape) at ./array.jl:406
 [3] collect(::Base.Generator{Base.Iterators.Zip2{Array{Any,1},Array{Int64,1}},Base.##3#4{Base.#+}}) at ./array.jl:416
 [4] map(::Function, ::Array{Any,1}, ::Array{Int64,1}) at ./abstractarray.jl:1888

base/tuple.jl Outdated
@@ -160,7 +160,8 @@ heads(t::Tuple, ts::Tuple...) = (t[1], heads(ts...)...)
tails() = ()
tails(t::Tuple, ts::Tuple...) = (tail(t), tails(ts...)...)
map(f, ::Tuple{}, ts::Tuple...) = ()
map(f, t1::Tuple, t2::Tuple, ts::Tuple...) = (f(heads(t1, t2, ts...)...), map(f, tails(t1, t2, ts...)...)...)
map(f, t1::Tuple, t2::Tuple, ts::Tuple...) = (@_inline_meta;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this definition not result in arbitrarily deep inlining in the same manner that the two argument form does without this additional definition that limits inlining?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I looked at this backwards, is heads and tails what should be fine to inline. I'll try another solution.

@pabloferz
Copy link
Contributor Author

pabloferz commented Feb 21, 2017

Updated to avoid code blowup for large tuples.

This should also fix #20723 now. Unlike map(+, [], [1,2]), map(+, (), (1,2)) would throw ERROR: BoundsError: attempt to access (). Should I make it throw DimensionMismatch("dimensions must match") instead?

NOTE: Will write tests for the later, once we decide what to throw.

@pabloferz pabloferz changed the title Couple of tuple speed-ups Couple of tuple speed-ups and fixes Feb 21, 2017
@@ -132,7 +132,7 @@ All16{T,N} = Tuple{T,T,T,T,T,T,T,T,
T,T,T,T,T,T,T,T,Vararg{T,N}}
function map(f, t::Any16)
n = length(t)
A = Array{Any}(n)
A = Array{Any,1}(n)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps Vector?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't use Vector here because of bootstrapping.

@@ -148,19 +148,22 @@ function map(f, t::Tuple, s::Tuple)
end
function map(f, t::Any16, s::Any16)
n = length(t)
A = Array{Any}(n)
A = Array{Any,1}(n)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise here, perhaps Vector?

base/tuple.jl Outdated
map(f, t1::Tuple, t2::Tuple, ts::Tuple...) = (f(heads(t1, t2, ts...)...), map(f, tails(t1, t2, ts...)...)...)
tails(t::Tuple, ts::Tuple...) = (@_inline_meta; (tail(t), tails(ts...)...))
map(f, ::Tuple{}...) = ()
map(f, t::Tuple, ts::Tuple...) = (@_inline_meta;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these definitions exceed the line-length recommendation? If not, perhaps condense these definitions to single lines?

Copy link
Contributor Author

@pabloferz pabloferz Feb 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is 92 characters, right? (I don't recall exactly) If so, then indeed this exceeds the line-length recommendation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I see the first definition exceeds the limit by a few characters though the second does not; I assume you split the second definition for consistency, so cheers :).

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should be written as long-form methods if they need to split across multiple lines

base/tuple.jl Outdated
@@ -20,7 +20,7 @@ endof(t::Tuple) = length(t)
size(t::Tuple, d) = d==1 ? length(t) : throw(ArgumentError("invalid tuple dimension $d"))
getindex(t::Tuple, i::Int) = getfield(t, i)
getindex(t::Tuple, i::Real) = getfield(t, convert(Int, i))
getindex(t::Tuple, r::AbstractArray{<:Any,1}) = tuple([t[ri] for ri in r]...)
getindex(t::Tuple, r::AbstractArray{<:Any,1}) = (n = length(eachindex(r)); ntuple(i->t[r[i]], n))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain this change's purpose? Could indexing r with indices 1:n (rather than iterating over r's elements, or indexing r more genernically) be an issue? If not, why length(eachindex(r)) rather than simply length(r)? Does using ntuple here not potentially hurt inference?

Copy link
Contributor Author

@pabloferz pabloferz Feb 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could indexing r with indices 1:n be an issue?

I feel like we are slowly moving to support non traditional indexing, to handle things like, e.g., OffsetArrays. That's why I wrote it like this.

Does using ntuple here not potentially hurt inference?

This is already non-inferable so it doesn't hurt changing it to make it faster.

Copy link
Member

@Sacha0 Sacha0 Feb 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we are slowly moving to support non traditional indexing, to handle things like, e.g., OffsetArrays. That's why I wrote it like this.

Exactly :). The proposed definition indexes r from 1:n, as i iterates over 1:n in ntuple. So if indexing r from 1:n is not correct, the proposed definition will not be correct.

This is already non-inferable so it doesn't hurt changing it to make it faster.

Consider

julia> foo(t::Tuple, r::AbstractArray{<:Any,1}) = tuple([t[ri] for ri in r]...)
foo (generic function with 1 method)

julia> bar(t::Tuple, r::AbstractArray{<:Any,1}) = (n = length(eachindex(r)); ntuple(i->t[r[i]], n))
bar (generic function with 1 method)

julia> @code_warntype foo((1,2,3), [1])
Variables:
  #self#::#foo
  t::Tuple{Int64,Int64,Int64}
  r::Array{Int64,1}
  #1::##1#2{Tuple{Int64,Int64,Int64}}

Body:
  begin
      #1::##1#2{Tuple{Int64,Int64,Int64}} = $(Expr(:new, :($(QuoteNode(##1#2{Tuple{Int64,Int64,Int64}}))), :(t)))
      SSAValue(0) = #1::##1#2{Tuple{Int64,Int64,Int64}}
      SSAValue(1) = $(Expr(:new, :($(QuoteNode(Base.Generator{Array{Int64,1},##1#2{Tuple{Int64,Int64,Int64}}}))), SSAValue(0), :(r)))
      SSAValue(2) = $(Expr(:invoke, MethodInstance for collect(::Base.Generator{Array{Int64,1},##1#2{Tuple{Int64,Int64,Int64}}}), :(Base.collect), SSAValue(1)))
      return (Core._apply)(Main.tuple, SSAValue(2))::Tuple{Vararg{Int64,N} where N}
  end::julia> foo(t::Tuple, r::AbstractArray{<:Any,1}) = tuple([t[ri] for ri in r]...)
foo (generic function with 1 method)

julia> bar(t::Tuple, r::AbstractArray{<:Any,1}) = (n = length(eachindex(r)); ntuple(i->t[r[i]], n))
bar (generic function with 1 method)

julia> @code_warntype foo((1,2,3), [1])
Variables:
  #self#::#foo
  t::Tuple{Int64,Int64,Int64}
  r::Array{Int64,1}
  #1::##1#2{Tuple{Int64,Int64,Int64}}

Body:
  begin
      #1::##1#2{Tuple{Int64,Int64,Int64}} = $(Expr(:new, :($(QuoteNode(##1#2{Tuple{Int64,Int64,Int64}}))), :(t)))
      SSAValue(0) = #1::##1#2{Tuple{Int64,Int64,Int64}}
      SSAValue(1) = $(Expr(:new, :($(QuoteNode(Base.Generator{Array{Int64,1},##1#2{Tuple{Int64,Int64,Int64}}}))), SSAValue(0), :(r)))
      SSAValue(2) = $(Expr(:invoke, MethodInstance for collect(::Base.Generator{Array{Int64,1},##1#2{Tuple{Int64,Int64,Int64}}}), :(Base.collect), SSAValue(1)))
      return (Core._apply)(Main.tuple, SSAValue(2))::Tuple{Vararg{Int64,N} where N}
  end::Tuple{Vararg{Int64,N} where N}

julia> @code_warntype bar((1,2,3), [1])
Variables:
  #self#::#bar
  t::Tuple{Int64,Int64,Int64}
  r::Array{Int64,1}
  n::Int64
  #3::##3#4{Tuple{Int64,Int64,Int64},Array{Int64,1}}

Body:
  begin
      $(Expr(:inbounds, false))
      # meta: location abstractarray.jl eachindex 744
      # meta: location abstractarray.jl indices1 63
      # meta: location abstractarray.jl indices 56
      SSAValue(3) = (Base.arraysize)(r::Array{Int64,1}, 1)::Int64
      # meta: location tuple.jl map 124
      SSAValue(2) = SSAValue(3)
      # meta: pop location
      # meta: pop location
      # meta: pop location
      # meta: pop location
      $(Expr(:inbounds, :pop))
      n::Int64 = (Core.typeassert)((Base.select_value)((Base.slt_int)(SSAValue(2), 0)::Bool, 0, SSAValue(2))::Int64, Int64)::Int64
      #3::##3#4{Tuple{Int64,Int64,Int64},Array{Int64,1}} = $(Expr(:new, :($(QuoteNode(##3#4{Tuple{Int64,Int64,Int64},Array{Int64,1}}))), :(t), :(r)))
      SSAValue(0) = #3::##3#4{Tuple{Int64,Int64,Int64},Array{Int64,1}}
      return $(Expr(:invoke, MethodInstance for ntuple(::##3#4{Tuple{Int64,Int64,Int64},Array{Int64,1}}, ::Int64), :(Main.ntuple), SSAValue(0), :(n)))
  end::Tuple

julia>

julia> @code_warntype bar((1,2,3), [1])
Variables:
  #self#::#bar
  t::Tuple{Int64,Int64,Int64}
  r::Array{Int64,1}
  n::Int64
  #3::##3#4{Tuple{Int64,Int64,Int64},Array{Int64,1}}

Body:
  begin
      $(Expr(:inbounds, false))
      # meta: location abstractarray.jl eachindex 744
      # meta: location abstractarray.jl indices1 63
      # meta: location abstractarray.jl indices 56
      SSAValue(3) = (Base.arraysize)(r::Array{Int64,1}, 1)::Int64
      # meta: location tuple.jl map 124
      SSAValue(2) = SSAValue(3)
      # meta: pop location
      # meta: pop location
      # meta: pop location
      # meta: pop location
      $(Expr(:inbounds, :pop))
      n::Int64 = (Core.typeassert)((Base.select_value)((Base.slt_int)(SSAValue(2), 0)::Bool, 0, SSAValue(2))::Int64, Int64)::Int64
      #3::##3#4{Tuple{Int64,Int64,Int64},Array{Int64,1}} = $(Expr(:new, :($(QuoteNode(##3#4{Tuple{Int64,Int64,Int64},Array{Int64,1}}))), :(t), :(r)))
      SSAValue(0) = #3::##3#4{Tuple{Int64,Int64,Int64},Array{Int64,1}}
      return $(Expr(:invoke, MethodInstance for ntuple(::##3#4{Tuple{Int64,Int64,Int64},Array{Int64,1}}, ::Int64), :(Main.ntuple), SSAValue(0), :(n)))
  end::Tuple

The existing definition yields Tuple{Vararg{Int64,N} where N} whereas the proposed definition yields only Tuple; some information is lost?

Copy link
Contributor Author

@pabloferz pabloferz Feb 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow I forgot that eachindex does not necessarily return an iterable that can be indexed with UnitRanges. I updated the code with an alternative.

Re. inferability: I see that we'd loose some information, but I'd argue that when working with tuples what we most commonly want to infer is the length, so loosing the eltype here is not that bad, if it buy us some speed. Though, I could be convinced otherwise.

@pabloferz
Copy link
Contributor Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@felixrehren
Copy link
Contributor

Those sqrtm tests seem very noisy. I can't think how these changes to tuples would have affected them. cc @jrevels

base/tuple.jl Outdated
function getindex(t::Tuple, r::AbstractArray{<:Any,1})
s = Ref(start(r))
ntuple(j -> ((i, s[]) = next(r, s[]); t[i]), length(r))
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems orthogonal to the other changes in this pull request. Perhaps split this change into a separate pull request for additional discussion and to attract additional review eyes? (I remain concerned about losing eltype information with this change, and see no reason that that information is not important in general.)

Copy link
Contributor Author

@pabloferz pabloferz Feb 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the changes to ntuple, the eltype for this would be inferred. But I'll probably left this aside anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting :). Out of curiosity, which of the simultaneous ntuple changes make the new getindex version's eltype inferable? (The generator splat to array splat change?)

If this version does not lose eltype inferability, then cheers form this end. (Though having another pair of eyes on these changes would still be worthwhile.) Best!

Copy link
Contributor Author

@pabloferz pabloferz Feb 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out that inference works different when you directly return something like c1 ? r1 : c2 ? r2 : c3 ? r3 ... to when you assign that to a variable and then return the variable. So basically what improved things was assigning the tuple before returning it.

I won't leave this version as is as it messes inference when using this function inside macros in some circumstances (because of the closure). But I have a slightly different version that doesn't have that limitation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! Inference's sensitivity to such things is somewhat disconcerting. Best!

base/tuple.jl Outdated
return t
end

_ntuple(f::Function, n::Integer) = (@_noinline_meta; ([f(i) for i = 1:n]...))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why switch from splatting a generator to splatting an array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the time being, is faster than using the generator.

n == 7 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7)) :
n == 8 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8)) :
n == 9 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8), f(9)) :
n == 10 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8), f(9), f(10)) :
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change impact behavior for 10 < n < 16? (Out of curiosity, what motivates this change?)

Copy link
Contributor Author

@pabloferz pabloferz Feb 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is faster in some cases or at worst has equal performance than the existing implementation.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that LLVM doesn't reduce this to a calculated-jump. If you care enough to try, would a binary tree implementation be better? (It's less nice from a code-simplicity standpoint, so don't feel you have to do this.)

@pabloferz
Copy link
Contributor Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: failed process: Process(`sudo cset shield -e su nanosoldier -- -c ./benchscript.sh`, ProcessExited(1)) [1]

Logs and partial data can be found here
cc @jrevels

@pabloferz
Copy link
Contributor Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: failed process: Process(`sudo cset shield -e su nanosoldier -- -c ./benchscript.sh`, ProcessExited(1)) [1]

Logs and partial data can be found here
cc @jrevels

@pabloferz
Copy link
Contributor Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@pabloferz
Copy link
Contributor Author

Once more to see if anything is noise

@nanosoldier runbenchmarks(ALL, vs = ":master")

@pabloferz pabloferz requested a review from timholy March 17, 2017 00:40
@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

Copy link
Sponsor Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

base/range.jl Outdated
@@ -151,6 +151,10 @@ unitrange_last{T}(start::T, stop::T) =
ifelse(stop >= start, convert(T,start+floor(stop-start)),
convert(T,start-oneunit(stop-start)))

if isdefined(Main, :Base)
getindex(t::Tuple, r::UnitRange) = (o = r.start - 1; ntuple(n -> t[o + n], length(r)))
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe AbstractUnitRange? And use first(r) rather than r.start?

n == 7 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7)) :
n == 8 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8)) :
n == 9 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8), f(9)) :
n == 10 ? (f(1), f(2), f(3), f(4), f(5), f(6), f(7), f(8), f(9), f(10)) :
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that LLVM doesn't reduce this to a calculated-jump. If you care enough to try, would a binary tree implementation be better? (It's less nice from a code-simplicity standpoint, so don't feel you have to do this.)

map(f, t1::Tuple, t2::Tuple, ts::Tuple...) = (f(heads(t1, t2, ts...)...), map(f, tails(t1, t2, ts...)...)...)
heads(ts::Tuple...) = map(t -> t[1], ts)
tails(ts::Tuple...) = map(tail, ts)
map(f, ::Tuple{}...) = ()
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change because of this:

julia> map(+, (), (3,), (4,))
()

Consequently, this PR will have to wait until 1.0-dev.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should that be a dimension mismatch error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this was a bug. X-ref: #20723

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like that behavior (I like this better), but I'd be surprised if there isn't code out there that counts on the current behavior. So even though I'd call this an improvement, I think it's dangerous to introduce a behavior change at this stage of the release process unless there's reason (1) to be certain it's a bug, or (2) be almost certain no one had been using this. But don't take that as official policy, it's just my viewpoint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we had an alpha tag, I plan on running a pkgeval comparison before tagging a beta, so we could find out

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these are flaky tolerances or timeouts, but AxisArrays failure looks related https://gist.github.com/2290f8d80dea51c9dc4aa3b4de59daf2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@pabloferz pabloferz Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we merge this then, or should we wait until a new version of AxisArrays is tagged?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timholy what do you think?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go for it. I just tagged a new release.

@pabloferz
Copy link
Contributor Author

Interesting that LLVM doesn't reduce this to a calculated-jump.

It does if I remove the assignment and return directly, but then inference isn't able to at least figure out the eltype (it infers Tuple instead NTuple{N,T} where N for the appropriate T).

If you care enough to try, would a binary tree implementation be better?

For now I'd like to keep the code simplicity and leave other approaches for another occasion.

Copy link
Member

@Sacha0 Sacha0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sticking with this @pabloferz! :)

@pabloferz pabloferz merged commit d5d7280 into JuliaLang:master Apr 6, 2017
@pabloferz pabloferz deleted the pz/maptups branch April 6, 2017 20:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:collections Data structures holding multiple items, e.g. sets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Too many allocations when binary dot operators for tuples are used multiple times in a single line
8 participants