-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate vectorized round methods in favor of compact broadcast syntax #18590
Conversation
There are so many of these functions for various matrix types. I wonder if |
I'm not following you, @TotalVerb. |
@StefanKarpinski These PRs are subtly changing the semantics of julia> broadcast(round, Bidiagonal([1, 0, 0], [1, 0], true))
3×3 Array{Int64,2}:
1 1 0
0 0 0
0 0 0 and after it, we have julia> broadcast(round, Bidiagonal([1, 0, 0], [1, 0], true))
3×3 Bidiagonal{Int64}:
1 1 ⋅
⋅ 0 0
⋅ ⋅ 0 But if a user defines My question is whether there could possibly be a way for |
Example of change of behaviour: in v0.5, |
Thanks for clarifying, @TotalVerb. |
I've been thinking along the same lines while writing these PRs. To some degree Apart from the existing |
Check if |
@StefanKarpinski I like that idea, though it would have to be |
Something along those lines :). But whether calling |
Side effects should just be prohibited within |
I agree. Just noting that (IIRC) that point has been contentious. But perhaps that debate has been settled? If so, great! Makes life relatively straightforward. Best! |
I don't think it has. https://github.com/JuliaLang/julia/issues/7010 |
Ref. #10536 |
Ok, here is a radical idea (more intended as food for thought than actual realization):
iszeropreserving(::Function) = NotZeroPreserving()
iszeropreserving(::typeof(round)) = ZeroPreserving()
# ...
|
|
This is the trait grouping function: julia> @generated function iszeropreserving{T}(fn, dt::Type{T})
f = fn.instance
val = f(zero(T)) == zero(f(zero(T)))
:($val)
end
iszeropreserving (generic function with 1 method)
julia> iszeropreserving(sin, Int)
true
julia> iszeropreserving(cos, Int)
false But it is a generated function. Could this be done with an |
Obviating the need for a trait would be great. Discussion in #10536 converged on acceptance of the (weaker) condition (than purity) that return values of |
Actually, side effects are not in an undefined order with loop fusion, because fusion is guaranteed to occur. One of the advantages of making this a syntactic guarantee was precisely that we don't have to worry about proving that the result is equivalent to the result without fusion (as would be required if this were a mere optimization). |
@stevengj My terminology was inaccurate. But the point stands that the order is not simply a linear order. I really don't think side effects in |
However, in the special case of operating on a sparse array, calling |
Some thoughts on generic structure-preserving broadcast methods for sparse and structured matrices. The first section explores the simple case of broadcasting unary operations. The second section explores broadcasting binary operations (as a model of broadcasting operations with two or more arguments). The third section explores one approach to some of the challenges mentioned in the first and second sections (inference of zero preserving behavior). Below, Apologies this is so long. Lots of material. Broadcasting unary operations Consider function magicbroadcast(f, D::Diagonal)
if f(zero(eltype(D))) == zero(f(zero(eltype(D))))
return Diagonal(f.(D.diag))
else
return f.(convert(Matrix, foo))
end
end is not type stable due to the runtime branch (for general type ZeroPreserving{T} end
iszeropreserving{T<:Number}(::typeof(sin), ::Type{T}) = ZeroPreserving{true}()
iszeropreserving{T<:Number}(::typeof(cos), ::Type{T}) = ZeroPreserving{false}()
magicbroadcast(f, D::Diagonal) = _magicbroadcast(f, iszeropreserving(f, eltype(D)), D::Diagonal)
_magicbroadcast(f, ::ZeroPreserving{true}, D::Diagonal) = Diagonal(f.(D.diag))
_magicbroadcast(f, ::ZeroPreserving{false}, D::Diagonal) = f.(convert(Matrix, D)) has the virtue of being type stable, but the disadvantage of requiring the @generated function iszeropreserving{dT}(fT, ::Type{dT})
f = fT.instance
if f(zero(dT)) == zero(f(zero(dT)))
return :(ZeroPreserving{true}())
else
return :(ZeroPreserving{false}())
end
end
Broadcasting binary operations Consider (1) if either Also in contrast to the unary case where sampling So for operations accepting two or more arguments, it seems zero preserving behavior must either be declared or inferred from other declarations (if one wants classification beyond merely classes 4/5). Inference As with type, zero-preservation class might be inferrable in many cases (given declaration of: (1) the zero-preservation class of a set of common 'atomic' functions; and (2) rules for computing the zero-preservation class of compositions of those functions). Below is a demo (which for simplicity ignores argument types). I know next to nothing about type inference, so I imagine the following snippet may be horrific and/or amusing to those who know more; any direction would be much appreciated. What the demo code (below) enables julia> # unary zero preserving
julia> magicbroadcast(x -> 4*abs(sin(x)*cos(x)) + (1/2)*tan(x)*exp(x), Diagonal(rand(4)))
4×4 Diagonal{Float64}:
0.163829 ⋅ ⋅ ⋅
⋅ 2.69932 ⋅ ⋅
⋅ ⋅ 1.48974 ⋅
⋅ ⋅ ⋅ 1.25917
julia> # unary not zero preserving
julia> magicbroadcast(x -> 4*abs(sin(x)*cos(x)) - log(x)*exp(x), Diagonal(rand(4)))
4×4 Array{Float64,2}:
2.51054 Inf Inf Inf
Inf 2.65709 Inf Inf
Inf Inf 2.73759 Inf
Inf Inf Inf 2.77127
julia> # binary zero preserving in the first argument, first arg structured
julia> magicbroadcast((x,y) -> sin(x)*cos(y), Diagonal(rand(4)), rand(4,4))
4×4 Diagonal{Float64}:
0.37013 ⋅ ⋅ ⋅
⋅ 0.367892 ⋅ ⋅
⋅ ⋅ 0.258526 ⋅
⋅ ⋅ ⋅ 0.506654
julia> # binary zero preserving in the first argument, first arg dense
julia> magicbroadcast((x,y) -> sin(x)*cos(y), rand(4,4), Diagonal(rand(4)))
4×4 Array{Float64,2}:
0.64847 0.470474 0.380607 0.273377
0.42008 0.549254 0.109858 0.449444
0.322848 0.18293 0.5724 0.649076
0.361723 0.561342 0.170636 0.227352
julia> # binary zero-pair preserving, both args structured
julia> magicbroadcast((x,y) -> sin(x) + sin(y), Diagonal(rand(4)), Diagonal(rand(4)))
4×4 Diagonal{Float64}:
1.05201 ⋅ ⋅ ⋅
⋅ 0.905931 ⋅ ⋅
⋅ ⋅ 0.990544 ⋅
⋅ ⋅ ⋅ 1.24777
julia> # binary zero-pair preserving, one arg dense
julia> magicbroadcast((x,y) -> sin(x) + sin(y), Diagonal(rand(4)), rand(4,4))
4×4 Array{Float64,2}:
0.880026 0.432476 0.822629 0.653721
0.309337 1.01615 0.523797 0.720296
0.672002 0.546849 1.27934 0.173061
0.494756 0.791128 0.0729191 0.714148 The demo code import Base: *, +, -, abs, exp, log, sin, cos, tan
type ZeroPreserving{T} end
# ZeroPreserving{true} - zero preserving in all args (unary zero preserving, or binary zero preserving in both args)
# ZeroPreserving{false} - zero nonpreserving in all args (unary not zero preserving, or binary not zero preserving)
# ZeroPreserving({true,true}) - binary zero-pair preserving
# ZeroPreserving{(true,false)} - binary zero preserving in first argument
# ZeroPreserving{(false,true)} - binary zero preserving in second argument
# Some short names
const ZP = ZeroPreserving
const AT = true
const AF = false
const TT = (true, true)
const TF = (true, false)
const FT = (false, true)
# unary magicbroadcast
magicbroadcast(f, D::Diagonal) = _magicbroadcast(f, iszeropreserving(f, (eltype(D),)), D::Diagonal)
_magicbroadcast(f, ::ZP{AT}, D::Diagonal) = broadcastnz(f, D)
_magicbroadcast(f, ::ZP{AF}, D::Diagonal) = broadcastall(f, D)
broadcastnz(f, D) = Diagonal(f.(D.diag))
broadcastall(f, D) = f.(convert(Matrix, D))
# binary magicbroadcast
magicbroadcast(f, A, B) = _magicbroadcast(f, iszeropreserving(f, (eltype(A), eltype(B))), A, B)
_magicbroadcast(f, ::ZP, A, B) = broadcastall(f, A, B)
_magicbroadcast(f, ::ZP{AT}, A, B) = broadcastnz(f, A, B)
_magicbroadcast(f, ::ZP{TF}, A::Diagonal, B) = broadcastnz(f, A, B)
_magicbroadcast(f, ::ZP{FT}, A, B::Diagonal) = broadcastnz(f, A, B)
_magicbroadcast(f, ::ZP{TT}, A::Diagonal, B::Diagonal) = broadcastnz(f, A, B)
broadcastnz(f, A::Diagonal, B) = Diagonal(f.(diag(A), diag(B)))
broadcastnz(f, A, B::Diagonal) = Diagonal(f.(diag(A), diag(B)))
broadcastnz(f, A::Diagonal, B::Diagonal) = Diagonal(f.(A.diag, B.diag)) # disambiguate
broadcastall(f, A, B) = f.(convert(Matrix, A), convert(Matrix, B))
iszeropreserving(f, ::Tuple{Type}) = f(ZP{AT}())
iszeropreserving(f, ::Tuple{Type,Type}) = f(ZP{TF}(), ZP{FT}())
# Below we define rules for computing the zero-preservation class of an operation formed via composition of operations with known zero-preservation class
# Rules for unary operations
declareunaryZPT(fns...) = for fn in fns; (::typeof(fn))(t::ZP) = t; end # zero preserving
declareunaryZPF(fns...) = for fn in fns; (::typeof(fn))(::ZP) = ZP{AF}(); end # not zero preserving
declareunaryZPT(abs, sin, tan)
declareunaryZPF(cos, log, exp)
# Rules for binary operations that are zero preserving in both arguments
function declarebinaryZPT(fns...)
for fn in fns
# ZP{T} in either argument
(::typeof(fn))(::ZP{AT}, ::ZP) = ZP{AT}()
(::typeof(fn))(::ZP, ::ZP{AT}) = ZP{AT}()
(::typeof(fn))(::ZP{AT}, ::ZP{AT}) = ZP{AT}() # disambiguate
# ZP{F} in either argument
(::typeof(fn))(::ZP{AF}, t::ZP) = t
(::typeof(fn))(t::ZP, ::ZP{AF}) = t
(::typeof(fn))(::ZP{AF}, ::ZP{AF}) = ZP{AF}() # disambiguate
(::typeof(fn))(::ZP{AF}, ::ZP{AT}) = ZP{AT}() # disambiguate
(::typeof(fn))(::ZP{AT}, ::ZP{AF}) = ZP{AT}() # disambiguate
# remaining ZP{TT} cases
(::typeof(fn))(::ZP{TT}, ::ZP{TT}) = ZP{TT}()
(::typeof(fn)){C<:Union{ZP{TF},ZP{FT}}}(::ZP{TT}, t::C) = t
(::typeof(fn)){C<:Union{ZP{TF},ZP{FT}}}(t::C, ::ZP{TT}) = t
# remaining ZP{TF} and ZP{FT} cases
(::typeof(fn))(::ZP{TF}, ::ZP{TF}) = ZP{TF}()
(::typeof(fn))(::ZP{FT}, ::ZP{FT}) = ZP{FT}()
(::typeof(fn))(::ZP{TF}, ::ZP{FT}) = ZP{AT}()
(::typeof(fn))(::ZP{FT}, ::ZP{TF}) = ZP{AT}()
# combinations involving numeric constants
(::typeof(fn))(::ZP{AT}, ::Number) = ZP{AT}()
(::typeof(fn))(::Number, ::ZP{AT}) = ZP{AT}()
end
end
declarebinaryZPT(*)
# Rules for binary operations that are zero-pair preserving
function declarebinaryZPTT(fns...)
for fn in fns
# ZP{T} in either argument
(::typeof(fn))(::ZP{AT}, t::ZP) = t
(::typeof(fn))(t::ZP, ::ZP{AT}) = t
(::typeof(fn))(::ZP{AT}, ::ZP{AT}) = ZP{AT}() # disambiguate
# ZP{F} in either argument
(::typeof(fn))(::ZP{AF}, ::ZP) = ZP{AF}()
(::typeof(fn))(::ZP, ::ZP{AF}) = ZP{AF}()
(::typeof(fn))(::ZP{AF}, ::ZP{AF}) = ZP{AF}() # disambiguate
(::typeof(fn))(::ZP{AF}, ::ZP{AT}) = ZP{AF}() # disambiguate
(::typeof(fn))(::ZP{AT}, ::ZP{AF}) = ZP{AF}() # disambiguate
# remaining ZP{TT} cases
(::typeof(fn))(::ZP{TT}, ::ZP{TT}) = ZP{TT}()
(::typeof(fn)){C<:Union{ZP{TF},ZP{FT}}}(::ZP{TT}, ::C) = ZP{TT}()
(::typeof(fn)){C<:Union{ZP{TF},ZP{FT}}}(::C, ::ZP{TT}) = ZP{TT}()
# remaining ZP{TF} and ZP{FT} cases
(::typeof(fn))(::ZP{TF}, ::ZP{TF}) = ZP{TF}()
(::typeof(fn))(::ZP{FT}, ::ZP{FT}) = ZP{FT}()
(::typeof(fn))(::ZP{TF}, ::ZP{FT}) = ZP{TT}()
(::typeof(fn))(::ZP{FT}, ::ZP{TF}) = ZP{TT}()
end
end
declarebinaryZPTT(+, -) Some extensions and challenges Handling numeric constants in additional cases should be possible, for example for binary operations that preserve zeros in both arguments consider another two rules (::typeof(fn))(::ZP{AF}, x::Number) = ifelse(x == 0, ZP{AT}(), ZP{AF}())
(::typeof(fn))(x::Number, ::ZP{AF}) = ifelse(x == 0, ZP{AT}(), ZP{AF}()) Though in principle the branches in these rules could be eliminated at compile time ( Given the zero-preservation class of a binary operation, intelligently combining different structured and/or sparse matrix types should be possible. For example, broadcasting a binary operation that is zero preserving in the first argument (but not in the second) over a Some operations pose challenges (perhaps common enough operations to sink this ship?). For example, not knowing that the domain of Complement or alternative The approach discussed in JuliaSparse/SparseArrays.jl#60 seems either a simpler alternative to the above approach or a nice complement for when the above approach fails. (#7010 discusses providing versions of the common higher-order functions that, when operating over argument combinations involving sparse / structured matrices, allow the user to conveniently specify what subsection of the matrices they want to operate over / what type to return). Thoughts? Thanks and best! |
@tkelman, I'm not sure what you exactly mean by the "adjacency structure input", but changing an adjacency graph or weights does still not strike me as something you'd (sensibly) do by Look, if you just want single-argument functions to work, we could do: function Base.broadcast{T<:Number}(f::Function, D::Diagonal{T})
d = f.(diag(D))
z = f(zero(T)) # for sparse matrices, we assume f(0) is pure and deterministic
if z == 0
return spdiagm((d,), (0,))
else
return ... dense SparseMatrixCSC with d on diagonal and z offdiagonal
end
end and similarly for When you start to get into multi-argument broadcasts (which includes the |
Just a suggestion... if sparse structures stored their "default" element, then this could be calculated once for the output. E.g. you could make a (Some of the difficulty shifts to matrix multiplication but type-wise that is much simpler to deal with). |
@andyferris That's been discussed before at #14963, and it doesn't look like people support this solution. |
It would be reasonable to explore a sparse type with defaults in a package, if anyone were up to it. (I don't have the bandwidth.) |
Thanks @nalimilan. I read that post, and it seemed like an aweful lot of worrying about odd corner cases, when we could make sparse structures much more powerful (e.g. @kmsquire good point |
It wouldn't be free though, every single method that handles sparse input would need to be checked and most of them modified or rewritten to correctly handle a non-zero default value. If there were any precedent for making that design choice in any other sparse linear algebra library then I think there would be more enthusiasm for it. |
I wasn't suggesting it would be free for the developers :) The run-time checking of zero default values could easily be eliminated by dispatching on As for precedent, Julia is clearly the only language I've seen that is generic enough to make this worthwhile for the end user, fast enough to bother with such optimizations, and flexible enough (with e.g. the trick I mention above) that it is possible without run-time compromise. PS - the gain is also outside of linear algebra, e.g. you get a kickass |
@andyferris, the runtime overhead is not the concern; sparse arrays are only performant if the arrays are large, in which case the overhead of checking a single element is negligible. The concern is the developer overhead of having to modify/rewrite a ton of working code, and of the subtle trap that "sparse" wouldn't mean what it normally means. Worse, it's not clear whether all this work would actually accomplish anything useful; who needs The nullable use case seems so different that I don't think you would be able to usefully re-use the "sparse" code anyway. |
@stevengj you make two excellent points, and of course generalizations could be dealt with in a package.
You might be right, I'm not sure. My interest is partially because I have been spending time on this and have been thinking of possible related array storage containers. |
Rebased. Best! |
5a98ec5
to
33e6f99
Compare
Subsumed by #19791. |
Is there an issue yet for making broadcast fall back to returning sparse instead of dense on the structured array types where that would make sense? |
Not yet. I have the code prototyped though. Hoping to polish that off tomorrow, time allowing. Best! |
This PR deprecates all remaining vectorized
round
methods (less those for SparseVectors, separate PR) in favor of compact broadcast syntax. Ref. #16285, #17302, #18495, #18512, #18513, #18558, #18564, #18566, #18571 #18575, #18576, and #18586.This PR fails one test, specifically
julia/test/sparsedir/sparse.jl
Line 1570 in a6a8946
round(Int, rand(x)*100)
withround.(Int, rand(x)*100)
changes the inferred result type fromSparseMatrixCSC{Int64,Int64}
toSparseMatrixCSC{Tv,Int64}
. Ref #16074 and cc @KristofferC.Best!
(Unlike with
float
,real
, etc., the remaining vectorizedround
methods never alias their input. This PR should be less controversial than #18495, #18512, and #18513 as a result.)