
WIP: make transpose non-recursive #23424

Closed
wants to merge 1 commit

Conversation

andyferris
Member

@andyferris andyferris commented Aug 24, 2017

Step 2 of #20978 "Taking matrix transposes seriously". Currently WIP - I hope to at least add some more unit tests and have the design questions below answered.

This PR continues to have a recursive adjoint for doing linear algebra (on possibly nested arrays, or other user-defined structures) but removes the recursion for transpose. In effect, transpose is being "gifted" to data applications as a simple way to rearrange data in containers (like permutedims) - for example, we can now do ["abc", "def"].'. I have also corrected non-recursive adjoint behavior wherever I found it.
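In released Julia the `.'` syntax was eventually removed, but the non-recursive data flip described here survives as `permutedims`; a minimal sketch of the data-rearrangement use case:

```julia
# Non-recursive "data" transpose of a container of non-numeric elements.
# In post-1.0 Julia this is spelled `permutedims` rather than `.'`.
v = ["abc", "def"]
m = permutedims(v)   # a 1×2 Matrix{String}; elements are untouched

# No per-element transpose is attempted, so this works for any eltype.
size(m)   # (1, 2)
m[1, 2]   # "def"
```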

One consequence which seems odd at first but makes more sense on reflection is that some linear algebra applications will need to use conj(m') instead of m.'. This doesn't seem so bad, since this is actually how legitimate linear algebra uses of .' arise anyway.

The design for RowVector extends on what was already there, adding recursive AdjointArray wrappers just like the existing lazy ConjArray wrapper. The idea is to do the same for transpose(::AbstractMatrix) and adjoint(::AbstractMatrix) soon, so that transpose there is lazy and we can remove Ac_mul_B and so-on. Future work will also involve moving functions into appropriate files, but since work is ongoing I probably would prefer to do that in a separate PR.

Two design questions:

  1. Should we define a conjadjoint(x) function? It seems to come up occasionally, and would be easy to elide in appropriate cases, e.g. via conjadjoint(x::Number) = x. (Going the opposite direction, we could remove ConjAdjointArray...)
  2. Should Symmetric and issymmetric be recursive? Their names make it seem a bit like they should go with transpose and work on "data"; however, the most common usage is for linear algebra, and they are part of LinAlg. The two choices are issymmetric(m) = m == m.' or issymmetric(m) = m == conj(m').
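For concreteness, the two candidate definitions can be compared with post-`.'` spellings (`permutedims` standing in for the non-recursive `m.'`). For `Number` eltypes they agree; they diverge only for nested arrays:

```julia
using LinearAlgebra

# For Number eltypes, the two candidate definitions coincide:
A = [1+2im 3-1im; 3-1im 4+0im]   # complex symmetric, but not Hermitian
@assert (A == transpose(A)) == (A == conj(A'))
issymmetric(A)   # true
ishermitian(A)   # false

# For nested arrays they differ, since conj/adjoint recurse into elements:
B = [1 2; 3 4]
M = fill(B, 1, 1)        # 1×1 matrix whose single element is a matrix
M == permutedims(M)      # true: the non-recursive "data" reading
M == conj(M')            # false: recursion requires B == transpose(B)
```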

CC: @StefanKarpinski @stevengj @andreasnoack @Sacha0

@andyferris
Member Author

Also, this is a prerequisite of #23018 (which requires a non-recursively transposing RowVector to work properly).

Member

@fredrikekre fredrikekre left a comment


Nice! A general question: everything is @inlined, is that necessary?

@inline adjoint(a::ConjArray) = ConjAdjointArray(parent(a))

"""
AdjointArray(array)
Member


indent

0+0im 1+1im
```
"""
struct AdjointArray{T,N,A<:AbstractArray} <: AbstractArray{T,N}
Member


Perhaps struct AdjointArray{T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N} or perhaps the element type of A can be different from T?

Member Author


N stays the same but the eltype can change via the mapping through adjoint. I can do struct AdjointArray{T,N,A<:AbstractArray{<:Any,N}} <: AbstractArray{T,N}.


@inline AdjointArray(a::AbstractArray{T,N}) where {T,N} = AdjointArray{adjoint_type(T),N,typeof(a)}(a)

const AdjointVector{T,V<:AbstractVector} = AdjointArray{T,1,V}
Member


const AdjointVector{T,V<:AbstractVector{T}} unless eltype(V) sometimes is not T?

const AdjointVector{T,V<:AbstractVector} = AdjointArray{T,1,V}
@inline AdjointVector(v::AbstractVector{T}) where {T} = AdjointArray{adjoint_type(T),1,typeof(v)}(v)

const AdjointMatrix{T,M<:AbstractMatrix} = AdjointArray{T,2,M}
Member


const AdjointMatrix{T,M<:AbstractMatrix{T}}

const AdjointMatrix{T,M<:AbstractMatrix} = AdjointArray{T,2,M}
@inline AdjointMatrix(m::AbstractMatrix{T}) where {T} = AdjointArray{adjoint_type(T),2,typeof(m)}(m)

# This type can cause the element type to change under conjugation - e.g. an array of complex arrays.
Member


Guess this answers my own question above then.

Member Author


👍

@inline conj(a::AdjointArray) = ConjAdjointArray(parent(a))

"""
ConjAdjointArray(array)
Member


indent

@@ -929,7 +929,7 @@ issymmetric(x::Number) = x == x
"""
ishermitian(A) -> Bool

Test whether a matrix is Hermitian.
Test whether a matrix is Hermitian (such that `A = adjoint(A)`).
Member


==

@@ -13,44 +13,37 @@ vector can be multiplied by a matrix on its right (such that `v.' * A = (A.' * v
differs from a `1×n`-sized matrix by the facts that its transpose returns a vector and the
inner product `v1.' * v2` returns a scalar, but will otherwise behave similarly.
"""
struct RowVector{T,V<:AbstractVector} <: AbstractMatrix{T}
struct RowVector{T,V<:AbstractVector{T}} <: AbstractMatrix{T}
Member


👍

0+0im 1-1im
```
"""
struct ConjAdjointArray{T,N,A<:AbstractArray} <: AbstractArray{T,N}
Member


Can this be ConjAdjointArray{T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N}

@andreasnoack
Member

Great that you are working on this. A few comments after a first read

  • Did you trash the idea of using mapped arrays?
  • AdjointArray being elementwise and not flipping the sizes seems unfortunate to me. If this is the best way to structure the functionality it might be better with a different name.
  • Are we concerned about the number of lazy array wrappers here?
  • I think issymmetric(m) = m == conj(m') is the right thing with these changes since I think issymmetric will continue to mainly be a linear algebra operation.

@mbauman mbauman added domain:arrays and domain:linear algebra labels Aug 24, 2017
@andyferris
Member Author

andyferris commented Aug 25, 2017

@andreasnoack Yes, I think a MappedArray in Base would be better, and solves your first three points. I just wasn't sure how popular that would be. Maybe it's a separate PR?

I think you are right regarding issymmetric (and thus Symmetric).

@StefanKarpinski
Sponsor Member

If we're going to have some kinds of deferred computation views of arrays in Base, why not do the full blown thing? It seems somewhat simpler than having a bunch of special cases and then having the general version in a package.

@timholy
Sponsor Member

timholy commented Aug 25, 2017

Feel free to steal wildly from https://github.com/JuliaArrays/MappedArrays.jl. Note there are two types, differing in terms of whether you can or cannot define setindex! (for which you need the inverse transformation).
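The two-type split Tim mentions can be sketched minimally as follows (names here are hypothetical and not the actual MappedArrays.jl API):

```julia
# A minimal read-only lazy mapped array: getindex applies `f` to the
# parent element on the fly, so conj/adjoint-style wrappers fall out as
# special cases. A mutable variant would additionally carry the inverse
# map `finv` and define setindex!(m, v, I...) = (m.data[I...] = m.finv(v)).
struct ReadonlyMapped{T,N,F,A<:AbstractArray{<:Any,N}} <: AbstractArray{T,N}
    f::F
    data::A
end

# promote_op is used here only to guess the mapped eltype for the sketch.
ReadonlyMapped(f, a::AbstractArray{S,N}) where {S,N} =
    ReadonlyMapped{Base.promote_op(f, S),N,typeof(f),typeof(a)}(f, a)

Base.size(m::ReadonlyMapped) = size(m.data)
Base.getindex(m::ReadonlyMapped, I::Int...) = m.f(m.data[I...])
```

For example, `ReadonlyMapped(conj, v)` behaves like the lazy `ConjArray` in this PR.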

@timholy
Sponsor Member

timholy commented Aug 25, 2017

A general question: everything is @inlined, is that necessary?

AFAICT you almost never need manual @inline/@noinline anymore. Moreover, sometimes it solves problems (e.g., excessive compilation time in corner cases) not to use it.

@andyferris
Member Author

andyferris commented Aug 29, 2017

AFAICT you almost never need manual @inline/@noinline anymore. Moreover, sometimes it solves problems (e.g., excessive compilation time in corner cases) not to use it.

Right - this is wonderful, Tim :) I haven't had the chance to try it out with large StaticArrays yet. Does the heuristic take into account that the overhead of performing a function call (as opposed to inlining) scales with the size of the function arguments and return value? (For example: doing the equivalent of map(f, t) over a very simple f and a very long isbits tuple t might be worth inlining for arbitrarily long t. Here the cost of the function body scales as length(t), as does the overhead of the function call.)

@andyferris
Member Author

If we're going to have some kinds of deferred computation views of arrays in Base, why not do the full blown thing? It seems somewhat simpler than having a bunch of special cases and then having the general version in a package.

I tend to agree... but I'm not sure what the community feels, or the best way forward. I mean - we need deferred computation of transpose and adjoint to preserve the nice Ac_mul_B behavior. But what else gets deferred? All map(f, array)s? broadcast? *, /, anything and everything in Base?

@andyferris
Member Author

Feel free to steal wildly from https://github.com/JuliaArrays/MappedArrays.jl. Note there are two types, differing in terms of whether you can or cannot define setindex! (for which you need the inverse transformation).

@timholy MappedArrays.jl looks cool. I took the ideas and merged the two classes into a single MappedArray here: https://github.com/JuliaLang/julia/compare/ajf/mappedarray?expand=1.

If I get it working I'll open a PR - any design opinions?

@JeffBezanson
Sponsor Member

Yes might as well bring in the general MappedArray here --- looks like a small amount of code.

@Jutho
Contributor

Jutho commented Sep 12, 2017

As a side note when I see the name ConjAdjoint popping up: some literature refers to normal transpose as adjoint, and explicitly specifies Hermitian adjoint for the former ctranspose, e.g.
https://en.wikipedia.org/wiki/Transpose_of_a_linear_map.

I don't think conjugate adjoint is a defined term, and therefore, no confusion could arise about its meaning, but I thought it was worth bringing up the above, since interpreting adjoint as transpose might lead to interpreting conjugate adjoint as ctranspose.

@andyferris
Member Author

andyferris commented Nov 23, 2017

FYI when I get tests passing, I'd like to merge this (not there yet, still bugs).

Any last comments on e.g. MappedArray are welcome (we can fix afterwards also).

@Sacha0
Member

Sacha0 commented Nov 23, 2017

I much appreciate the effort to move #5332 and #20978 forward that this pull request represents, and am on board with most of the associated plan. I would much like to also be on board with the terminology choices and downstream impact that this pull request advances, and so regularly try to convince myself that those choices yield the best set of tradeoffs available.

Each time I try to convince myself of that position, I run aground on the same set of misgivings. One of those misgivings is the substantial implementation complexity this choice forces upon LinAlg: Conflating transpose and array-flip forces LinAlg to handle both, requiring LinAlg support a much larger set of type combinations in common operations.

To illustrate, consider mul(A, B) where A and B are bare, adjoint-wrapped, transpose-wrapped, or array-flip-wrapped Matrixs. Without conflating transpose and array-flip, A can be a Matrix, an adjoint-wrapped Matrix, or a transpose-wrapped Matrix (and likewise B). So mul(A, B) need support nine type combinations. But conflating transpose and array-flip, A can be a Matrix, an adjoint-wrapped Matrix, a transpose-wrapped Matrix, or an array-flip-wrapped Matrix (and likewise B). So now mul(A, B) need support sixteen type combinations.

This problem worsens exponentially with the number of arguments. For example, without conflation mul!(C, A, B) need support twenty-seven type combinations, whereas with conflation mul!(C, A, B) need support sixty-four type combinations. And of course adding Vectors and non-Matrix matrix/operator types to the mix further complicates things.
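The counts above are simply powers of the number of wrapper variants per argument:

```julia
# 3 variants per argument (bare, adjoint-wrapped, transpose-wrapped)
# versus 4 (adding the array-flip wrapper):
3^2, 4^2   # mul(A, B):      (9, 16) combinations
3^3, 4^3   # mul!(C, A, B):  (27, 64) combinations
```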

This side effect seems worth considering before moving forward with this change. Best!

@dlfivefifty
Contributor

I think that's more of a comment about the current design of LinAlg. I think a redesign using traits, like IsBlasStrided would help.

I've sketched out a way this can work in a BandedMatrices.jl branch:

https://github.com/JuliaMatrices/BandedMatrices.jl/blob/feat-bandslice/src/generic/interface.jl

It dispatches by the trait type, not the matrix type. So for example if blasstructure(A::AbstractMatrix) returns BlasStrided, then it can dispatch multiplication to gemv.
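A minimal sketch of that trait dispatch; the names are illustrative (only the `blasstructure` idea comes from the linked branch), and the BLAS branch is a stand-in for the real kernels:

```julia
using LinearAlgebra

# Trait types describing the storage, not the matrix type itself.
abstract type BlasStructure end
struct BlasStrided   <: BlasStructure end
struct UnknownLayout <: BlasStructure end

blasstructure(::AbstractMatrix) = UnknownLayout()
blasstructure(::StridedMatrix{T}) where {T<:LinearAlgebra.BlasFloat} = BlasStrided()

# One entry point; dispatch happens on the trait value.
mymul(A::AbstractMatrix, x::AbstractVector) = _mymul(blasstructure(A), A, x)
_mymul(::BlasStrided, A, x) = LinearAlgebra.BLAS.gemv('N', A, x)
_mymul(::UnknownLayout, A, x) =
    [sum(A[i, k] * x[k] for k in axes(A, 2)) for i in axes(A, 1)]
```

A user-defined matrix then opts into the fast path by defining `blasstructure` for its type.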

@Sacha0
Member

Sacha0 commented Nov 23, 2017

@dlfivefifty, can you expand? I do not follow how traits would address the problem of needing to handle an additional wrapper type with different semantics. Best!

@dlfivefifty
Contributor

If it’s BLAS, knowing the stride and pointer is sufficient. So the type just needs to overload those. The trait helps determine if the BLAS interface is implemented.

@Sacha0
Member

Sacha0 commented Nov 23, 2017

@dlfivefifty, these considerations are not confined to BLAS operations over dense arrays of BLAS-compatible scalar types, and in any case I am not certain I follow? Additionally, a complete redesign of LinAlg's structure does not seem in the cards in the next few weeks, even if somehow traits would address this concern?

@Jutho
Contributor

Jutho commented Nov 23, 2017

I agree with @dlfivefifty. Pure Julia fallback methods don't need to know whether a matrix is bare or wrapped, as long as getindex and setindex! are correctly implemented (except perhaps to implement loops in the correct order, but even that could be solved). So it is only for BLAS that one needs to know, and there, having every type correctly return the pointer, the strides, and the correct flag (C, T or N) means a cost that is just linear in the number of different wrappers, independent of the number of arguments, if properly implemented.
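A sketch of such a getindex-only fallback (illustrative, not the actual LinAlg kernel): it works unchanged on bare matrices and on any wrapper that indexes correctly.

```julia
using LinearAlgebra

# Generic fallback multiply that touches A and B only through getindex,
# so Adjoint/Transpose/Conj-style wrappers all reuse it.
function naive_mul(A::AbstractMatrix, B::AbstractMatrix)
    size(A, 2) == size(B, 1) || throw(DimensionMismatch())
    T = promote_type(eltype(A), eltype(B))
    C = zeros(T, size(A, 1), size(B, 2))
    for j in 1:size(B, 2), k in 1:size(A, 2), i in 1:size(A, 1)
        C[i, j] += A[i, k] * B[k, j]
    end
    return C
end
```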

@Sacha0
Member

Sacha0 commented Nov 23, 2017

Perhaps I see the difference in perspective 🙂:

The responses above seem to assume that operations over transpose-wrapped and array-flip-wrapped objects can always share an underlying implementation. This assumption holds in the most common and familiar cases, for example in operations that map to BLAS calls, where the semantics of transpose and array-flip coincide. Under that assumption, agreed, the additional complexity required to handle both transpose and array-flip in LinAlg would in some sense be superficial; that is, one can imagine a hypothetical system that could gracefully handle that additional complexity at some shallow level in call chains.

Even in such a system, however, handling both transpose and array-flip would still require an appreciable level of additional complexity. And lovely as such a system would be, the existing system neither is nor is likely to evolve into such a system by feature freeze.

And in general, operations over transpose-wrapped and array-flip-wrapped objects cannot always share an (efficient) underlying implementation. The difference in behavior between transpose-wrapped and array-flip-wrapped objects --- that the former is recursive, whereas the latter is not -- necessarily propagates all the way down to the elementary operations that constitute a given higher level operation, and so cannot always be efficiently handled at shallow levels in call chains.

Pure julia fallback methods don't need to know whether a matrix is bare or wrapped, if only getindex and setindex! are correctly implemented (unless maybe to implement loops in the correct order, but even that could be solved).

That statement is, to say the least, a minor trivialization of the importance and difficulty of writing efficient kernels for even simple operations that do not immediately map to e.g. BLAS calls 😄. Most operations in LinAlg that fall under the heading "pure Julia fallbacks" are not implemented in the way you describe, and, were they, they would be much slower. Best!

@Jutho
Contributor

Jutho commented Nov 23, 2017

That statement is, to say the least, a minor trivialization of the importance and difficulty of writing efficient kernels for even simple operations that do not immediately map to e.g. BLAS calls 😄. Most operations in LinAlg that fall under the heading "pure Julia fallbacks" are not implemented in the way you describe, and, were they, they would be much slower. Best!

The most important aspect to take into account is loop order in combination with proper blocking no? Which could be checked by looking at the strides rather than at what particular type of wrapper it is (if the data is not strided, then it's probably impossible to decide on an optimal strategy). I've been working on a package (https://github.com/Jutho/Strided.jl) to deal with strided data with arbitrary strides (such that permutedims, transpose and ctranspose, views with ranges and special reshapes which don't break strides are all lazy), and where an arbitrary combination of such strided arrays can be combined in an efficient map(!) construct and an optimal loop order and blocking strategy is chosen (at runtime, so some overhead for small arrays), even including multithreading.

Still different from linear algebra operations, but I would think similar constructions are possible. But indeed, not likely in the current structure of LinAlg.

@andyferris
Member Author

@Sacha0 Yes, important points to consider. My thoughts on the way forward:

  • With eltype being a BLAS type, we have bare arrays, transposed arrays, conjugated arrays and transposed conjugated arrays. I see the 16 dispatch possibilities of * are the generalization of the set of Ac_mul_B operations we have and are the closure of * operations under conj and transpose of the inputs. This PR does its best to make sure that adjoint turns into transpose(conj(...)) wherever possible (when eltype is Number).

  • If we wrote a blocking (cache-oblivious) method for * I suspect we could get closer to BLAS speeds with a single generic algorithm. It would only care about strides and so on, and the RowVector and TransposedMatrix wrappers could report these correctly, and with luck the ConjArray wrapper might not introduce overhead.

  • I imagine we could generalize the above cache-oblivious strided indexing order to other indexing patterns that understand sparseness.

Perhaps I'm being overly optimistic, but with the right traits and so on I thought we might get to a decent internal system in the future (not v1.0).
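The blocking idea in the second bullet can be sketched with a fixed tile size (a true cache-oblivious version would recurse instead; names and tile size here are illustrative):

```julia
# Tiled multiply that only needs getindex/setindex!, so wrappers like
# RowVector/TransposedMatrix/ConjArray that index correctly reuse it.
function tiled_mul!(C::AbstractMatrix, A::AbstractMatrix, B::AbstractMatrix; bs::Int = 64)
    n, m = size(A)
    p = size(B, 2)
    size(B, 1) == m && size(C) == (n, p) || throw(DimensionMismatch())
    fill!(C, zero(eltype(C)))
    for jj in 1:bs:p, kk in 1:bs:m, ii in 1:bs:n
        # Work on one (ii, jj, kk) tile at a time for cache locality.
        for j in jj:min(jj + bs - 1, p), k in kk:min(kk + bs - 1, m), i in ii:min(ii + bs - 1, n)
            C[i, j] += A[i, k] * B[k, j]
        end
    end
    return C
end
```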

@Sacha0
Member

Sacha0 commented Nov 23, 2017

I've been working on a package (https://github.com/Jutho/Strided.jl) to deal with strided data with arbitrary strides (such that permutedims, transpose and ctranspose, views with ranges and special reshapes which don't break strides are all lazy), and where an arbitrary combination of such strided arrays can be combined in an efficient map(!) construct and an optimal loop order and blocking strategy is chosen (at runtime, so some overhead for small arrays), even including multithreading.

@Jutho, this work sounds super exciting! I much look forward to checking this package out! 😃

The most important aspect to take into account is loop order in combination with proper blocking no? Which could be checked by looking at the strides rather than at what particular type of wrapper it is (if the data is not strided, then it's probably impossible to decide on an optimal strategy).

Remember that: (1) LinAlg also supports various sparse and structured storage types, various triangular and symmetric/hermitian views into those and other storage types, implicit representations of various linear operators, etc; and (2) that matrix multiplication is only one among many operations that LinAlg supports. Efficient strategies for performing various of these of operations on various combinations of these types exist and are implemented in LinAlg. Additional layers of combinatorial types introduce substantial complexity, even when underlying implementations can be efficiently shared (which unfortunately is not always the case). Best!

@dlfivefifty
Contributor

These can each be handled by a trait, IsTriangularStorage , IsBandedStorage , etc. and dispatch on the trait type instead of the matrix type.

This would allow user defined matrices to also benefit from optimal implementation.

@Sacha0
Member

Sacha0 commented Nov 25, 2017

These responses implicitly affirm that gracefully handling this approach's additional complexity would require substantial additions to and/or structural overhaul of LinAlg, underscoring rather than mitigating concerns over this additional complexity. That such things are much easier said than done is also worth bearing in mind :). Best!

@andyferris andyferris force-pushed the ajf/vectortranspose2 branch 4 times, most recently from 68acd8e to 3232750 Compare November 26, 2017 07:14
 * RowVector no longer transposes elements
 * MappedArray used to propagate conj, transpose, adjoint, etc
@andyferris
Member Author

Since we aren't doing non-recursive transpose, I'll close this PR.

Data users that have been following this, please use the new, friendlier permutedims for non-recursive transpose.

@andyferris andyferris closed this Dec 14, 2017
@DilumAluthge DilumAluthge deleted the ajf/vectortranspose2 branch March 25, 2021 22:08
10 participants