WIP: Linear Algebra bikeshedding #2212

ViralBShah · 2013-02-07T01:20:28Z

Basically includes:

Deprecate lud, chold, cholpd, qrd, qrpd
lu, chol, cholpivot, qr, qrpivot now return Factorization objects
documentation

[edit: The suitesparse stuff is now cherry-picked on the db/suitesparse branch/]

Relevant comments are in #2062.

- use lud and lud! methods for the decomposition - allocate umf_conf and umf_info and use them throughout - preserve the pointers to the symbolic and numeric factorizations in the UmfpackLU object. Also keep around the decremented index/pointer vectors.

…and qrpd. Update docs for qr.

Conflicts: base/deprecated.jl test/Makefile

StefanKarpinski · 2013-02-07T01:28:17Z

Oh, never mind. It's in a pull request. Carry on.

nolta · 2013-02-07T01:56:07Z

Here's a link to the previous chol vs chold argument: 826c42f#commitcomment-1816094

carlobaldassi · 2013-02-07T02:17:43Z

What is the rationale for removing the matlab-compatible versions? (Alternatively: was this discussed further?) I'm quite happy with the solution found at the end of the thread linked by @nolta, i.e. the current situation.

ViralBShah · 2013-02-07T03:56:41Z

What I do not like about the current situation is that we have all these great names that matlab uses (lu, qr, etc.) but they actually force you into computing in inefficient ways. I am proposing that we have defaults that are good and do the right thing in julia - i.e. all the linear algebra routines return Factorization objects, which allow for efficient solving of systems and such things, and use factors() to get the underlying factors in a way that is matlab compatible.

ViralBShah · 2013-02-07T03:59:40Z

With this pull request, what you get is this, which is often all you really want to do with these factorizations:

lufact = lu(A)
x = lufact \ b

But one can always get the factors and do other things:

l,u,p = factors(lufact) # Matlab-style - just the extra `factors`
x = u \ l \ b[p]  # Matlab-style solve that is not efficient, but works

nolta · 2013-02-07T04:43:15Z

So we're just going to have the same argument again?

ViralBShah · 2013-02-07T04:53:30Z

In general, we have recently, as a community, been taking decisions that favour sane behaviour over matlab compatibility. I think that this is a case where being matlab compatible actually hurts, especially since we do not have varargout that matlab does to make its APIs nicer.

StefanKarpinski · 2013-02-07T05:11:03Z

Having the Matlab-compatible versions and the *d and *d! versions is an awful lot of multiplicity here. Inflating the number of functions in the language by some non-negligible fraction is a lot of bending over backwards to maintain the Matlab-compatible and inferior interface.

nolta · 2013-02-07T05:42:43Z

I guess inferior's in the eye of the beholder. I use chol all the time for monte carlo simulations, and factors(chol(A)) feels incredibly inferior.

ViralBShah · 2013-02-07T05:57:25Z

What do you do with the result of chol in your MC simulations?

toivoh · 2013-02-07T06:17:21Z

Since chol is a bit special (only one factor), perhaps it could deserve a shortcut like cholfactor to extract the factor?

nolta · 2013-02-07T06:28:51Z

Sample from a multivariate Gaussian distribution.

ViralBShah · 2013-02-07T07:30:11Z

I like the idea of having cholfactor as a shortcut in the case of chol. @nolta would that work?

ViralBShah · 2013-02-07T07:32:29Z

We can also overload a few more methods for CholeskyDense objects to allow for the kind of things one might want to do.

StefanKarpinski · 2013-02-07T07:35:35Z

Needing cholfactor might be a sign of this change not being ideal. Perhaps we should give this particular change some more time? Can Gaussian sampling be done easily with and/or more efficiently with the factorization object?

carlobaldassi · 2013-02-07T09:03:50Z

For the record, what I use most often is sum(log(diag(chol(x))), i.e. the log of the square root of the determinant. Passing through det of the factor is not ok, since a lot of precision is lost that way (plus it's less efficient). This is already a long chain, putting factors in there is not the end of the world, but not particularly nice either. The only reasonable addition I can think of to the methods for CholeskyDense in this context would be having a logdet function. But I guess I'll end up writing my own shortcut anyway.

andreasnoack · 2013-02-07T09:31:08Z

There is already rand(MultivariateNormal(mean, var)) in Distributions which could be used for multivariate normal sampling but that is of course a long name. Two possibilities could be a randn(cov, dim) method and some *(Matrix, CholeskyDense) methods.

@carlobaldassi I think it would be a good idea to have a logdet.

mlubin · 2013-02-07T13:46:55Z

I support getting rid of the matlab compatibility. Factorization objects are a much more sane default.

dmbates · 2013-02-07T14:09:29Z

@nolta As @andreasnoackjensen pointed out, sampling from a multivariate normal can be done with rand(MultivariateNormal(...))from the Distributions package.

julia> using Distributions

julia> mvn = MultivariateNormal([1.,2,3], diagm([4.,5,6]))
MultivariateNormal distribution
Mean:
[1.0, 2.0, 3.0]
Covchol: CholeskyDense{Float64}(3x3 Float64 Array:
 2.0  0.0      0.0    
 0.0  2.23607  0.0    
 0.0  0.0      2.44949,'U')
Mean:
[1.0, 2.0, 3.0]
Variance:
3x3 Float64 Array:
 4.0  0.0  0.0
 0.0  5.0  0.0
 0.0  0.0  6.0

julia> srand(12321)

julia> rand(mvn)
3-element Float64 Array:
 2.34813
 4.22277
 1.4347 

julia> srand(12321)

julia> rand(mvn, 3, 20)
3x20 Float64 Array:
 2.34813   0.491271    3.73474  -1.58803  2.15756   6.94249   2.16482  -3.25563   …  1.34691  -1.97108  0.192781  2.69105  1.23021  3.86129  -1.64641  1.4843   -0.778525
 4.22277   1.74714     4.80336   4.99726  2.07695  -0.790902  1.7337   -1.32994      6.00423   2.7134   4.6624    2.69176  1.08644  4.08717   3.14195  3.62224  -3.74896 
 1.4347   -0.0141233  -1.76968   3.15905  1.85379   2.03352   1.69304   0.195228     3.5728    2.00291  4.46716   4.52372  2.83042  3.09832   7.28602  3.83717   6.57325

dmbates · 2013-02-07T14:20:35Z

@carlobaldassi I think that adding a logdet method for CholeskyDense (and the soon-to-be CholeskySparse) type is the way to go. It would also be worthwhile adding diag methods for those types. The same could be done for the log-determinant and diagonal of LU factorizations and even the QR, operating on R.

The approach in R is to use the signature determinant(mat::Matrix{Number}, log::Bool) where log defaults to false. When log == true the value is the log of the determinant and + or - 1 for the sign. In the short term the important accessor is diag of the factorization.

StefanKarpinski · 2013-02-09T15:52:39Z

I'm not sure I understand. Are you objecting to "shipping" the current state of master or something else?

ViralBShah · 2013-02-09T16:19:40Z

What I am saying is that the current state of Factorization in master is not satisfactory in my opinion to ship. For something as basic as this, it would be difficult to change it at a later date - and I prefer moving all Factorization stuff into a package where it can be baked till it is good enough for inclusion in Base.

I would want to hear what others have to say.

ViralBShah · 2013-02-09T16:24:48Z

Just a note - I have cherry-picked all the commits by @dmbates. The blas stuff is in master, whereas the suitesparse stuff is now on the db/suitesparse branch.

mlubin · 2013-02-09T16:28:18Z

I suppose it's ok for Factorization to be moved into a package for now. Does this include the matlab-style routines? The Julia documentation should clearly point to it though.

ViralBShah · 2013-02-09T16:34:57Z

What I am proposing is to leave the matlab style routines as they are for now as part of Base - so chol, cholp, lu, qr, qrp, etc. can stay as they are. The *d versions: chold, cholpd, lud, qrd, qrpd would move to say, a LinalgFactor package, where that interface can be experimented with. The package versioning gives a good method for backwards compatibility too.

mlubin · 2013-02-09T16:41:10Z

Doesn't that make it difficult to change lu, chol, etc. to return factorization objects in the future?

StefanKarpinski · 2013-02-09T16:46:21Z

Honestly, I think this is a terrible idea. If a change this drastic was going to happen, it should have been done days ago. We now have an API stability deadline in one day and there is no way a change this disruptive is going to happen without breaking things. If the factorization stuff wasn't ready, it shouldn't have been in master – we are already shipping it. Trying to pull this out of base now is making a massive, disruptive change with no deprecation path with a single day to get it right. Not a good idea.

mlubin · 2013-02-09T17:14:33Z

Is there really that much to fix in order to get Factorization to a reasonable state? Could we agree on a list of things to fix and leave non-compatibility-breaking enhancements for later?

I propose:

We codify @nolta's approach that in all cases where operators are overloaded for Factorization objects or Factorization objects are passed to Base functions, the behavior should be the same as if the original matrix was passed.
Access individual factors using ref instead of factors. For now these can return full matrices, but in the future they could return specialized types, e.g. for QR as I mentioned the Q could be a special type which implements multiplication and backslash. This wouldn't break syntax compatibility if all someone does is use the Q factor as a normal matrix, although it would break things if they try to access the elements for some reason.
Put off for 0.2 implementing the nice tuple/iterator syntax for picking out all the factors at once.

StefanKarpinski · 2013-02-09T17:19:12Z

It seems to me that "getting this right" is not super crucial now that we've arrived at the realization that we're not going to get this right before 0.1; since we know we're going to break people's code with 0.2 when we change all of this the main goal is not to break their code with 0.1 also with no benefit. So for tomorrow, I would suggest we tread lightly with only minimally disruptive changes that are unambiguously a good thing.

andreasnoack · 2013-02-09T17:33:00Z

I also think it would too drastic to remove the factorizations. Just for the perspective in this, the discussion started from the annoyance of writing factors(chol(A)) instead of chol(A).

I think @mlubin's suggestions are reasonable. In particular @dmbates's idea of using ref instead of factors.

However, I think we should reconsider if it is useful to restrict factorizations to be the same thing as the original matrix: the LinearOperator view. What is the gain? If this association is dropped you wouldn't be surprised by the behaviour of * in the QR factorization. Why would you ever use * on a factorization object it represented the original matrix?

mlubin · 2013-02-09T17:41:06Z

(qr(A)[:Q])*x is much more understandable than (qr(A))*x to anyone with a background in linear algebra who has never seen Julia before.

andreasnoack · 2013-02-09T17:53:32Z

I agree that qr(A)[:Q]*x is more understandable, but that is different from qr(A)*x being dangerous. My point was partly that * is not useful if a factorization is understood as the same thing as the original matrix and partly that mathematical abstractions are not always useful in programming.

timholy · 2013-02-09T18:21:45Z

But then * should not be defined at all. \ clearly is useful, because the factorization is the best way to solve a linear equation. And having * not be the inverse of \ is simply too horrible to contemplate.

toivoh · 2013-02-09T19:10:56Z

About _, not computing unneeded factors, and extracting several at once: how about

Q, R = qr(A)[(:Q, :R)]

Then the entire set of requested factors would be known immediately upon extraction, and without extending the iterator protocol. Also, it is quite clear what is happening even if the type of factorization object being operated upon cannot be seen from the code, unlike factors, which can happily give the wrong answer if the type of factorization is different than expected.

Still, I'm unclear on the details of factorizations where you only want to compute some of the factors. This seems to presuppose some kind of lazy evaluation, which would require the factorization object to keep a copy of the source matrix inside, unless the set of needed factors is given in advance. The most efficient algorithm to compute a given subset of factors would probably also need to know this set in advance, not incrementally. On the other hand, a factorization object with some factors unavailable will be crippled, e.g. cannot implement \, so is probably not really a factorization object at all.

To compute a subset of factors, maybe it would be better not to (visibly) construct a factorization object, but to use something like

U, S = svd((:U, :S), A)

(and you could perhaps specify either :V or :Vt, which could provide some nice flexibility regarding #912.)
This would also allow to compute the Cholesky factor using

L = chol (:L, A)

ViralBShah · 2013-02-10T02:34:26Z

@toivoh This is not lazy evaluation. The idea is that you use the LAPACK routines for factorization, which often compute a compact form. If you want the actual matrices that represent the factors, you may have to do some more work, when forming the actual matrices. However, if you only want to use the factorization to say, solve a system, then you may not need the actual matrices.

ViralBShah · 2013-02-10T02:39:22Z

Given the need for minimally invasive changes, I am still pushing for renaming the *d routines to *fact routines. It is absolutely impossible to figure out what the d in qrd means, given that it returns a Factorization object. It may cause some intermediate churn, but at least we will have names that are more meaningful to newcomers, and we can then take our time discussing what the ideal interface should look like.

mlubin · 2013-02-10T02:41:33Z

Sounds reasonable to me.

ViralBShah · 2013-02-10T02:42:33Z

I am also not sure whether for 0.1, we should retain the current behaviour of qrd(a)*x which only multiplies by Q.

mlubin · 2013-02-10T02:58:06Z

If it's too much intermediate churn to remove that behavior given that we haven't decided how to expose the functionality, I would at least not present it as a good thing to do in the documentation.

Minimal invasive changes to the Factorization objects for 0.1. All existing functionality is retained, with only a few renamings for clarity. Replaces the *d routines with *fact routines: lud ->lufact chold -> cholfact cholpd-> cholpfact qrd ->qrfact qrpd -> qrpfact Replaces * and Ac_mul_B on QRDense objects: qmulQR performs Q*A qTmulQR performs Q'*A Updates to exports, documentation, tests, and deprecations. This is a self-contained commit that can be reverted if necessary close #2062

ViralBShah · 2013-02-10T04:35:59Z

I am closing this pull request, and we can resume this discussion after 0.1. I am actually fine with the current state, if we can retain the commit above. If everyone hates it, it can be reverted - but it does not change any functionality and only renames a few things for clarity and consistency.

I am pushing it in since 0.1 will most likely be tagged when I am not awake.

andreasnoack · 2013-02-10T05:52:59Z

@timholy The horrible behaviour is already there for matrices. Inherited from Matlab:

julia> A=randn(5,2);b=randn(5);

julia> b-A*(A\b)
5-element Float64 Array:
 -0.418845
  1.15814 
 -0.986439
  0.719669
  0.531056

Horrible, but convenient.

timholy · 2013-02-10T10:53:16Z

Fair enough, but that's only because inv(A) doesn't exist. When it does, it gives you the expected result.

alanedelman · 2013-02-10T14:28:31Z

May I propose we disconnect two issues

the possibility of a compressed form for Q and perhaps L and U
the entanglement of Q&R, L&U etc.

I believe that the issue is more about 2 and yet the discussion above is mostly about 1.
There is nothing wrong with having a compressed format for Q analogous to "sparse." or
"symtridiagonal." If we didn't entangle, it still could be just fine to go (L,U)\A for
L(U\A). With enough overloading, the user would hardly know the difference from matlab.
... so the real discussion we have to have is whether LAPACK's entangled formats
should be disentangled, or rewritten so they end up disentangled, or if both are bad ideas
for some sort of porformance reason. (Off the top of my head, I don't know the answer)

ViralBShah · 2013-02-11T16:24:00Z

Does anyone use svdt? Is it useful to have? svd is matlab-compatible and factors(svdfact(A)) now does what svdt does right now. I would prefer to deprecate svdt but wanted to check here first.

andreasnoack · 2013-02-11T16:28:06Z

Now that we have an svd factorization type I think svdt should disappear.

ViralBShah and others added 10 commits February 3, 2013 20:23

Reinstate all the commits from #2069.

df8f86e

Relevant comments are in #2062.

Get rid of chold. Add documentation for chol.

bc7dec7

Rename cholpd to cholpivot. Add docs for cholpivot.

fe13a03

Improve UMFpack interface

ccf2024

- use lud and lud! methods for the decomposition - allocate umf_conf and umf_info and use them throughout - preserve the pointers to the symbolic and numeric factorizations in the UmfpackLU object. Also keep around the decremented index/pointer vectors.

Merge remote-tracking branch 'origin/vs/linalg2' into origin-vs-linalg2

b96e9c2

Rename lud to lu. Update docs.

c5b0fe0

Rename QRPDense to QRDensePivoted. Use qr and qrpivot instead of qrd …

2fe3c04

…and qrpd. Update docs for qr.

Deprecate old dense matrix factorization names.

b259333

Replace lud with lu in suitesparse.jl

bf427cd

Merge branch 'master' into vs/linalg2

1187918

Conflicts: base/deprecated.jl test/Makefile

ViralBShah closed this Feb 10, 2013

ViralBShah deleted the vs/linalg2 branch February 11, 2013 16:25

andreasnoack mentioned this pull request Feb 15, 2013

WIP:New design of linalg (factorizations) #2308

Merged

ViralBShah mentioned this pull request Mar 16, 2013

0.2 release notes #2581

Closed

WIP: Linear Algebra bikeshedding #2212

WIP: Linear Algebra bikeshedding #2212

Conversation

ViralBShah commented Feb 7, 2013

StefanKarpinski commented Feb 7, 2013

nolta commented Feb 7, 2013

carlobaldassi commented Feb 7, 2013

ViralBShah commented Feb 7, 2013

ViralBShah commented Feb 7, 2013

nolta commented Feb 7, 2013

ViralBShah commented Feb 7, 2013

StefanKarpinski commented Feb 7, 2013

nolta commented Feb 7, 2013

ViralBShah commented Feb 7, 2013

toivoh commented Feb 7, 2013

nolta commented Feb 7, 2013

ViralBShah commented Feb 7, 2013

ViralBShah commented Feb 7, 2013

StefanKarpinski commented Feb 7, 2013

carlobaldassi commented Feb 7, 2013

andreasnoack commented Feb 7, 2013

mlubin commented Feb 7, 2013

dmbates commented Feb 7, 2013

dmbates commented Feb 7, 2013

StefanKarpinski commented Feb 9, 2013

ViralBShah commented Feb 9, 2013

ViralBShah commented Feb 9, 2013

mlubin commented Feb 9, 2013

ViralBShah commented Feb 9, 2013

mlubin commented Feb 9, 2013

StefanKarpinski commented Feb 9, 2013

mlubin commented Feb 9, 2013

StefanKarpinski commented Feb 9, 2013

andreasnoack commented Feb 9, 2013

mlubin commented Feb 9, 2013

andreasnoack commented Feb 9, 2013

timholy commented Feb 9, 2013

toivoh commented Feb 9, 2013

ViralBShah commented Feb 10, 2013

ViralBShah commented Feb 10, 2013

mlubin commented Feb 10, 2013

ViralBShah commented Feb 10, 2013

mlubin commented Feb 10, 2013

ViralBShah commented Feb 10, 2013

andreasnoack commented Feb 10, 2013

timholy commented Feb 10, 2013

alanedelman commented Feb 10, 2013

ViralBShah commented Feb 11, 2013

andreasnoack commented Feb 11, 2013