Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

log(matrix) issue #32313

Closed
ztultrebor opened this issue Jun 13, 2019 · 19 comments · Fixed by #32327
Closed

log(matrix) issue #32313

ztultrebor opened this issue Jun 13, 2019 · 19 comments · Fixed by #32327
Labels
bug Indicates an unexpected problem or unintended behavior linear algebra Linear algebra

Comments

@ztultrebor
Copy link

ztultrebor commented Jun 13, 2019

A similar issue was discussed in #21179. That issue has been fixed, as the example given calculates correctly

julia> B = [3 2; -5 -3]
2×2 Array{Int64,2}:
  3   2
 -5  -3

julia> exp(log(B))
2×2 Array{Complex{Float64},2}:
  3.0+3.9276e-16im    2.0+1.30473e-16im
 -5.0-1.26787e-15im  -3.0-4.20143e-16im

But if we let C=10 B then we encounter a problem

julia> C = [30 20; -50 -30]
2×2 Array{Int64,2}:
  30   20
 -50  -30

julia> exp(log(C))
2×2 Array{Complex{Float64},2}:
   -30.0-8.57143im  -14.2857-17.1429im
 44.2857-17.1429im      30.0+8.57143im

This approach gives the correct answer

julia> using LinearAlgebra

julia> function logmat(A)
       Λ, S = eigen(A)
       return S*(log.(Complex.(Λ)).*inv(S))
   end
logmat (generic function with 1 method)

julia> exp(logmat(C))
2×2 Array{Complex{Float64},2}:
  30.0+0.0im   20.0+0.0im
 -50.0+0.0im  -30.0+0.0im
@JeffBezanson JeffBezanson added the linear algebra Linear algebra label Jun 13, 2019
@stevengj
Copy link
Member

This works:

mylog(X) = let S = schur(complex(X)); S.Z * log(UpperTriangular(S.T)) * S.Z'; end

so the problem seems to come from the combination of two Schur factorizations here?

@stevengj
Copy link
Member

It looks like the schur(complex(schur(C).T)).T that is computed here has a certain "unlucky" structure that is handled very inaccurately:

julia> U = UpperTriangular(schur(complex(schur(C).T)).T)
2×2 UpperTriangular{Complex{Float64},Array{Complex{Float64},2}}:
 0.0+10.0im      -64.3428-18.9738im
            -2.44249e-15-10.0im   

julia> exp(Matrix(log(U))) - U
2×2 Array{Complex{Float64},2}:
 1.77636e-15-3.55271e-15im      128.686+37.9475im    
         0.0+0.0im          -1.9984e-15-1.77636e-15im

Here is a simpler example:

julia> log(UpperTriangular([10.0im 0; 0 -10.0im]))
2×2 UpperTriangular{Complex{Float64},Array{Complex{Float64},2}}:
 2.30259+1.5708im      NaN+NaN*im  
                  2.30259-1.5708im

Not sure where these NaNs are coming from, but I'm guessing that's related…

@stevengj
Copy link
Member

stevengj commented Jun 14, 2019

I've verified that the original code at http://eprints.ma.man.ac.uk/1851/02/logm.zip does not seem to have this problem (the result for logm_new(C) seems accurate).

However, it does have some division-by-zero warnings:

octave:10> U = [10.0i 0; 0 -10.0i]
U =
    0 + 10i    0 +  0i
    0 +  0i   -0 - 10i

octave:11> logm_new(U)
warning: division by zero
warning: called from
    logm_new>powerm2by2 at line 179 column 6
    logm_new at line 87 column 20
warning: division by zero
warning: called from
    logm_new>logm2by2 at line 228 column 7
    logm_new at line 100 column 20
ans =
   2.30259 + 1.57080i   0.00000 + 0.00000i
   0.00000 + 0.00000i   2.30259 - 1.57080i

@stevengj stevengj added the bug Indicates an unexpected problem or unintended behavior label Jun 14, 2019
@stevengj
Copy link
Member

stevengj commented Jun 14, 2019

cc @iamnapo, since this code is from #21571

@stevengj
Copy link
Member

The NaN in my log(UpperTriangular([10.0im 0; 0 -10.0im])) example above seems to be coming from this line.

In particular, Akp1 == -10im and Ak == 10im, so (Akp1 - Ak) / (Akp1 + Ak) gives NaN + NaN*im

@ztultrebor
Copy link
Author

Is there a reason to do the first Schur factorization as real?

@stevengj
Copy link
Member

stevengj commented Jun 14, 2019

@ztultrebor, it's just a performance optimization, I think, since a real Schur factorization is faster than the complex one (but may only be quasi-upper-triangular, leading to the second Schur factorization for cases with complex eigenvalues). In any case, it looks like the underlying problem comes elsewhere, from the fact that log(::UpperTriangular) is inaccurate in some cases.

@ztultrebor
Copy link
Author

@stevengj There's definitely a problem (or two) with the log(::UpperTriangular) method.

While the improved mylog function works for a small random matrix:

julia> mylog(X) = let S = schur(complex(X)); S.Z * log(UpperTriangular(S.T)) * S.Z'; end
mylog (generic function with 1 method)

julia> D2 = randn(2,2)
2×2 Array{Float64,2}:
 -0.0532145  1.41135
 -0.03215    1.4111

julia> exp(mylog(D2)) - D2
2×2 Array{Complex{Float64},2}:
 -2.77556e-17+1.56125e-17im  -6.66134e-16+1.38778e-16im
 -6.93889e-18+6.07153e-18im  -2.22045e-16-1.04083e-17im

it still fails for a slightly larger random matrix:

julia> D6 = randn(6,6)
6×6 Array{Float64,2}:
  0.90881    0.336947    0.843843   0.918692  -1.05808   -0.179687
  1.49618    0.375474   -0.437477   1.98801   -1.0208    -0.381434
  1.09977   -0.595175   -0.718849   0.597049  -1.90258    0.0568476
 -1.6915    -0.601828    0.52197   -0.692272  -0.473441   1.38306
 -1.27279    1.69424     0.24387    0.498871  -2.3394     0.286934
  0.457298  -0.0873094  -0.199842   0.851316  -0.371774  -0.050287

julia> exp(mylog(D6)) - D6
6×6 Array{Complex{Float64},2}:
 -0.0373332-0.0413267im  0.00640153+0.00270801im    0.0674095-0.0659298im   0.0314105+0.0050294im
  0.0733027-0.171132im    -0.110669-0.229787im       -0.015317+0.265689im   -0.0314504+0.158948im
  0.0851588-0.915255im    -0.407165-0.90443im          0.31459+0.695568im    0.0492944+0.664101im
  -0.070744-0.283477im     -0.06765-0.17742im          0.22292-0.0141363im   0.0841001+0.146827im
 -0.0921727-0.97041im     -0.321872-0.765978im          0.5693+0.306179im     0.181584+0.593536im
  0.0366271-0.057668im   -0.0444716-0.0900447im      -0.02057+0.117721im   -0.0190502+0.0607899im

The Schur factorization is fine:

julia> D6schur = schur(D6);

julia> D6schur.Z*D6schur.T*D6schur.Z' - D6
6×6 Array{Float64,2}:
  2.77556e-15  5.55112e-17   0.0           1.22125e-15  -1.33227e-15   6.93889e-16
  1.11022e-15  9.99201e-16  -6.10623e-16  -4.44089e-16   0.0           0.0
 -2.22045e-16  5.55112e-16  -1.44329e-15  -2.88658e-15  -3.55271e-15   4.02456e-16
 -1.77636e-15  3.33067e-16  -6.66134e-16   5.55112e-16  -3.4972e-15    6.66134e-16
 -2.44249e-15  6.66134e-16  -2.41474e-15  -7.21645e-16  -7.54952e-15  -1.11022e-16
 -9.4369e-16   4.57967e-16  -2.77556e-17  -1.11022e-15   1.66533e-16   4.51028e-16

@ztultrebor
Copy link
Author

Also, this is broken

julia> G = [1 1; 0 1]
2×2 Array{Int64,2}:
 1  1
 0  1

julia> log(G)
ERROR: MethodError: no method matching log(::UpperTriangular{Complex{Int64},Array{Complex{Int64},2}})
Closest candidates are:
  log(::Float16) at math.jl:1018
  log(::Complex{Float16}) at math.jl:1019
  log(::Float64) at special/log.jl:254
  ...

@stevengj
Copy link
Member

stevengj commented Jun 14, 2019

The Compute accurate superdiagonal of T code that is giving NaN in my example is from Higham's Functions of Matrices, chapter 11:
image
image

In particular, note the comment about it depending on the definition of the atanh function — since our code was ported from from Matlab code, we have to be careful that the atanh definition is the same.

@stevengj
Copy link
Member

stevengj commented Jun 14, 2019

I found a typo in the code that seems to fix the problem. I also think we should update to use (11.27) from Higham's book since our atanh follows the IEEE 754 definition. Will have a PR soon, I hope.

I also get

julia> norm(exp(log(D6)) - D6) / norm(D6)
1.9415387506094363e-15

for your bigger example above.

(There has got to be a better way to compute log(x/y) / (x-y) than this crazy complicated formula in Higham's book.)

@stevengj
Copy link
Member

@ztultrebor, the issue will be closed when the bugfix PR is merged.

@stevengj stevengj reopened this Jun 15, 2019
@RalphAS
Copy link

RalphAS commented Jun 24, 2019

Higham's formulae seem to rely on arcane features of Matlab floating point operations, which differ from the ISO C99ff standards followed (faithfully, AFAICT) by Julia. I propose a different sign based on analysis of the ISO-style branch cut:

# this is the fragile thing we are trying to represent
function f_baseline(a,b)
    log(b) - log(a)
end
# this is Higham's version (bottom of p.279):
function f_higham1(a,b)
    z = (b-a)/(b+a)
    log((1.0 + z)/(1.0 - z)) + 2.0*im*pi*(unw(log(b)-log(a)))
end
# this is the form in 11.27
function f_higham2(a,b)
    z = (b-a)/(b+a)
    u = unw(log1p(z) - log1p(-z))
    2.0 * (atanh(z) + im*pi*(unw(log(b)-log(a)) + u))
end

# this is the form in 11.27, with changed sign
function f_proposed(a,b)
    z = (b-a)/(b+a)
    u = -unw(log1p(z) - log1p(-z))
    2.0 * (atanh(z) + im*pi*(unw(log(b)-log(a)) + u))
end
function atest(n=1024)
    a1 = 10.0 * (rand(n) .- 0.5) .+ 0im
    a2 = 10.0 * (rand(n) .- 0.5) .+ 0im
    y0 = f_baseline.(a1,a2)
    y1 = f_higham1.(a1,a2)
    y2 = f_higham2.(a1,a2)
    y3 = f_proposed.(a1,a2)
    println("Higham 1: ",norm(y1-y0))
    println("Higham 2: ",norm(y2-y0))
    println("Proposed: ",norm(y3-y0))
end

Typical results:

julia-1.1> atest()
Higham 1: 99.74247458479297
Higham 2: 201.06192982974676
Proposed: 7.368633527567163e-12

This is because Higham's formulae (in Julia) sometimes pick strange sheets of the Riemann surface:

┌─────────────┬───────────────┬────────────────┬────────────────┬───────────────┐
│           z │      baseline │       higham1  │       higham2  │     proposed  │
├─────────────┼───────────────┼────────────────┼────────────────┼───────────────┤
│  -4.19+0im-0.486+3.14im-0.486+3.14im-0.486+3.14im-0.486+3.14im │
│   -122-0im-0.0163-3.14im-0.0163-3.14im-0.0163-15.7im-0.0163-3.14im │
│  0.236-0im0.48+0im0.48-0im0.48-0im0.48+0im │
│  0.594-0im1.37+0im1.37-0im1.37-0im1.37+0im │
│   1.35-0im1.9+3.14im1.9-3.14im1.9-9.42im1.9+3.14im │
│  -3.28+0im-0.629+3.14im-0.629+3.14im-0.629+3.14im-0.629+3.14im │
│   2.25+0im0.955-3.14im0.955-9.42im0.955-3.14im0.955-3.14im │
│  -40.9-0im-0.0489-3.14im-0.0489-3.14im-0.0489-15.7im-0.0489-3.14im │
│   4.59+0im0.443-3.14im0.443-9.42im0.443-3.14im0.443-3.14im │
│  0.334+0im0.694+0im0.694+0im0.694+0im0.694+0im │
│ -0.885-0im-2.8+0im-2.8-0im-2.8-0im-2.8+0im │
│ -0.522-0im-1.16+0im-1.16-0im-1.16-0im-1.16+0im │
│   49.8+0im0.0401-3.14im0.0401-9.42im0.0401-3.14im0.0401-3.14im │
│   1.94-0im1.14+3.14im1.14-3.14im1.14-9.42im1.14+3.14im │
│   1.34+0im1.94-3.14im1.94-9.42im1.94-3.14im1.94-3.14im │
│     13+0im0.154-3.14im0.154-9.42im0.154-3.14im0.154-3.14im │
└────────────┴────────────────┴────────────────┴────────────────┴────────────────┘

Arguments like the above with signed zeros are important for the matrix functions. Those extra multiples of 2 * pi * im put the terms out of the range of validity of the Pade forms, at least for the logarithm.

@stevengj I tried the formulation in your PR with the above sign change and see no trouble (for matrices which should be well-conditioned for log).

@StefanKarpinski
Copy link
Member

This took me a second, but it's key to reading the table above that the "baseline" and "proposed" columns always match.

@ztultrebor
Copy link
Author

@RalphAS are there values of z for which f_baseline gives the wrong answer?

@ztultrebor
Copy link
Author

What about this?

function matrix_log_taylor(A, i=256)
    B0 = A - I
    R = copy(B0)
    B = copy(B0)
    for n=2:i
        B *= B0
        R += (-1)^(n+1)*B/n
    end
    return R
end

function logmat(A)
    Λ, S = eigen(A)
    if length(Set(Λ))==length(Λ) 
        # no repeated eigenvalues
        return S*(log.(complex(Λ)).*inv(S))
    else   
        # repeated eigenvalues
        n = 0
        while norm(A-I)>1
            A = sqrt(A)
            n += 1
        end
        return 2^n*matrix_log_taylor(A)
    end
end

See here p.33.
This approach seems to be more correct and faster than the default method.

@dorn-gerhard
Copy link

Refering to the example of the beginning of this issue:
C = [30 20; -50 -30]

log(exp(C)) seems to work now fine, but
exp(log(C)) shows a rather strange behavior:
2×2 Matrix{Float64}: -7.69911 -5.13274 12.8319 7.69911

@stevengj
Copy link
Member

stevengj commented Feb 13, 2022

@dorn-gerhard, I think you mean that exp(log(C)) works as expected log(exp(C)) gives the surprising result you quoted?

julia> C = [30 20; -50 -30];

julia> log(exp(C))
2×2 Matrix{Float64}:
 -7.69911  -5.13274
 12.8319    7.69911

This is just because log is multi-valued — the result above is correct:

julia> exp(log(exp(C)))  exp(C)
true

In particular, if you look at A = C - log(exp(C)), it is a matrix whose exponential is the identity:

julia> A = C - log(exp(C))
2×2 Matrix{Float64}:
  37.6991   25.1327
 -62.8319  -37.6991

julia> exp(A)
2×2 Matrix{Float64}:
  1.0          3.55271e-14
 -8.52651e-14  1.0

because the eigenvalues of A are ±2πi:

julia> eigvals(A) / 2pi
2-element Vector{ComplexF64}:
 -2.297067269945873e-16 - 1.9999999999999998im
 -2.297067269945873e-16 + 1.9999999999999998im

@stevengj
Copy link
Member

stevengj commented Feb 13, 2022

Did we ever incorporate @RalphAS's improvement above? Looks like it:

w = atanh(z) + im * pi * (unw(logAkp1-logAk) - unw(log1p(z)-log1p(-z)))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior linear algebra Linear algebra
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants