-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement to_vec for matrix factorizations #128
Conversation
LGTM. Could you please bump the patch version? |
I'm happy with this, but could someone else quickly take a look to make sure I've not missed anything obvious. @oxinabox @nickrobinson251 @wesselb ? |
I am on leave. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! Thanks for adding this. :) I've left two small comments.
end | ||
return x_vec, QRCompact_from_vec | ||
end | ||
|
||
# Non-array data structures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
x.T
consists of upper triangular blocks: JuliaLang/julia#41363 (comment)
The lower part is undefined on purpose, NaNs can hide there. This line won't work:
x_vec, back = to_vec([x.factors, x.T])
For application purposes it may be preferable to think of derivatives in Q
and R
, not in factors
and T
.
So to_vec
would unroll Q
and R
into a matrix. To instantiate a new QR object from updated values, one would need to re-calculate the QR factorization:
function FiniteDifferences.to_vec(x::S) where {S <: Union{LinearAlgebra.QRCompactWYQ, LinearAlgebra.QRCompactWY}}
x_vec, x_back = to_vec([x.Q x.R])
function QRCompact_from_vec(v)
Q_new, R_new = x_back(v)
QR = S(Q_new * R_new)
return S(QR.factors, QR.T)
end
return x_vec, QRCompact_from_vec
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that but unfortunately recomputing the QR factorization is enough to introduce numerical errors, and make the to_vec
test fail (since it test for exact equality).
As far as I understand the internal representation used by to_vec
should not change the result and it should be fine. If it is not the case, I will need to be smarter, somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand the internal representation used by to_vec should not change the result and it should be fine. If it is not the case, I will need to be smarter, somehow.
If it contains NaNs, it definitely does. This should be reusing the logic added in JuliaLang/julia#41363, in particular _triuppers_qr
, that way no numerical instabilities will be introduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure to understand. The numerical instability I am referring to are those that prevent are introduced by recomputing the QR decomposition internally:
julia> X = rand(10, 10) ;
julia> F = qr(X) ;
julia> Xnew = F.Q * F.R ;
julia> Xnew == X
false
julia> Fnew = qr(Xnew) ;
julia> Fnew == F
false
julia> Fnew.Q == F.Q
false
The tests require from_vec(to_vec(F)) == F
which is not guaranteed there.
To deal with NaNs in T
is it preferrable to change the supported version of Julia and use LinearAlgebra._triuppers_qr
or to reproduce here part of the logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just meant that by filtering out the undef values, we get both the correct behavior and avoid any numerical issues related to having to refactorize the recalculated matrix.
To deal with NaNs in T is it preferrable to change the supported version of Julia and use LinearAlgebra._triuppers_qr or to reproduce here part of the logic?
It's probably easiest if we just copy that code from there for now. Would be good to mention in a comment that this is taken from Base though and that it can be removed once we drop support for older versions of Julia.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, a good test here would be to check that to_vec(qr(A)) == to_vec(qr(A))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I incorporated that, and added a test of NaN in test_to_vec along the way.
QR must unfortunaely be tested separately for now, because it compares the T matrices that have those arbitrary values.
What is the status of this PR? I see that some checks are failing but the logs have expired. Do you need help with this? |
It is waiting for @Kolaru 's to replay to Wessel's comments. It is reasonable to make another PR to branching off there branch, then rebase and fix the tests, and the comments, and we would be able to merge it. |
Indeed it kind of went off my radar. I think it may have been partially superseeded by #156. I will try to come back to it ASAP. |
dbcbb7d
to
7e3010d
Compare
This should be ready now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The QRCompactWY case is still broken, see my comment above.
@@ -162,11 +162,24 @@ function to_vec(x::Cholesky) | |||
return x_vec, Cholesky_from_vec | |||
end | |||
|
|||
function to_vec(x::S) where {S <: Union{LinearAlgebra.QRCompactWYQ, LinearAlgebra.QRCompactWY}} | |||
x_vec, back = to_vec([x.factors, x.T]) | |||
function to_vec(x::S) where {U, S <: Union{LinearAlgebra.QRCompactWYQ{U}, LinearAlgebra.QRCompactWY{U}}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure this is valid for QRCompactWYQ
? If it is, it should at least be tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QRCompactWYQ
is represented with the same T
and factors
matrices than QRCompactWY
, yes. I added the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect, thanks!
factors, T = back(v) | ||
return S(factors, T) | ||
factors, Tback = back(v) | ||
return S(factors, Tback) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that using S
like this here will not be type stable. Since it's not super performance critical, it might not be worth all the trouble though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why is that so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test/to_vec.jl
Outdated
@test F.Q == F.Q | ||
@test F.R == F.R |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this test? Would be good to also add the to_vec(qr(A)) == to_vec(qr(A))
test I proposed above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was an oversight from me, it should read F_back.Q == F.Q
, to test that back(to_vec(qr(A)))
correctly gives back the input.
Your suggested test was on the next line, I have reformulated it to make more explicit what is tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more comment, otherwise looks good to me. Thanks for pulling through!
Co-authored-by: Simeon Schaub <simeondavidschaub99@gmail.com>
I integrated the last suggestion, that should indeed be enough to cover all cases. |
Cool, thanks for the contribution! |
Fixes #114 and should help cleaning the tests in JuliaDiff/ChainRules.jl#306.