Add MvNormalMeanScalePrecision distribution #206
base: main
Conversation
ping @Nimrais
First, great work on implementing the MvNormalMeanScalePrecision distribution and integrating it into ExponentialFamily.jl.
```julia
    return prod(BayesBase.default_prod_rule(wleft, wright), wleft, wright)
end

function BayesBase.rand(rng::AbstractRNG, dist::MvGaussianMeanScalePrecision{T}) where {T}
```
```julia
function BayesBase.rand(rng::AbstractRNG, dist::MvGaussianMeanScalePrecision{T}) where {T}
    μ, γ = mean(dist), scale(dist)
    # the std is 1 / sqrt(γ), since the covariance is (1 / γ) * I
    return μ .+ (1 / sqrt(γ)) .* randn(rng, T, length(μ))
end
```
Avoid constructing the identity matrix `I(length(μ))` and directly scale the random vector. Broadcasting with `.` is more efficient and avoids unnecessary allocations.
```julia
# FIXME: This is not the most efficient way to generate random samples within container
# it needs to work with scale method, not with std
function BayesBase.rand!(
```
Similarly to rand:
```julia
function BayesBase.rand!(rng::AbstractRNG, dist::MvGaussianMeanScalePrecision, container::AbstractArray{T}) where {T <: Real}
    μ, γ = mean(dist), scale(dist)
    randn!(rng, container)
    # scale by the std 1 / sqrt(γ), not by the variance 1 / γ
    @. container = μ + (1 / sqrt(γ)) * container
    return container
end
```
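A quick cross-check of the scaling, independent of the Julia code: with precision scale γ the covariance is (1/γ)·I, so the standard normals must be scaled by 1/√γ (scaling by 1/γ would give variance 1/γ² per component). A NumPy sketch with illustrative values:

```python
import numpy as np

rng = np.random.default_rng(42)
mu = np.array([1.0, -2.0, 0.5])  # illustrative mean
gamma = 4.0                      # illustrative precision scale
n = 200_000

# x = mu + (1/sqrt(gamma)) * z  with  z ~ N(0, I)  gives  Cov[x] = (1/gamma) * I
x = mu + (1.0 / np.sqrt(gamma)) * rng.standard_normal((n, mu.size))
cov = np.cov(x, rowvar=False)
assert np.allclose(cov, np.eye(mu.size) / gamma, atol=0.02)
```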
Btw, I think rand just needs to reuse rand!.
test/distributions/normal_family/mv_normal_mean_scale_precision_tests.jl
Force-pushed from 3812a9b to 77f4a0d (…recision efficency test fix: remove unneeded code fix: remove not needed stuff fix: remove unused code test: add efficency test fix: return distributions_setuptests to HEAD test(fix): typo test(fix): remove unneeded testset test(fix): update efficency test)
@bvdmitri I think the PR is ready for review, but I need some help with an efficient implementation of the Fisher information. The only tests that are failing are the ones checking that the Fisher information in this parametrisation is really faster.
Thanks for refactoring this, @Nimrais!
```julia
@test_opt cholinv(fi_full)

cholinv_time_small = @elapsed cholinv(fi_small)
cholinv_alloc_small = @allocated fisherinformation(ef_small)
```
I think this is supposed to be cholinv?
```diff
- cholinv_alloc_small = @allocated fisherinformation(ef_small)
+ cholinv_alloc_small = @allocated cholinv(ef_small)
```
All the failing tests involve cholinv_time_small, so it might be the reason for those failures. Sorry, it's alloc here; perhaps cholinv for the BlockArray is not that efficient? I can also take a look.
Maybe we have a fast algorithm to compute the Cholesky factorization for this kind of matrix? It looks like a rank-1 update D + u'u, so perhaps the Cholesky can be done super fast, and then its inv too. If this is the case, we could add this type of matrix in BayesBase, for example, and write a specialized method for fastcholesky in FastCholesky.jl.
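For context, if the matrix really is diagonal plus a rank-1 term, the textbook rank-one Cholesky update costs O(k²) instead of the O(k³) of a fresh factorization. A hedged NumPy sketch of that classic algorithm (not the BayesBase/FastCholesky API, values illustrative):

```python
import numpy as np

def chol_update(L, u):
    """Rank-one Cholesky update: returns L2 with L2 @ L2.T == L @ L.T + u u^T."""
    L, u = L.copy(), u.copy()
    n = len(u)
    for i in range(n):
        r = np.hypot(L[i, i], u[i])          # new diagonal entry
        c, s = r / L[i, i], u[i] / L[i, i]   # Givens-like rotation parameters
        L[i, i] = r
        if i + 1 < n:
            L[i + 1:, i] = (L[i + 1:, i] + s * u[i + 1:]) / c
            u[i + 1:] = c * u[i + 1:] - s * L[i + 1:, i]
    return L

d = np.array([2.0, 3.0, 5.0])       # illustrative diagonal D
u = np.array([0.4, -0.2, 0.7])      # illustrative rank-1 vector
L0 = np.diag(np.sqrt(d))            # Cholesky of D is trivial
L1 = chol_update(L0, u)             # O(k^2) update to the Cholesky of D + u u^T
assert np.allclose(L1 @ L1.T, np.diag(d) + np.outer(u, u))
```

Julia's LinearAlgebra stdlib exposes the same operation as `lowrankupdate` on a `Cholesky` factorization, which could back such a specialized `fastcholesky` method.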
Yes, I suppose you have the right idea. Yesterday evening I asked a question in the Julia Slack about this matrix, and the Woodbury matrix identity was suggested to me. I found that it's a diagonal with a rank-2 update.
```julia
using LinearAlgebra  # for I

A = zeros(k+1, k+1)
A[1:k, 1:k] .= -inv(2*η2) * I(k)
A[k+1, k+1] = η2_part
U = zeros(k+1)
V = zeros(k+1)
U[k+1] = 1
V[1:k] = η1 * inv(2*η2^2)
V[k+1] = 0
M = A + U * V' + V * U'
```
Applying the Woodbury identity:
```julia
A_inv = zeros(k+1, k+1)
A_inv[1:k, 1:k] .= -2*η2 * I(k)
A_inv[k+1, k+1] = inv(A[k+1, k+1])

# Construct U and V
U = zeros(k+1, 2)
V = zeros(k+1, 2)
U[k+1, 1] = 1
U[1:k, 2] = η1 * inv(2*η2^2)

# Construct C and compute its inverse
C = [0 1; 1 0]
C_inv = inv(C)
S = C_inv + U' * A_inv * U

# Compute S_inv
S_inv = inv(S)

# Compute fisher_inv using the Woodbury identity
fisher_inv = A_inv - A_inv * U * S_inv * U' * A_inv
```
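For what it's worth, the algebra above checks out numerically. Below is a NumPy transcription that verifies the Woodbury-based inverse against a direct inverse; k, η1, η2, and η2_part are arbitrary illustrative values (η2 negative, as in the natural parametrization), not taken from the actual Fisher matrix:

```python
import numpy as np

k = 3
rng = np.random.default_rng(0)
eta1 = rng.normal(size=k)   # illustrative first natural parameter
eta2 = -1.5                 # illustrative scalar second natural parameter
eta2_part = 0.7             # placeholder for the (k+1, k+1) entry

# M = A + U V' + V U'  -- a diagonal matrix plus a rank-2 update
A = np.zeros((k + 1, k + 1))
A[:k, :k] = -1.0 / (2 * eta2) * np.eye(k)
A[k, k] = eta2_part
U = np.zeros(k + 1)
V = np.zeros(k + 1)
U[k] = 1.0
V[:k] = eta1 / (2 * eta2**2)
M = A + np.outer(U, V) + np.outer(V, U)

# Woodbury with packed factors: M = A + W C W',  C = [[0, 1], [1, 0]]
A_inv = np.zeros((k + 1, k + 1))
A_inv[:k, :k] = -2 * eta2 * np.eye(k)       # inverse of the diagonal block
A_inv[k, k] = 1.0 / A[k, k]
W = np.zeros((k + 1, 2))
W[k, 0] = 1.0
W[:k, 1] = eta1 / (2 * eta2**2)
C = np.array([[0.0, 1.0], [1.0, 0.0]])
S = np.linalg.inv(C) + W.T @ A_inv @ W
M_inv = A_inv - A_inv @ W @ np.linalg.inv(S) @ W.T @ A_inv

assert np.allclose(M_inv, np.linalg.inv(M))  # Woodbury inverse matches inv(M)
```

The expensive step is the 2×2 solve for S, so the whole inverse is O(k²) rather than the O(k³) of a dense inversion.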
This package was suggested to me to use once I have the decomposition: https://github.com/JuliaLinearAlgebra/WoodburyMatrices.jl. However, I am a bit hesitant to pull it in for just one matrix.
…n_tests.jl Co-authored-by: Bagaev Dmitry <bvdmitri@gmail.com>
This PR was initially aimed at addressing ReactiveBayes/ReactiveMP.jl#387, which it still does. The distribution in question is parametrized by the mean and scale parameter of the precision matrix.
Initially, I implemented it as part of the MultivariateNormalDistributionsFamily. However, the conversions between MvNormalMeanScalePrecision and other distributions in this "class" don't always hold.

During the process, @Nimrais suggested that this distribution could be particularly interesting for ExponentialFamilyProjections.jl. To make it more useful, we need to optimize the methods related to the computation of the Fisher information matrix. I made a first attempt to improve performance by modifying the computation of kron(invη2, invη2). I believe further improvements are possible, but this serves as a starting point.

Any suggestions for additional optimizations to enhance the distribution's effectiveness are most welcome.
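One observation on the kron(invη2, invη2) point: when invη2 is a scaled identity, as in this parametrization, the Kronecker product of two scaled identities is itself a scaled identity, so it never needs to be materialized. A small NumPy illustration (γ and k are arbitrary assumed values):

```python
import numpy as np

gamma, k = 2.5, 3
inv_eta2 = np.eye(k) / gamma            # assumed scaled-identity parameter
full = np.kron(inv_eta2, inv_eta2)      # O(k^4) memory if materialized
assert np.allclose(full, np.eye(k * k) / gamma**2)  # a scaled identity again
```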
UPD: I added the piece of code that actually fixes ReactiveBayes/ReactiveMP.jl#387.