
SparseVariationalApproximation cleanup #86

Merged
merged 20 commits into master on Jan 7, 2022

Conversation

@st-- (Member) commented Dec 22, 2021:

This PR unifies the code for the Centered and NonCentered versions of SparseVariationalApproximation. It might marginally change numerical behaviour, and it might be marginally slower. On the plus side, it significantly reduces the amount of code, which should make it easier to actually improve it overall.

(Motivation: I tried timing ApproximateGPs against GPflow, and when I saw how much worse it performs, I started looking more into the code. This PR simplifies the code quite a bit, but it actually seems to make it slightly slower. I think the main issue is the huge number of allocations (hundreds of megabytes per iteration for N=5000, M=500). Not sure how best to proceed, but it is a rather unsatisfactory state of things :/)
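For context on what is being unified: the Centered parameterization keeps q(u) = N(m, S) in the prior's coordinates, while the NonCentered ("whitened") one parameterizes q(ε) with u = Lε, where Kuu = LLᵀ. A minimal sketch with plain arrays (all values hypothetical; this is not the package's actual API):

```julia
using LinearAlgebra

# Hypothetical inducing-point prior covariance Kuu = L * L'.
Kuu = [4.0 1.0; 1.0 3.0]
L = cholesky(Symmetric(Kuu)).L

# Centered: variational distribution q(u) = N(m, S) in the prior's coordinates.
m = [1.0, -0.5]
S = [0.5 0.1; 0.1 0.4]

# NonCentered ("whitened"): parameterize q(ε) = N(m_ε, S_ε) with u = L * ε,
# so m = L * m_ε and S = L * S_ε * L'. Both describe the same q(u).
m_ε = L \ m
S_ε = (L \ S) / L'

@assert L * m_ε ≈ m
@assert L * S_ε * L' ≈ S
```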

src/sparse_variational.jl (resolved)
src/sparse_variational.jl (outdated, resolved)
)
Cux = cov(f.prior, inducing_points(f), x)
D = f.data.Kuu.L \ Cux
μ = Cux' * f.data.α
st-- (Member, Author) commented:

This (and line 160 below in mean_and_var) were missing the prior mean.
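A minimal numeric sketch of the bug (all arrays hypothetical stand-ins for the fields above): the posterior mean should be mean(f.prior, x) + Cux' * α, and dropping the prior-mean term is only correct for zero-mean priors.

```julia
α = [0.5, -0.2]            # hypothetical precomputed weight vector
Cux = [0.8 0.1; 0.3 0.6]   # hypothetical cross-covariance between inducing points and x
m_prior = [1.0, 2.0]       # non-zero prior mean at the test inputs

μ_buggy = Cux' * α             # what the old code computed
μ_fixed = m_prior + Cux' * α   # prior mean included

@assert μ_fixed - μ_buggy ≈ m_prior  # the two differ by exactly the prior mean
```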

m_ε = mean(sva.q)
return (tr(cov(sva.q)) + m_ε'm_ε - length(m_ε) - logdet(post.data.C_ε)) / 2
st-- (Member, Author) commented:

This was unnecessarily making a copy (see JuliaStats/Distributions.jl#1373).
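For reference, the line above computes the closed form KL(N(m_ε, S) ‖ N(0, I)) = (tr(S) + m_εᵀm_ε − k − logdet S) / 2, assuming logdet(post.data.C_ε) equals logdet(cov(sva.q)) (in Julia, logdet of a Cholesky object is the logdet of the full matrix). A self-contained sketch of the formula:

```julia
using LinearAlgebra

# KL(N(m, S) ‖ N(0, I)) in closed form, matching the snippet above.
kl_to_std_normal(m, S) = (tr(S) + dot(m, m) - length(m) - logdet(Symmetric(S))) / 2

# Sanity checks: zero when q is exactly N(0, I), positive otherwise.
@assert abs(kl_to_std_normal(zeros(2), Matrix(1.0I, 2, 2))) < 1e-12
@assert kl_to_std_normal([1.0, -0.5], [0.5 0.1; 0.1 0.4]) > 0
```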

st-- and others added 3 commits December 23, 2021 00:00
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@willtebbutt (Member) commented:

I'm not too surprised that this isn't hugely performant at the minute, given how little attention we've paid to the performance of this package so far. How would you like to proceed @st-- ? Ignore performance in this PR and look at it in some subsequent work?

@st-- (Member, Author) commented Dec 23, 2021:

> I'm not too surprised that this isn't hugely performant at the minute, given how little attention we've paid to the performance of this package so far. How would you like to proceed @st-- ? Ignore performance in this PR and look at it in some subsequent work?

I'm open to discussion. I do think this PR significantly simplifies the code (and removes a lot of near duplicate code), which will make it easier to change "big things". We might want to split it up into separate methods for Centered/NonCentered again later. But I think for now, this PR makes the package more maintainable. (It also fixes a subtle bug for non-zero mean functions. We could probably do with more tests...).

@st-- (Member, Author) commented Dec 23, 2021:

The Laplace test failures are not related to this PR and fixed in #85.

If you have any ideas on the others, help welcome!

test/sparse_variational.jl (outdated, resolved)
test/sparse_variational.jl (outdated, resolved)
st-- and others added 4 commits December 23, 2021 19:43
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@st-- marked this pull request as ready for review December 31, 2021 15:08
@codecov (bot) commented Jan 4, 2022:

Codecov Report

Merging #86 (b6d9026) into master (df0010d) will decrease coverage by 0.53%.
The diff coverage is 93.18%.


@@            Coverage Diff             @@
##           master      #86      +/-   ##
==========================================
- Coverage   90.13%   89.60%   -0.54%     
==========================================
  Files           4        4              
  Lines         294      279      -15     
==========================================
- Hits          265      250      -15     
  Misses         29       29              
Impacted Files Coverage Δ
src/sparse_variational.jl 96.20% <93.02%> (-0.64%) ⬇️
src/utils.jl 42.85% <100.00%> (+9.52%) ⬆️


@willtebbutt (Member) left a review:

I'm happy for this to go in without further review on my part once

  1. my comments have been addressed
  2. the patch version has been bumped

f::ApproxPosteriorGP{<:SparseVariationalApproximation{NonCentered}}, x::AbstractVector
)
return mean(f.prior, x) + _A(f, x)' * mean(f.approx.q)
function _A_and_Cux(f, x)
willtebbutt (Member) commented:

At the risk of being a massive pedant, could we please rename Cux to Czx, i.e. consistently referring to inputs rather than random variables?

)
return mean(f.prior, x) + cov(f.prior, x, inducing_points(f)) * f.data.α
end

# Produces a matrix that is consistently referred to as A in this file. A more descriptive
willtebbutt (Member) commented:

Maybe update this to reflect the new functionality?

st-- (Member, Author) commented:

done

@rossviljoen (Collaborator) left a review:

Very nice PR, thanks!

Just a couple of very minor things.

@@ -11,3 +11,5 @@ end

_chol_cov(q::AbstractMvNormal) = cholesky(Symmetric(cov(q)))
_chol_cov(q::MvNormal) = cholesky(q.Σ)

_cov(q::MvNormal) = q.Σ
rossviljoen (Collaborator) commented:

Suggested change:
-_cov(q::MvNormal) = q.Σ
+_cov(q::AbstractMvNormal) = cov(q)
+_cov(q::MvNormal) = q.Σ

st-- (Member, Author) commented:

Hm, it's kinda a hack in the first place - so I think I'd rather have _cov(q) error when called with anything else, and see why... no?

rossviljoen (Collaborator) commented:

@st-- I don't get why it should error - shouldn't you just be able to use any AbstractMvNormal for sva.q?
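The pattern under discussion is a generic fallback plus a copy-avoiding specialization. Sketched here without Distributions.jl (the types and the cov stand-in are hypothetical):

```julia
# Generic fallback + specialization via multiple dispatch, with stand-in types.
abstract type AbstractMvN end

struct MvN <: AbstractMvN
    Σ::Matrix{Float64}
end

# Stand-in for a generic `cov` accessor, which returns a defensive copy.
generic_cov(q::MvN) = copy(q.Σ)

_cov(q::AbstractMvN) = generic_cov(q)  # fallback: works for any subtype
_cov(q::MvN) = q.Σ                     # specialization: no copy for the concrete type

q = MvN([1.0 0.2; 0.2 2.0])
@assert _cov(q) === q.Σ  # the specialized method is selected; the field itself is returned
```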

src/sparse_variational.jl (outdated, resolved)
@st-- merged commit fee8c3d into master Jan 7, 2022
@st-- deleted the st/sva_cleanup branch January 7, 2022 21:29
@st-- mentioned this pull request Jan 11, 2022
3 participants