Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Symbolic RVs #6072

Merged
merged 13 commits into from
Sep 5, 2022
Merged

Implement Symbolic RVs #6072

merged 13 commits into from
Sep 5, 2022

Conversation

ricardoV94
Copy link
Member

@ricardoV94 ricardoV94 commented Aug 26, 2022

This PR wraps the graphs returned by the old SymbolicDistribution.rv_op in a SymbolicRV OpFromGraph Op so that they can be manipulated like vanilla RandomVariables. The main benefit is that we can now:

  1. Change their size, which is necessary for creating graphs with nested SymbolicRVs (nested mixtures, censored mixtures, AR with censored init_dist, you name it)
  2. Dispatch moment functions, which we noted was necessary in Incorporate AePPL into GaussianRandomWalk #5762

When these OpFromGraph don't have a specialized logprob dispatched, they can be inlined inside Aeppl, so that their logprob can be derived dynamically. This was one of the main reasons for creating SymbolicDistributions in the first place.

Closes #5533

Checklist

Major / Breaking Changes

  • ...

Bugfixes / New features

  • Implemented general univariate RandomWalk distribution factory
  • Distribution factories like Mixture, Censored, RandomWalk, and AR can now be composed together
  • Reintroduced support for nested Mixtures

Docs / Maintenance

  • Implemented SymbolicRVs to encapsulate graphs from distribution factories
  • Removed temporary SymbolicDistribution class in favor of Distribution
  • change_rv_size was removed in favor of the more general change_dist_size
  • _LKJCholeskyCov and GaussianRandomWalk are now SymbolicRVs

@codecov
Copy link

codecov bot commented Aug 26, 2022

Codecov Report

Merging #6072 (b1f705d) into main (4c4d71c) will decrease coverage by 2.06%.
The diff coverage is 95.75%.

❗ Current head b1f705d differs from pull request most recent head 8653d18. Consider uploading reports for the commit 8653d18 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #6072      +/-   ##
==========================================
- Coverage   91.74%   89.67%   -2.07%     
==========================================
  Files          86       72      -14     
  Lines       17692    12953    -4739     
==========================================
- Hits        16232    11616    -4616     
+ Misses       1460     1337     -123     
Impacted Files Coverage Δ
pymc/distributions/__init__.py 100.00% <ø> (ø)
pymc/sampling.py 82.13% <ø> (-0.02%) ⬇️
pymc/printing.py 86.08% <92.59%> (-0.58%) ⬇️
pymc/distributions/shape_utils.py 97.81% <93.33%> (-1.47%) ⬇️
pymc/distributions/timeseries.py 78.31% <94.04%> (+0.32%) ⬆️
pymc/aesaraf.py 92.92% <95.12%> (+0.73%) ⬆️
pymc/distributions/multivariate.py 92.30% <96.15%> (+0.06%) ⬆️
pymc/distributions/censored.py 92.30% <100.00%> (-0.72%) ⬇️
pymc/distributions/distribution.py 95.94% <100.00%> (+4.92%) ⬆️
pymc/distributions/mixture.py 95.37% <100.00%> (-0.35%) ⬇️
... and 22 more

@ricardoV94 ricardoV94 force-pushed the symbolic_rv branch 3 times, most recently from 825e90b to be4de60 Compare August 26, 2022 18:46
Copy link
Contributor

@lucianopaz lucianopaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work Ricardo! I skimmed through things and left a bunch of comments. Once you think this isn’ta draft anymore I’ll do a more thorough review of the tests. Did anything change in aeppl or aesara to enable this?

@ricardoV94
Copy link
Member Author

Once you think this isn’t a draft anymore I’ll do a more thorough review of the tests.

I'll re-request your review then.

Did anything change in aeppl or aesara to enable this?

Nope. It's all PyMC insfrastructure.

Copy link
Member

@larryshamalama larryshamalama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing @ricardoV94! 🙂 I feel like having an explicit SymbolicRandomVariable class would make the code structure of symbolic distributions much more clear.

Would #5931 need to be first merged before this PR?

@ricardoV94 ricardoV94 force-pushed the symbolic_rv branch 2 times, most recently from 7ad9621 to 7fcaa2a Compare August 29, 2022 10:20
@ricardoV94 ricardoV94 marked this pull request as ready for review August 29, 2022 10:21
@ricardoV94
Copy link
Member Author

Would #5931 need to be first merged before this PR?

I don't think so

@ricardoV94
Copy link
Member Author

The failing test_profile_variable is odd, and looks unrelated.

Copy link
Member

@junpenglao junpenglao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work Ricardo! I like that everything is cleaner now.

Copy link
Member

@larryshamalama larryshamalama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asking some more questions 😅

ricardoV94 and others added 12 commits September 5, 2022 10:50
* This meant that the intended checks were not being run
This allows us to treat these variables as if they were pure RandomVariables (dispatch moments, resize, etc.)

Co-authored-by: Michael Osthege <michael.osthege@outlook.com>
* change_rv_size is now called change_dist_size
* Display the dists and their inputs used by SymbolicDistributions
* Also fixes index error in AR when init_dist is scalar and size is None
* Remove unused rv_class attribute
This is necessary because the distribution is in fact a "Factory distribution" that needs to be able to resize `sd_dist` dynamically in order to work correctly.
* Implements distribution agnostic univariate RandomWalk
This risk is present when resizing a RandomVariable, whose new size depends on the size of the original RandomVariable. This can lead to wrong update expressions for the reused RNG variables that still depended on the original RandomVariable.

Similarly, there is a risk when clone-replacing RandomVariables, as the cloned variables will share the same RNGs. This happened when attempting to use Metropolis with Simulator variables.

* change_rv_size does not reuse old RNG
* compile_pymc raises if distinct update expression are inferred for the same RNG
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reimplement nested Mixtures
5 participants