Discretization methods as a general type of object/function that takes Continuous Distributions #1091

Open
sbenthall opened this issue Dec 16, 2021 · 5 comments

Comments

@sbenthall
Contributor

The same discretization method can be applied to multiple kinds of continuous distribution.

A continuous distribution's approximation method should take one of these discretization methods as an argument, and the returned discrete distribution should record the method used, in addition to the original continuous distribution type.

Related to #949
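The proposal above could be sketched roughly as follows. This is only an illustration, not existing HARK code: `DiscreteApproximation`, `equiprobable`, and `approx` are hypothetical names, and the rule used here (each atom is the median of its equal-probability bin) is an assumption chosen for brevity. The only library calls relied on are `scipy.stats` frozen distributions and their `ppf` method.

```python
from dataclasses import dataclass

import numpy as np
from scipy import stats


@dataclass(frozen=True)
class DiscreteApproximation:
    """Hypothetical container: the atoms plus a record of where they came from."""
    atoms: np.ndarray
    probs: np.ndarray
    method: str      # name of the discretization method that was applied
    source: object   # the frozen scipy.stats distribution that was discretized


def equiprobable(dist, n):
    """Split the support into n equal-probability bins; use each bin's median as its atom."""
    levels = (np.arange(n) + 0.5) / n
    return dist.ppf(levels), np.full(n, 1.0 / n)


def approx(dist, method, **kwargs):
    """Apply any discretization method to any continuous distribution exposing ppf."""
    atoms, probs = method(dist, **kwargs)
    return DiscreteApproximation(atoms, probs, method.__name__, dist)


# The same method is applied, unchanged, to two kinds of continuous distribution.
lognormal = approx(stats.lognorm(s=0.1), equiprobable, n=7)
normal = approx(stats.norm(loc=0.0, scale=1.0), equiprobable, n=7)
```

Because the method is an ordinary argument, adding another discretization rule would not require touching the distribution classes.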

@llorracc
Collaborator

Let me expand on this a bit.

I feel strongly that we should avoid creating our own objects with names like "LogNormalDistribution", because there are very high-quality external libraries that have already done a great job of that. Where possible, we should use existing libraries, and where possible we should avoid choosing names for our objects that are identical to names in external libraries.

I'm a big fan of the way Pablo has handled this in Dolo: The "true" distribution (lognormal, or whatever) is declared in the definition of the "true" model. Then later, in what corresponds to our "solve" stage, the user is required to instruct Dolo to "discretize" the object whose abstract properties have been carried around until then. That is the point at which things like "method=equiprobable" and "gridpoints=13" and "truncationpoints={-2 sigma, 2 sigma}" or whatever get specified.

My main desideratum has been to ensure that, AFTER the numerical discretization of the distribution has been created, it is possible to reproduce it. That makes it possible to recover, for example, the fact that the discretization came from a lognormal truncated at {-2 sigma, 2 sigma} (not, say, a normal truncated at {-3 sigma, 4 sigma}).
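One way to meet that requirement might be to carry a small, serializable record alongside the discrete points. The sketch below is an assumption about how that could look, not how Dolo or HARK does it: `DiscretizationSpec` and `discretize` are hypothetical names, the truncation points are taken to be 2 sigma either side of zero in log units, and only the equiprobable method is handled.

```python
import json
from dataclasses import asdict, dataclass

import numpy as np
from scipy import stats


@dataclass(frozen=True)
class DiscretizationSpec:
    """Hypothetical record of everything needed to rebuild a discretization."""
    family: str    # name of a scipy.stats distribution, e.g. "lognorm"
    params: dict   # keyword arguments for that distribution
    method: str    # only "equiprobable" is handled in this sketch
    n: int         # number of gridpoints
    lower: float   # truncation points, in the units of the variable itself
    upper: float


def discretize(spec):
    if spec.method != "equiprobable":
        raise ValueError(f"unsupported method: {spec.method}")
    dist = getattr(stats, spec.family)(**spec.params)
    # Restrict to the probability mass lying between the truncation points.
    p_lo, p_hi = dist.cdf(spec.lower), dist.cdf(spec.upper)
    levels = p_lo + (np.arange(spec.n) + 0.5) / spec.n * (p_hi - p_lo)
    return dist.ppf(levels), np.full(spec.n, 1.0 / spec.n)


sigma = 0.1
spec = DiscretizationSpec(
    family="lognorm", params={"s": sigma}, method="equiprobable", n=13,
    lower=float(np.exp(-2 * sigma)), upper=float(np.exp(2 * sigma)),
)
atoms, probs = discretize(spec)

# Round-trip the record through JSON and rebuild the identical grid from it.
restored = DiscretizationSpec(**json.loads(json.dumps(asdict(spec))))
atoms_again, _ = discretize(restored)
```

Storing the record rather than only the atoms is what lets a truncated lognormal be told apart from a truncated normal after the fact.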

@sbenthall
Contributor Author

pgmpy (probabilistic graphical models - py) has support for discretization of continuous random variables.

It looks like they've built their framework solidly on scipy.stats.

https://pgmpy.org/factors/discretize.html

@llorracc
Collaborator

llorracc commented Feb 9, 2022

This looks promising. Browsing their documentation, as best I can tell the user is required to provide the discretization points. Even if that's true, adopting this would be a big step up in standardizing what we are doing. (We already calculate the points for the equiprobable discretization of a lognormal; we then just do an amateurish "good enough" job of translating those into point masses. So it seems like it would be an unambiguous improvement to feed them instead to a more sophisticated infrastructure like this, and that is something that could be done by somebody with no knowledge of economics, just a solid grasp of probability and statistics and good programming hygiene.)
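For concreteness, the step of turning equiprobable bin edges into point masses can be sketched with `scipy.stats` alone. This does not use pgmpy and is not HARK's current code: `equiprobable_point_masses` is a hypothetical helper, and placing each atom at the conditional mean of its bin is one reasonable choice among several.

```python
import numpy as np
from scipy import stats


def equiprobable_point_masses(dist, n):
    """Return bin edges, atoms, and probabilities for n equal-probability bins."""
    # Edges are the quantiles at 0, 1/n, ..., 1 (the outer edges may be infinite).
    edges = dist.ppf(np.linspace(0.0, 1.0, n + 1))
    # Each atom is E[X | lo <= X <= hi], computed by numerical integration.
    atoms = np.array([
        dist.expect(lb=lo, ub=hi, conditional=True)
        for lo, hi in zip(edges[:-1], edges[1:])
    ])
    return edges, atoms, np.full(n, 1.0 / n)


dist = stats.lognorm(s=0.1)
edges, atoms, probs = equiprobable_point_masses(dist, n=7)

# With conditional-mean atoms, the discrete mean matches the continuous mean
# up to integration error.
mean_gap = abs(np.dot(atoms, probs) - dist.mean())
```

Using conditional means preserves the first moment of the original distribution, which is one property a "good enough" translation might not guarantee.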

@sbenthall
Contributor Author

Consider: discretization of a continuous distribution as a "bijection" or a "transformation" of a distribution, a la TensorFlow Probability and Mathematica/SymPy.

@sbenthall
Contributor Author

Another possible library for continuous distributions: http://pyro.ai/


2 participants