predict fails with Variational Inference #572

Closed
omrihar opened this issue Oct 10, 2022 · 4 comments

omrihar commented Oct 10, 2022

I'm trying to use predict on out-of-sample data after fitting with variational inference and sampling from the approximation. When I call predict, I get an "unexpected dimensions" error.

Minimal code to reproduce:

import pandas as pd
import bambi as bmb

test_model = bmb.Model('y ~ x', pd.DataFrame({'x': [1,2,3], 'y': [2, 3, 4]}))
test_approx = test_model.fit(inference_method='vi')
test_tr = test_approx.sample(draws=1000)

test_model.predict(idata=test_tr, data=pd.DataFrame({'x': [4, 5], 'y': [1,1]}))
Collaborator

tomicapretto commented Oct 11, 2022

There's definitely something going on. The posterior for the predictor x has an extra dimension when it shouldn't.

image

I couldn't reproduce the issue with PyMC alone, so it must be something about how Bambi builds the model.

Then I looked into the random variables in the underlying PyMC model and found the problem:

test_model.backend.model.free_RVs
# [Intercept ~ N(3, 5.4), x ~ N(0, 2.5), y_sigma ~ HalfStudentT(4, 0.816)]
test_model.backend.model.free_RVs[1].shape
# TensorConstant{(1,) of 1}

This means that the distribution assigned to the slope is one-dimensional, with shape (1,), when it should be scalar. That's why an extra dimension appears.

If we look at the equivalent PyMC model, this does not happen:

import pymc as pm

x = [1, 2, 3]
y = [2, 3, 4]

with pm.Model() as pm_model:
    a = pm.Normal("a")
    b = pm.Normal("b")
    pm.Normal("y", mu=a + b * x, sigma=pm.HalfNormal("sigma"), observed=y)
pm_model.free_RVs
# [a ~ N(0, 1), b ~ N(0, 1), sigma ~ N**+(0, 1)]
pm_model.free_RVs[1].shape
# TensorConstant{[]}
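
To illustrate why a length-1 coefficient can end up producing unexpected dimensions downstream, here is a minimal NumPy sketch. It is not Bambi's actual predict code, and the error reported above may arise differently; the array names and the number of draws are made up for the example. It only shows how a trailing axis of length 1 on the slope draws changes the broadcast result of a linear predictor.

```python
import numpy as np

chains, draws = 1, 50              # illustrative sizes, not taken from the issue
rng = np.random.default_rng(0)
x_new = np.array([4.0, 5.0])       # out-of-sample predictor values

intercept = rng.normal(size=(chains, draws))

# Scalar slope: one value per (chain, draw).
slope_scalar = rng.normal(size=(chains, draws))
mu_ok = intercept[..., None] + slope_scalar[..., None] * x_new

# Slope declared with shape (1,): every draw carries a trailing axis of length 1.
slope_1d = slope_scalar[..., None]  # shape (chains, draws, 1)
mu_bad = intercept[..., None] + slope_1d[..., None] * x_new

print(mu_ok.shape)   # (1, 50, 2)
print(mu_bad.shape)  # (1, 50, 50, 2)

# Dropping the spurious axis recovers the expected result.
mu_fixed = intercept[..., None] + slope_1d.squeeze(-1)[..., None] * x_new
```

In the second case, code written for a scalar slope silently broadcasts into a four-dimensional array instead of the expected (chain, draw, observation) layout.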

Author

omrihar commented Oct 19, 2022

Is there a way to circumvent this until the PR gets merged? For example, changing the dimensionality of the scalars after fitting, so that the predict method would work?

Collaborator

tomicapretto commented Oct 19, 2022

I don't know. But if you don't want to wait until the PR is merged, you can patch your local copy of Bambi by changing only these lines

shape = None if data.shape[1] == 1 else data.shape[1]
coef = distribution(label, shape=shape, **args)
coef = at.atleast_1d(coef)

in line 51 of bambi/backend/terms.py.

See the screenshot

image

All the other changes are related to predictions
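
As a rough illustration of the logic in those lines, here is a NumPy analogy. It is only a sketch: `coef_shape` is a hypothetical helper, `rng.normal` stands in for the PyMC distribution call, and `np.atleast_1d` stands in for `at.atleast_1d`. The idea is that a single-column design matrix gets a scalar coefficient (shape `None`), which is then promoted to a vector so the downstream product works the same way in both cases.

```python
import numpy as np

def coef_shape(data):
    """Hypothetical helper mirroring the patched expression:
    None (a scalar coefficient) when the design matrix has one column."""
    return None if data.shape[1] == 1 else data.shape[1]

one_col = np.array([[1.0], [2.0], [3.0]])                 # single predictor
two_col = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]])  # two columns

rng = np.random.default_rng(0)
results = []
for data in (one_col, two_col):
    shape = coef_shape(data)
    coef = rng.normal(size=shape)  # size=None gives a scalar draw
    coef = np.atleast_1d(coef)     # promote the scalar to a length-1 vector
    mu = data @ coef               # one value per observation
    results.append((shape, coef.shape, mu.shape))
```

With this arrangement the variable itself is declared as a scalar, so its draws carry no extra axis, while the promoted copy still supports the matrix product.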

@tomicapretto
Collaborator

Since #575 is merged, we can close this one. Thanks for reporting the issue @omrihar!
