Describe the bug
If an ARIMA model is fit with exogenous variables, in-sample predictions do not appear to depend on the X values passed to predict_in_sample. In fact, even arrays with missing columns and rows are accepted. Are the true in-sample X values saved somewhere, so they don't need to be provided? Or am I missing something here?
To Reproduce
import pmdarima as pm
from pmdarima import model_selection
import numpy as np
import pandas as pd
np.random.seed(42)
y = pm.datasets.load_wineind()
df = pd.DataFrame(
    {
        "x1": y * np.random.uniform(0, 0.5, len(y)) + np.random.randint(1, 1000, len(y)),
        "x2": y * np.random.uniform(0.5, 0.7, len(y)) + np.random.randint(1, 10000, len(y)),
    }
)
df["y"] = y
train, test = model_selection.train_test_split(df, train_size=150)
arima = pm.auto_arima(
    train["y"],
    train.drop(columns="y"),
    error_action="ignore",
    trace=True,
    suppress_warnings=True,
    maxiter=5,
    seasonal=True,
    m=12,
)
# preds1 is given the same X that was used for fitting
preds1 = arima.predict_in_sample(X=train.drop(columns="y"))
# preds2 is given an X with the correct dimensions, but values different from those used for preds1
preds2 = arima.predict_in_sample(X=train.drop(columns="y") + 1000)
# preds3 is given only x2 (x1 is dropped), and x2 is truncated to the first 10 observations
preds3 = arima.predict_in_sample(X=train[:10].drop(columns=["y", "x1"]))
len(preds1) # 150
len(preds2) # 150
len(preds3) # 150
all(preds1 == preds2) # True
all(preds2 == preds3) # True
arima.summary() # To confirm that indeed x1 and x2 are in the model
Versions
Expected Behavior
I expect the code to break if an X of incorrect dimensions is provided, and predictions to depend on the values of a correctly-dimensioned X.
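A minimal sketch of the kind of dimension check described here. The helper `validate_exog_shape` is hypothetical and not part of pmdarima; it assumes only that X exposes a `shape` attribute, as pandas DataFrames and NumPy arrays do, and that the expected number of observations and features is known to the caller.

```python
def validate_exog_shape(X, n_obs, n_features):
    """Raise ValueError unless X is 2-D with shape (n_obs, n_features).

    Hypothetical helper for illustration only, not a pmdarima API.
    X only needs a ``shape`` attribute (DataFrame, ndarray, ...).
    """
    shape = tuple(X.shape)
    if len(shape) != 2:
        raise ValueError(f"X must be 2-D, got {len(shape)}-D input")
    rows, cols = shape
    if rows != n_obs:
        raise ValueError(f"X has {rows} rows, expected {n_obs}")
    if cols != n_features:
        raise ValueError(f"X has {cols} columns, expected {n_features}")
    # Return X unchanged so the call can be chained.
    return X
```

With a check like this, the X used for preds3 in the reproduction (10 rows, 1 column) would be rejected for n_obs=150 and n_features=2, instead of being silently accepted.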
Actual Behavior
The code does not break, and there is no difference in output.
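The "no difference in output" observation comes from exact element-wise equality (all(preds1 == preds2)). A slightly more defensive comparison, sketched below, also checks the lengths and allows a floating-point tolerance. The function name `predictions_match` and its default tolerances are assumptions made for illustration.

```python
import math


def predictions_match(a, b, rel_tol=1e-9, abs_tol=0.0):
    """Return True if a and b have equal length and all pairs are close.

    Hypothetical helper. NaN never compares as close, so any NaN in
    either sequence makes the result False.
    """
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )
```

If predictions_match(preds1, preds2) and predictions_match(preds2, preds3) both return True, that would confirm the X argument had no measurable effect on the in-sample predictions.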
Additional Context
No response