
Added validation methods #290

Closed · Thomas9292 opened this issue on Jan 26, 2021 · 5 comments
Labels: enhancement (New feature or request)
@Thomas9292

Even though the documentation describes several ways of validating the results of metalearners, it is still hard to estimate how trustworthy those results are.

My question is: would it be possible to perform some sort of hold-out validation on the outcome variable? Intuitively, metalearners arrive at a treatment effect estimate by predicting the outcome under both treatment options, so it should be possible to predict the outcome for a hold-out set. In my understanding, traditional accuracy metrics could then be applied to evaluate the model.

This will of course not give any insight into the accuracy of the predictions for unobserved (counterfactual) outcomes, but it would allow for more confidence in the model if the predictions for the observed outcomes are at least somewhat accurate.

What do you think of this method? Would it be worth implementing somehow?

@Thomas9292 added the enhancement label on Jan 26, 2021
@ppstacy (Collaborator) commented on Jan 26, 2021

Hi @Thomas9292, thanks for using CausalML. I think it definitely makes sense to do this validation. To serve this purpose, in our example notebook for meta-learners here (Part B) we did the validation and model-performance comparison on a 20% hold-out dataset with different evaluation metrics (e.g., MSE, KL divergence, AUUC). Please take a look and let me know if you have any questions.
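
For readers who want the shape of such a check outside the notebook's synthetic setup, here is a minimal sketch (not the notebook's code) of scoring a meta-learner on a 20% hold-out split with AUUC and a cumulative-gain plot from `causalml.metrics`. `X`, `treatment`, and `y` are placeholder names for your own feature matrix, 0/1 treatment flag, and outcome.

```python
# Not the notebook's code: a minimal hold-out AUUC/gain sketch.
# `X`, `treatment`, and `y` are placeholders for your own data.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from causalml.inference.meta import XGBTRegressor
from causalml.metrics import auuc_score, plot_gain

X_train, X_test, w_train, w_test, y_train, y_test = train_test_split(
    X, treatment, y, test_size=0.2, random_state=42)

learner = XGBTRegressor()
learner.fit(X=X_train, treatment=w_train, y=y_train)
tau_hat = learner.predict(X_test)  # estimated CATE on the hold-out set

# Every column other than the outcome/treatment columns is treated as a model
# score column, so several learners can be compared in the same frame.
eval_df = pd.DataFrame({
    'xgb_t_learner': tau_hat.flatten(),
    'w': np.asarray(w_test),
    'y': np.asarray(y_test),
})
print(auuc_score(eval_df, outcome_col='y', treatment_col='w', normalize=True))
plot_gain(eval_df, outcome_col='y', treatment_col='w')
```

As far as I can tell, when no true treatment-effect column is available the cumulative gain is computed from the observed outcomes of treated vs. control units ranked by the model score, so this kind of check also works on non-synthetic data.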

@baendigreydo

Hi all, thank you for sharing the CausalML package!
I am a complete beginner at coding and am trying to finish my thesis, so apologies in advance for a newbie question:
Regarding Part B of the example notebook: how can I calculate these summary metrics (Abs % Error of ATE | MSE | KL Divergence) for non-synthetic data (e.g., the Hillstrom data)? I got lost at this point, unfortunately.
Also, is there any way to calculate Qini values or plot gain/lift/Qini curves from UpliftTrees? An answer to that would help me tremendously!

@ppstacy (Collaborator) commented on Feb 10, 2021

Hi @baendigreydo, unfortunately we don't have functions right now that let you calculate those metrics for non-synthetic data directly, but you can reference the code here and adapt it to generate them yourself.

As shown in this notebook, you can calculate and plot gain/lift/Qini curves. Please let us know if you have any questions.
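
For the UpliftTree part of the question, a rough sketch of the plotting/scoring step (placeholder names, not the notebook's exact code; `uplift_hat` is assumed to be a 1-d array of uplift scores from an already-fitted uplift tree or forest, and `df_test` a hold-out frame with a 0/1 treatment flag and binary outcome):

```python
# Rough sketch. Assumes `uplift_hat` (1-d array of predicted uplift scores on
# the hold-out set) and `df_test` with columns 'is_treated' (0/1) and
# 'conversion' (observed outcome) already exist.
import pandas as pd
from causalml.metrics import plot_gain, plot_lift, plot_qini, qini_score, auuc_score

eval_df = pd.DataFrame({
    'uplift_tree': uplift_hat,                    # model score column
    'is_treated': df_test['is_treated'].values,   # 1 = treated, 0 = control
    'conversion': df_test['conversion'].values,   # observed outcome
})

plot_gain(eval_df, outcome_col='conversion', treatment_col='is_treated')
plot_lift(eval_df, outcome_col='conversion', treatment_col='is_treated')
plot_qini(eval_df, outcome_col='conversion', treatment_col='is_treated')

# Single-number summaries for model selection
print(qini_score(eval_df, outcome_col='conversion', treatment_col='is_treated'))
print(auuc_score(eval_df, outcome_col='conversion', treatment_col='is_treated'))
```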

@Thomas9292 (Author) commented on Feb 11, 2021

Hi @baendigreydo, let me illustrate what I did, for reference. I'm also really curious to hear from @ppstacy whether this is how you think it could be implemented/should be done. The problem is that for non-synthetic data only the observed treatment is known, so I masked the predictions to compare only the predicted outcome under the treatment that was actually observed.

```python
# Imports (scikit-learn and causalml)
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from causalml.inference.meta import XGBTRegressor

# Create holdout set
X_train, X_test, t_train, t_test, y_train, y_test_actual = train_test_split(
    df_confounder, df_treatment, target, test_size=0.2)

# Fit learner on training set
learner = XGBTRegressor()
learner.fit(X=X_train, treatment=t_train, y=y_train)

# Predict the TE for test, and request the components (predictions for t=1 and t=0)
te_test_preds, yhat_c, yhat_t = learner.predict(X_test, t_test, return_components=True)

# Mask the yhats to correspond with the observed treatment (we can only test accuracy for those)
yhat_c = yhat_c[1] * (1 - t_test)
yhat_t = yhat_t[1] * t_test
yhat_test = yhat_t + yhat_c

# Model prediction error
MSE = mean_squared_error(y_test_actual, yhat_test)
print(f"{'Model MSE:':25}{MSE}")

# Also plotted actuals vs. predictions in here, will spare you the code
```
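
The actuals-vs-predictions plot itself was omitted above; purely as an illustration (not Thomas9292's actual plotting code), such a plot could be as simple as:

```python
# Illustrative only, not the original poster's code. Uses the masked hold-out
# predictions `yhat_test` and observed outcomes `y_test_actual` from above.
import numpy as np
import matplotlib.pyplot as plt

y_true = np.asarray(y_test_actual)
y_pred = np.asarray(yhat_test)

plt.figure(figsize=(6, 6))
plt.scatter(y_true, y_pred, alpha=0.3)
lims = [min(y_true.min(), y_pred.min()), max(y_true.max(), y_pred.max())]
plt.plot(lims, lims, linestyle='--')  # 45-degree reference line
plt.xlabel('Observed outcome (hold-out)')
plt.ylabel('Predicted outcome under observed treatment')
plt.title('Actuals vs. predictions on the hold-out set')
plt.show()
```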

@baendigreydo

Thanks for your answers.
I was able to generate the plots for the meta-learners and trees easily.

I also looked into the solution proposed by you, @Thomas9292.
I think it is a usable workaround, but I see a logical problem when validating meta-learner accuracy this way: to calculate summary tables like the one shown in Part B of this notebook, the function `get_synthetic_preds_holdout` is used within `get_synthetic_summary_holdout`. For these functions to work, the actual treatment effects `tau` are needed, which can be generated by `synthetic_data`. Since `tau` is not known in real-world data, one has to estimate it, just like you did, @Thomas9292 (named `te_test_preds` for the test set and `te_train_preds` for the train set, to be inserted here). Based on these new `preds_dict_train[KEY_ACTUAL] = te_train_preds` and `preds_dict_valid[KEY_ACTUAL] = te_test_preds`, the summary table can be calculated. The problem, however, is that these `tau`s are then assumed to be the ground truth, so all further models are compared against the model used to generate them, which is effectively a second-order comparison.

I would be happy if someone could confirm or, even better, refute this, as I am quite puzzled right now.

@Thomas9292, was MSE the only metric you used for model selection? What about gain/Qini? I would also love to see your code for the plots, as I still have a lot to learn.

@ppstacy, when I did the above calculation I noticed the following: I suspect that the function `regression_metrics` is not yet complete, since no return is specified. I think the part that calculates the regression metrics should look like this:
```python
reg_metrics = []

for name, func in metrics.items():
    if w is not None:
        assert y.shape[0] == w.shape[0]
        if w.dtype != bool:
            w = w == 1
        logger.info('{:>8s}   (Control): {:10.4f}'.format(name, func(y[~w], p[~w])))
        logger.info('{:>8s} (Treatment): {:10.4f}'.format(name, func(y[w], p[w])))
    else:
        logger.info('{:>8s}: {:10.4f}'.format(name, func(y, p)))
    reg_metrics.append({name: func(y, p)})

return np.array(reg_metrics)
```
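
If that return is added, a call could capture the values directly. A hypothetical usage, assuming `regression_metrics` is importable from `causalml.metrics` and reusing `y_test_actual`, `yhat_test`, and `t_test` from the earlier example:

```python
# Hypothetical usage of the patched function; this only works once the `return`
# above is added, since the current version logs the metrics but returns nothing.
from causalml.metrics import regression_metrics

reg_metrics = regression_metrics(y=y_test_actual, p=yhat_test, w=t_test)
print(reg_metrics)
```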
