[WIP] Blog post on predict_proba #152

Closed
wants to merge 7 commits into from
67 changes: 67 additions & 0 deletions _posts/2022-12-01-predict-proba.md
@@ -0,0 +1,67 @@
---
#### Blog Post Template ####

#### Post Information ####
# title: "What we should and should not expect from `predict_proba`"
title: "What to expect from `predict_proba`"
date: December 1, 2022

#### Post Category and Tags ####
# Format in titlecase without dashes (Ex. "Open Source" instead of "open-source")
categories:
- Updates
tags:
- Machine Learning

#### Featured Image ####
featured-image: sorting.png

#### Author Info ####
# Can accommodate multiple authors
# Add SQUARE Author Image to /assets/images/author_images/ folder
postauthors:
- name: Alexandre Perez-Lebel
website: https://perez-lebel.com
email: alexandre.perez@inria.fr
image: alexandre_perez.jpeg
usemathjax: true
---
<div>
<img src="/assets/images/posts_images/{{ page.featured-image }}" alt="" title="Image by storyset on Freepik">
{% include postauthor.html %}
</div>

In classification, many situations call for estimated probabilities rather than just class labels: for example, decision making, cost-sensitive learning, or causal inference.
These probability estimates are typically accessible from the `predict_proba` method of scikit-learn's classifiers.
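
As a minimal sketch on synthetic data (the dataset and estimator here are only illustrative), `predict_proba` returns one column of estimated probabilities per class, while `predict` returns hard labels:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, for illustration only
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)

labels = clf.predict(X_test)        # hard class labels
proba = clf.predict_proba(X_test)   # shape (n_samples, 2), columns ordered as clf.classes_
print(proba[:3])
```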

However, the quality of the estimated probabilities must be validated to ensure trustworthiness, fairness, and robustness to changing operating conditions.
To be reliable, the estimated probabilities must be close to the true underlying posterior probabilities of the classes, `P(Y=1|X)`.

Review comment (Member):
As you know, there are several points of view on what such a probability may mean (controlled average error rate versus controlled individual probabilities). Maybe it would be good to first explain what these two mean.


Just as a discriminant classifier is validated through accuracy or ROC curves, tools have been developed to evaluate probabilistic classifiers.
Calibration is one of them [1-4]. It is used as a proxy for the closeness of the estimated probabilities to the true ones, and many recalibration techniques have been developed to improve the estimates (see [scikit-learn's user guide on calibration](https://scikit-learn.org/stable/modules/calibration.html)). The estimated probabilities of a calibrated classifier can be interpreted as the probability of correctness over the population of samples receiving the same estimated probability, but not as the true posterior class probability.
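
As a rough sketch (synthetic data again, and Gaussian naive Bayes as a typically poorly calibrated model), scikit-learn exposes both a diagnostic, `calibration_curve`, and a recalibration wrapper, `CalibratedClassifierCV`:

```python
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Diagnostic: per-bin observed frequency vs. mean predicted probability
clf = GaussianNB().fit(X_train, y_train)
prob_pos = clf.predict_proba(X_test)[:, 1]
frac_pos, mean_pred = calibration_curve(y_test, prob_pos, n_bins=10)

# Recalibration: the wrapper refits and remaps probabilities on held-out folds
calibrated = CalibratedClassifierCV(GaussianNB(), method="isotonic", cv=5).fit(X_train, y_train)
prob_pos_calibrated = calibrated.predict_proba(X_test)[:, 1]
```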

It is important to highlight, however, that calibration only captures part of the error on the estimated probabilities. The remaining term is the grouping loss [5]. Together, the calibration and grouping losses fully characterize the error on the estimated probabilities, called the epistemic loss.

Review comment (Member):
You are too formal. Give us the intuitions of why calibration is not the full story, rather than the maths.


$$\text{Epistemic loss} = \text{Calibration loss} + \text{Grouping loss}$$
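
To make this concrete without the formal machinery, and only as a rough sketch (this is not the estimator of [5] or [6]), a binned Murphy-style decomposition of the Brier score separates a calibration term from the rest; the grouping loss stays hidden inside the remaining terms:

```python
import numpy as np

def binned_brier_decomposition(y_true, y_prob, n_bins=10):
    """Approximate Murphy decomposition of the Brier score on binary labels:
    Brier ≈ calibration - resolution + uncertainty (binned estimate).

    y_true, y_prob: 1-D NumPy arrays of labels in {0, 1} and predicted
    probabilities of the positive class.
    """
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(y_prob, bins) - 1, 0, n_bins - 1)
    base_rate = y_true.mean()
    calibration = resolution = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            weight = mask.mean()  # fraction of samples falling in this bin
            calibration += weight * (y_prob[mask].mean() - y_true[mask].mean()) ** 2
            resolution += weight * (y_true[mask].mean() - base_rate) ** 2
    uncertainty = base_rate * (1 - base_rate)
    return calibration, resolution, uncertainty
```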

Review comment (Member):
First mention Brier score for model selection. Later mention grouping loss.

However, estimating the grouping loss is a harder problem than estimating calibration, as it directly involves the true probabilities. Recent work has focused on approximating the grouping loss through local estimates of the true probabilities [6].

When working with scikit-learn's classifiers, users must be as cautious with results obtained from `predict_proba` as with results from `predict`. Both output estimated quantities (probabilities and labels, respectively) with no prior guarantees on their quality. In both cases, the model's quality must be assessed with appropriate metrics: expected calibration error, Brier score, accuracy, AUC.
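
A minimal sketch of this dual evaluation, computing a few of these metrics with scikit-learn (synthetic data, illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)

y_pred = clf.predict(X_test)               # estimated labels
y_prob = clf.predict_proba(X_test)[:, 1]   # estimated probabilities of the positive class

print("accuracy:   ", accuracy_score(y_test, y_pred))     # validates `predict`
print("AUC:        ", roc_auc_score(y_test, y_prob))      # ranking quality of the scores
print("Brier score:", brier_score_loss(y_test, y_prob))   # validates `predict_proba`
```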

Review comment (Member):
Put links to relevant pages in the scikit-learn documentation


Review comment (Member):
Maybe mention quickly (with a separate section title) recalibration (and link to corresponding docs)

## References

<style>
ol > li::marker {
content: "[" counter(list-item) "]\2003";
}
</style>

<ol>
<li>Platt, J. C. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 61–74.</li>
<li>Zadrozny, B., & Elkan, C. (2001). Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. ICML, 1, 609–616.</li>
<li>Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. International Conference on Machine Learning, 1321–1330.</li>
<li>Minderer, M., Djolonga, J., Romijnders, R., Hubis, F., Zhai, X., Houlsby, N., Tran, D., & Lucic, M. (2021). Revisiting the calibration of modern neural networks. Advances in Neural Information Processing Systems, 34.</li>
<li>Kull, M., & Flach, P. (2015). Novel decompositions of proper scoring rules for classification: Score adjustment as precursor to calibration. Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 68–85.</li>
<li>Perez-Lebel, A., Le Morvan, M., & Varoquaux, G. (2022). Beyond calibration: estimating the grouping loss of modern neural networks. arXiv. <a href="https://doi.org/10.48550/arXiv.2210.16315">https://doi.org/10.48550/arXiv.2210.16315</a></li>
</ol>

Binary file added assets/images/author_images/alexandre_perez.jpeg
Binary file added assets/images/posts_images/sorting.png