There is an increasing need for transparency and fairness in the predictions of Machine Learning (ML) models. Consider, for example, a banker who has to explain to a client why their loan application was rejected, or a healthcare professional who must explain the basis of a diagnosis. Some ML models are very accurate but are considered hard to explain relative to popular linear models.
Source of figure: James, Gareth, et al. An Introduction to Statistical Learning. Vol. 112. New York: Springer, 2013.
We do not want to sacrifice this high accuracy for the sake of explainability. Hence: ML explainability. Many ML explainability tools are already available.
The `teller` is a model-agnostic tool for ML explainability. It is agnostic as long as the input ML model has `fit` and `predict` methods and is applied to tabular data. The `teller` relies on:
- Finite differences to explain ML model predictions: by slightly increasing and then slightly decreasing each of the model's explanatory variables, we obtain approximate sensitivities of its predictions to changes in these variables.
- Conformal prediction (so far, as of October 2022) to obtain prediction intervals for ML methods.
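The finite-difference idea can be illustrated with a minimal sketch. This is not the teller's own implementation or API: the function name `central_difference_effects`, the `rel_step` parameter, and the synthetic data are assumptions made for illustration only. The sketch uses central differences on any fitted model that exposes `predict`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def central_difference_effects(model, X, rel_step=1e-4):
    """Hypothetical helper: approximate d(prediction)/d(feature) per row and column."""
    X = np.asarray(X, dtype=float)
    effects = np.zeros_like(X)
    for j in range(X.shape[1]):
        # Step size scaled to each value's magnitude, with a floor for values near zero
        h = rel_step * np.maximum(np.abs(X[:, j]), 1.0)
        X_up, X_down = X.copy(), X.copy()
        X_up[:, j] += h    # a little increase in feature j
        X_down[:, j] -= h  # a little decrease in feature j
        effects[:, j] = (model.predict(X_up) - model.predict(X_down)) / (2 * h)
    return effects


# Synthetic tabular data (assumption: three numeric features, linear signal)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)
effects = central_difference_effects(model, X)
avg_effects = effects.mean(axis=0)  # average sensitivity per feature
print(avg_effects)
```

For a linear model the average effects should match the fitted coefficients, which makes it a convenient sanity check. For nonlinear models the per-row effects generally differ across observations.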
- From PyPI, stable version:
pip install the-teller
- From GitHub, for the development version:
pip install git+https://github.com/Techtonique/teller.git
These notebooks are good introductions:
- Heterogeneity of marginal effects
- Significance of marginal effects
- Model comparison
- Classification
- Interactions
- Prediction intervals for regression
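As a rough, library-independent illustration of the conformal prediction idea mentioned earlier, here is a minimal sketch of split conformal prediction based on absolute residuals. It is a simpler variant than conformalized quantile regression and is not the teller's implementation: the helper `split_conformal_interval`, its parameters, and the synthetic data are all assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def split_conformal_interval(model, X_train, y_train, X_calib, y_calib, X_new, alpha=0.1):
    """Hypothetical helper: split conformal intervals from absolute residuals."""
    model.fit(X_train, y_train)
    # Conformity scores: absolute residuals on held-out calibration data
    scores = np.sort(np.abs(y_calib - model.predict(X_calib)))
    n = len(scores)
    # Finite-sample corrected rank, capped at n (a simplification for small n)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = scores[min(k, n) - 1]
    preds = model.predict(X_new)
    return preds - q, preds + q


# Synthetic tabular data (assumption: nonlinear signal plus Gaussian noise)
rng = np.random.default_rng(42)
X = rng.uniform(-2, 2, size=(600, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.2, size=600)

X_train, y_train = X[:300], y[:300]
X_calib, y_calib = X[300:450], y[300:450]
X_test, y_test = X[450:], y[450:]

model = RandomForestRegressor(n_estimators=50, random_state=0)
lower, upper = split_conformal_interval(
    model, X_train, y_train, X_calib, y_calib, X_test, alpha=0.1
)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"Empirical coverage: {coverage:.2f}")
```

With `alpha=0.1`, the empirical coverage on new data should be close to 90% on average, whatever model is plugged in, as long as it has `fit` and `predict`.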
Your contributions are welcome and valuable. Please make sure to read the Code of Conduct first.
If you're not comfortable with Git/Version Control yet, please use this form.
In Pull Requests, let's strive to use `black` for formatting:
pip install black
black --line-length=80 file_submitted_for_pr.py
https://techtonique.github.io/teller/
- Numpy
- Pandas
- Scipy
- scikit-learn
@misc{moudiki2019teller,
author={Moudiki, T.},
title={\code{teller}, {M}odel-agnostic {M}achine {L}earning explainability},
howpublished={\url{https://github.com/thierrymoudiki/teller}},
note={BSD 3-Clause Clear License. Version 0.x.x.},
year={2019--2020}
}
For sensitivity analysis:
- Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (1992). Numerical recipes in C (Vol. 2). Cambridge: Cambridge university press.
- Jones E, Oliphant T, Peterson P, et al. SciPy: Open Source Scientific Tools for Python, 2001-, http://www.scipy.org/ [Online; accessed 2019-01-04]
- Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.
For prediction intervals:
- Romano, Y., Patterson, E., & Candes, E. (2019). Conformalized quantile regression. Advances in neural information processing systems, 32.
BSD 3-Clause © Thierry Moudiki, 2019.