Decision Fusion is the combination of the decisions of multiple classifiers into a common decision,
i.e. a classifier ensemble operation.
Concretely, the prediction vectors from multiple classifiers are fused into a single prediction vector,
from which the decision is taken via `argmax`.
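As a minimal illustration of the idea (plain `numpy`, not the eFusor API), the sketch below fuses two made-up prediction vectors with a simple average and then takes the decision via `argmax`. The scores and the choice of averaging are assumptions made for the example only.

```python
import numpy as np

# Prediction scores of two hypothetical classifiers over the same three classes.
matrix = np.array([
    [0.25, 0.60, 0.15],
    [0.00, 0.80, 0.00],
])

# Fuse column-wise (here: a plain average) into one prediction vector.
fused = matrix.mean(axis=0)

# The final decision is the index of the highest fused score.
decision = int(np.argmax(fused))
print(fused, decision)
```

Here the second class wins, since it has the highest fused score.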
The eFusor library provides an interface to common Decision Fusion methods,
such as Majority Voting and the lesser-known Tournament-style Borda Counting,
as well as basic operations like `max` and `average`,
implemented using `numpy`.
The expected input for fusion is either a `tensor` or a `matrix`.
- `Vector = list[float]` -- ordered list of prediction scores from a model for a query
- `Matrix = list[Vector]` -- ordered list of vectors; prediction scores for a query from a number of models
- `Tensor = list[Matrix]` -- ordered list of matrices; batch of predictions for several documents
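To make the three input shapes concrete, the following sketch declares the aliases as plain Python type hints and checks the resulting array shapes with `numpy`. The example values are made up, and the aliases are written out here for illustration rather than imported from the library.

```python
import numpy as np

# Type aliases as described above (Python 3.9+ syntax).
Vector = list[float]
Matrix = list[Vector]
Tensor = list[Matrix]

# One model, one query: scores over three classes.
vector: Vector = [0.25, 0.60, 0.15]
# Two models, one query.
matrix: Matrix = [vector, [0.00, 0.80, 0.00]]
# Two models, a batch of two documents.
tensor: Tensor = [matrix, [[0.10, 0.20, 0.70], [0.30, 0.30, 0.40]]]

shapes = [np.array(x).shape for x in (vector, matrix, tensor)]
print(shapes)
```

The shapes read as (classes), (models, classes) and (documents, models, classes).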
scikit-learn provides common ensemble learning methods that combine the predictions of several classifiers and train meta predictors. An alternative to ensemble learning is to use a heuristic.
eFusor provides this heuristic-based decision fusion functionality.
eFusor was developed specifically to address the scenario
where predictors (classifiers) may have different label spaces.
Consequently, the library distinguishes between classes predicted with a low score (`0.0`)
and classes that were not predicted (`nan`).
eFusor provides a `vectorize` function that performs the vectorization
while distinguishing between predicted and not predicted classes.
The function expects a `list` of class labels and a `dict` of prediction scores.
from efusor import vectorize
labels = ["A", "B", "C", "D"]
scores = {"A": 0.75, "B": 0.25, "C": 0.00}
vector = vectorize(labels, scores)
# array([0.75, 0.25, 0. , nan])
The function also accepts scores for a vector, a matrix, or a tensor; that is, a dict, a list of dicts, or a list of lists of dicts, respectively.
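The behaviour described above can be sketched in a few lines of plain `numpy`. The helper `to_vector` below is hypothetical and is not the library's `vectorize`; it only illustrates how a missing label can be mapped to `nan` while an explicit `0.0` is kept, and how a list of dicts then yields a matrix.

```python
import numpy as np

def to_vector(labels, scores):
    """Hypothetical helper: align a dict of scores to a fixed label order.

    Labels missing from `scores` become NaN (not predicted);
    labels present with a score of 0.0 stay 0.0 (predicted with a low score).
    """
    return np.array([scores.get(label, np.nan) for label in labels], dtype=float)

labels = ["A", "B", "C", "D"]

# A single dict of scores gives a vector.
vector = to_vector(labels, {"A": 0.75, "B": 0.25, "C": 0.00})

# A list of dicts (one per model) gives a matrix.
model_scores = [{"A": 0.75, "B": 0.25, "C": 0.00}, {"B": 0.40, "D": 0.60}]
matrix = np.array([to_vector(labels, s) for s in model_scores])

print(vector)
print(matrix)
```

Because the second model never mentions `A` or `C`, those cells are `nan` rather than `0.0`, which lets a fusion step ignore them instead of treating them as low scores.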
Since decision fusion of prediction vectors boils down to reducing a matrix to a vector column-wise (i.e. reducing each column vector to a scalar), any mathematical operation on a vector of numbers applies.
Kittler, Hatef, Duin, and Matas (1998), "On Combining Classifiers", IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(3), use the functions below as basic classifier combination schemes.
| method | notes |
|---|---|
| `average` | mean value of a vector; requires well calibrated scores. |
| `product` | product rule and product rule issues! |
| `sum` | approximation of `product`; assumes posteriors to be not far from priors! |
| `max` | approximation of `sum` |
| `min` | bound version of `product` |
| `median` | approximation of `sum`; robust version of `average` |
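The column-wise reduction behind these methods can be sketched with NumPy's NaN-aware reductions. This is an illustration of the general idea only: eFusor's own `sum` and `product` appear to do more than a plain reduction (compare the `fuse` output further below), so the numbers here are not expected to match the library for those methods. The example matrix, including the `nan` for a class the second model does not cover, is made up.

```python
import numpy as np

# Two models, three classes; the second model does not predict the third class.
matrix = np.array([
    [0.25, 0.60, 0.15],
    [0.00, 0.80, np.nan],
])

# Each reduction collapses the model axis (axis=0), ignoring NaN cells.
reductions = {
    "average": np.nanmean,
    "product": np.nanprod,
    "sum": np.nansum,
    "max": np.nanmax,
    "min": np.nanmin,
    "median": np.nanmedian,
}

results = {name: fn(matrix, axis=0) for name, fn in reductions.items()}
for name, vector in results.items():
    print(f"{name:<8}: {np.round(vector, 3)}")
```

For the third class only one score is available, so every reduction simply returns that score (0.15).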
The basic fusion methods operate on the classifier prediction scores, i.e. real-valued vectors. The problem can be reduced to operating on one-hot vectors by taking a per-classifier decision first rather than postponing it. Decision vectors are commonly combined with a majority rule.
scikit-learn provides `VotingClassifier` as an ensemble method and distinguishes between Hard Voting and Soft Voting.
While Hard Voting is Majority Voting,
Soft Voting is nothing other than an `average`
(or a weighted arithmetic mean, if weights are provided).
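The difference between the two can be sketched in plain `numpy` (an illustration, not the eFusor or scikit-learn implementation): hard voting first turns each prediction vector into a per-classifier decision and counts the votes, while soft voting averages the scores directly.

```python
import numpy as np

matrix = np.array([
    [0.25, 0.60, 0.15],
    [0.00, 0.80, 0.00],
])
n_classes = matrix.shape[1]

# Hard voting: each classifier votes for its top class; votes are counted per class.
winners = matrix.argmax(axis=1)
hard = np.bincount(winners, minlength=n_classes)

# Soft voting: column-wise mean of the raw scores.
soft = matrix.mean(axis=0)

print(hard)
print(soft)
```

Both classifiers vote for the second class, so it receives two votes, while the soft result keeps a non-zero score for the other classes.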
Rank-based voting, specifically tournament-style Borda count, is a decision technique commonly used in elections. While majority voting transforms prediction scores into a one-hot vector, rank-based voting transforms them into an integer vector of ranks (the higher the score, the lower the rank).
The benefit is that we still consider all predictions for fusion and do not require well calibrated scores.
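A minimal sketch of this idea, assuming that "tournament-style" means each class earns one point for every other class it strictly outscores within a classifier's prediction vector. The tie handling (no points for ties) and the helper name `borda_points` are assumptions for illustration and may differ from the library.

```python
import numpy as np

def borda_points(vector):
    """Hypothetical helper: points[i] = number of classes that class i strictly outscores."""
    v = np.asarray(vector, dtype=float)
    # Pairwise comparison matrix; comparisons involving NaN evaluate to False.
    return (v[:, None] > v[None, :]).sum(axis=1)

matrix = [
    [0.25, 0.60, 0.15],
    [0.00, 0.80, 0.00],
]

# Convert each classifier's scores to points, then sum the points per class.
points = np.array([borda_points(row) for row in matrix])
fused = points.sum(axis=0)

print(points)
print(fused)
```

Only the ordering of scores within each vector matters here, which is why calibration across classifiers is not required.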
In certain scenarios (e.g. fusion of decisions of rule-based and machine learning predictors), it is desired to weigh different classifiers differently. Weighted Average is a commonly used scheme.
`soft_voting` (an `average`) and `hard_voting` both implement weighted fusion.
(While Borda Count also allows weighing classifiers differently, this is not implemented.)
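Weighted variants of both voting schemes can be sketched as follows. This is a plain `numpy` illustration under the assumption that weights are per-classifier and sum to one; it is not the library's code.

```python
import numpy as np

matrix = np.array([
    [0.25, 0.60, 0.15],
    [0.00, 0.80, 0.00],
])
weights = np.array([0.75, 0.25])

# Weighted soft voting: weighted arithmetic mean of the scores per class.
soft = np.average(matrix, axis=0, weights=weights)

# Weighted hard voting: each classifier's vote counts with its weight.
one_hot = np.eye(matrix.shape[1])[matrix.argmax(axis=1)]
hard = weights @ one_hot

print(np.round(soft, 4))
print(hard)
```

With these weights the first classifier dominates the soft result, while the hard result still puts all vote mass on the second class because both classifiers agree on it.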
An alternative to weighted fusion is to select a prediction vector from the matrix with respect to the weight vector. However, in the scenario where predictors are allowed to have different label spaces, this could result in an all-NaN vector as the final decision.
The `priority` fusion method implements such a heuristic,
yielding the first non-NaN prediction vector from a matrix with respect to the weight vector.
In case of equal weight values, a `max` fusion is applied to the set.
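The heuristic can be sketched as below. The function `priority_fuse` is hypothetical, and it assumes that a higher weight means a higher priority and that "non-NaN" means "not entirely NaN"; the library's exact rules may differ.

```python
import numpy as np

def priority_fuse(matrix, weights):
    """Hypothetical sketch: return the highest-priority vector that is not all-NaN."""
    m = np.asarray(matrix, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Visit the distinct weight values from highest to lowest priority.
    for value in sorted(set(w.tolist()), reverse=True):
        rows = m[w == value]
        # Drop vectors that carry no prediction at all.
        rows = rows[~np.isnan(rows).all(axis=1)]
        if len(rows):
            # Equal weights: fall back to an element-wise max that ignores NaN.
            return np.fmax.reduce(rows, axis=0)
    # Every vector was all-NaN.
    return np.full(m.shape[1], np.nan)

nan = np.nan
first = priority_fuse([[0.25, 0.60, 0.15], [0.00, 0.80, 0.00]], [0.75, 0.25])
skipped = priority_fuse([[nan, nan, nan], [0.00, 0.80, 0.00]], [0.75, 0.25])
tied = priority_fuse([[0.25, 0.60, 0.15], [0.00, 0.80, 0.00]], [0.5, 0.5])
print(first, skipped, tied)
```

The second call shows the motivating case: the top-priority predictor has no prediction, so the next one is used instead of returning an all-NaN vector.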
The primary decision fusion function is `fuse`.
from efusor import fuse
methods = [
"max", "min", "sum", "product", "median", "average",
"hard_voting", "soft_voting",
"borda"
]
matrix = [[0.25, 0.60, 0.15], [0.00, 0.80, 0.00]]
weight = [0.75, 0.25]
# unweighted results
for method in methods:
    result = fuse(matrix, method=method, digits=3)
    print(f"{method:<16}: {result}")
max : [0.25, 0.8, 0.15]
min : [0.0, 0.6, 0.0]
sum : [0.0, 1.067, 0.0]
product : [0.0, 0.16, 0.0]
median : [0.125, 0.7, 0.075]
average : [0.125, 0.7, 0.075]
hard_voting : [0, 2, 0]
soft_voting : [0.125, 0.7, 0.075]
borda : [1.0, 4.0, 0.0]
from efusor import fuse
matrix = [[0.25, 0.60, 0.15], [0.00, 0.80, 0.00]]
weight = [0.75, 0.25]
for method in ["hard_voting", "soft_voting"]:
    result = fuse(matrix, method=method, digits=3, weights=weight)
    print(f"{method:<16}: {result}")
(rounded for readability)
hard_voting : [0.0, 1.0, 0.0]
soft_voting : [0.188, 0.65, 0.112]
- requires `weights` (priorities)
from efusor import fuse
matrix = [[0.25, 0.60, 0.15], [0.00, 0.80, 0.00]]
weight = [0.75, 0.25]
for method in ["priority"]:
    result = fuse(matrix, method=method, digits=3, weights=weight)
    print(f"{method:<16}: {result}")
priority : [0.25, 0.6, 0.15]