Measure to compare true observed labels with predicted labels in multiclass classification tasks.
Arguments
- truth (factor())
  True (observed) labels. Must have the same levels and length as response.
- response (factor())
  Predicted response labels. Must have the same levels and length as truth.
- sample_weights (numeric())
  Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.
- ... (any)
  Additional arguments. Currently ignored.
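The constraints on the arguments can be illustrated with a small sketch in base R. The data and the names lvls and normalized_weights are made up for illustration and do not come from the package. The explicit checks only mirror the requirements listed above; they are not the package's own validation code.

```r
# Hypothetical example data (assumption: not taken from the documentation).
lvls <- c("a", "b", "c")
truth <- factor(c("a", "a", "b", "c"), levels = lvls)

# Passing `levels` explicitly keeps the level set identical to `truth`,
# even though class "c" is never predicted here.
response <- factor(c("a", "b", "b", "b"), levels = lvls)

# Non-negative, finite weights with the same length as `truth`.
sample_weights <- c(2, 1, 1, 4)

# Checks that mirror the documented requirements.
stopifnot(
  is.factor(truth), is.factor(response),
  identical(levels(truth), levels(response)),
  length(truth) == length(response),
  length(sample_weights) == length(truth),
  all(is.finite(sample_weights)), all(sample_weights >= 0)
)

# The documentation states that the weights are normalized to sum to one;
# done by hand, that step would look like this.
normalized_weights <- sample_weights / sum(sample_weights)
```

Building response with factor(c("a", "b", "b", "b")) alone would yield only the levels "a" and "b", which would violate the same-levels requirement.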
Details
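This measure computes the weighted balanced accuracy, which is suitable for imbalanced data sets. Its definition is analogous to the one used in sklearn. Below, \(t_i\) denotes the true label, \(r_i\) the predicted label and \(w_i\) the sample weight of observation \(i\), for \(i = 1, \dots, n\).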
First, all sample weights \(w_i\) are normalized per class so that each class has the same influence:
$$
\hat{w}_i = \frac{w_i}{\sum_{j=1}^n w_j \cdot \mathbf{1}(t_j = t_i)}.
$$
The Balanced Accuracy is then calculated as
$$
\frac{1}{\sum_{i=1}^n \hat{w}_i} \sum_{i=1}^n \hat{w}_i \cdot \mathbf{1}(r_i = t_i).
$$
This definition is equivalent to acc() with class-balanced sample weights.
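The two formulas above can be traced with a minimal base R sketch. The function name balanced_accuracy_sketch is hypothetical and this is not the package's implementation; it only follows the definition given here and assumes that every class occurring in truth has a positive total weight.

```r
# Hypothetical helper (assumption: written for illustration only).
balanced_accuracy_sketch <- function(truth, response, sample_weights = NULL) {
  stopifnot(
    is.factor(truth), is.factor(response),
    identical(levels(truth), levels(response)),
    length(truth) == length(response)
  )
  n <- length(truth)

  # Default to equal sample weights.
  if (is.null(sample_weights)) sample_weights <- rep(1, n)
  stopifnot(
    length(sample_weights) == n,
    all(is.finite(sample_weights)), all(sample_weights >= 0)
  )

  # Normalize the weights to sum to one.
  w <- sample_weights / sum(sample_weights)

  # Per-class normalization: divide each weight by the total weight
  # of all observations sharing the same true class.
  class_totals <- tapply(w, truth, sum)
  w_hat <- w / as.vector(class_totals[as.character(truth)])

  # Weighted share of correct predictions under the class-balanced weights.
  sum(w_hat * (response == truth)) / sum(w_hat)
}

# Toy data: class "a" has recall 2/3, class "b" has recall 1.
truth <- factor(c("a", "a", "a", "b"), levels = c("a", "b"))
response <- factor(c("a", "a", "b", "b"), levels = c("a", "b"))

balanced_accuracy_sketch(truth, response)  # (2/3 + 1) / 2 = 5/6
```

With equal weights the result is the unweighted mean of the per-class recalls, which is why the minority class "b" counts as much as the majority class "a".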
References
Brodersen KH, Ong CS, Stephan KE, Buhmann JM (2010). “The Balanced Accuracy and Its Posterior Distribution.” In 2010 20th International Conference on Pattern Recognition. doi:10.1109/icpr.2010.764.
Guyon I, Bennett K, Cawley G, Escalante HJ, Escalera S, Ho TK, Macia N, Ray B, Saeed M, Statnikov A, Viegas E (2015). “Design of the 2015 ChaLearn AutoML challenge.” In 2015 International Joint Conference on Neural Networks (IJCNN). doi:10.1109/ijcnn.2015.7280767.