Adjusting the outputs of a classifier to new a priori probabilities: A simple procedure

Cited: 211
Authors
Saerens, M
Latinne, P
Decaestecker, C
Affiliations
[1] Free Univ Brussels, IRIDIA Lab, B-1050 Brussels, Belgium
[2] SmalSMvM, Res Sect, Brussels, Belgium
[3] Free Univ Brussels, Lab Histopathol, B-1070 Brussels, Belgium
Keywords
DOI
10.1162/089976602753284446
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
It sometimes happens (for instance in case-control studies) that a classifier is trained on a data set that does not reflect the true a priori probabilities of the target classes on real-world data. This may have a negative effect on the classification accuracy obtained on the real-world data set, especially when the classifier's decisions are based on the a posteriori probabilities of class membership. Indeed, in this case, the trained classifier provides estimates of the a posteriori probabilities that are not valid for this real-world data set (they rely on the a priori probabilities of the training set). Applying the classifier as is (without correcting its outputs with respect to these new conditions) on this new data set may thus be suboptimal. In this note, we present a simple iterative procedure for adjusting the outputs of the trained classifier with respect to these new a priori probabilities without having to refit the model, even when these probabilities are not known in advance. As a by-product, estimates of the new a priori probabilities are also obtained. This iterative algorithm is a straightforward instance of the expectation-maximization (EM) algorithm and is shown to maximize the likelihood of the new data. Thereafter, we discuss a statistical test that can be applied to decide if the a priori class probabilities have changed from the training set to the real-world data. The procedure is illustrated on different classification problems involving a multilayer neural network, and comparisons with a standard procedure for a priori probability estimation are provided. Our original method, based on the EM algorithm, is shown to be superior to the standard one for a priori probability estimation. Experimental results also indicate that the classifier with adjusted outputs always performs better than the original one in terms of classification accuracy, when the a priori probability conditions differ from the training set to the real-world data. The gain in classification accuracy can be significant.
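The iterative procedure the abstract describes alternates between rescaling the classifier's posteriors by the ratio of current prior estimates to training priors (E-step) and re-estimating the new priors as the mean of the corrected posteriors (M-step). A minimal NumPy sketch of that EM loop, with function and parameter names chosen here for illustration (they are not taken from the paper), might look like:

```python
import numpy as np

def adjust_to_new_priors(train_posteriors, train_priors, max_iter=100, tol=1e-8):
    """EM re-estimation of class priors on a new, unlabeled data set.

    train_posteriors: (N, C) array of posteriors output by the trained classifier
    train_priors:     (C,)   array of class priors of the training set
    Returns the estimated new priors and the correspondingly adjusted posteriors.
    """
    new_priors = train_priors.copy()
    for _ in range(max_iter):
        # E-step: rescale each posterior by the ratio of current to training
        # priors, then renormalize so each row sums to one.
        scaled = train_posteriors * (new_priors / train_priors)
        posteriors = scaled / scaled.sum(axis=1, keepdims=True)
        # M-step: re-estimate the priors as the mean corrected posterior.
        updated = posteriors.mean(axis=0)
        if np.abs(updated - new_priors).max() < tol:
            new_priors = updated
            break
        new_priors = updated
    return new_priors, posteriors
```

No retraining of the classifier is needed: only its output posteriors and the training-set priors enter the loop, and the estimated new priors emerge as a by-product, as the abstract states.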
Pages: 21-41
Page count: 21