DEMOCRACY IN NEURAL NETS - VOTING SCHEMES FOR CLASSIFICATION

Cited by: 182
Authors
BATTITI, R [1]
COLLA, AM [1]
Affiliations
[1] ELSAG BAILEY, GENOA, ITALY
Keywords
MODULAR NEURAL NETWORKS; MULTILAYER PERCEPTRON; BAYESIAN CLASSIFICATION; ACCURACY REJECTION TRADE-OFF; OPTICAL CHARACTER RECOGNITION
DOI
10.1016/0893-6080(94)90046-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we discuss some possible ways to combine the outputs of a set of neural network classifiers to reach a combined decision with a higher performance, in terms of lower rejection rates and/or better accuracy rates. The methods considered range from the requirement of a complete agreement among the individual classifications to election schemes based on the distribution of votes collected by the different classes. In addition, the rejection rules based on the different output classes can be complemented by rules that also consider the information in the individual output vectors, with the possibility of using threshold requirements and that of averaging the different vectors. Although the Bayesian framework and some probabilistic assumptions provide useful indications about the potential advantage of different combination schemes, the combined performance ultimately depends on the joint probability distribution of the outputs, and it can be estimated by joining the results of different nets on the same test set. The combination methods are very flexible, they permit a straightforward cooperation of neural and traditional recognizers, and they are appropriate in a development environment where experiments are performed with different kinds of nets and features for a selected application. From our experiments in the field of handwritten digit recognition (up to a total of more than 50,000 characters), we found that the use of a small number of nets (two to three) with a sufficiently large uncorrelation in their mistakes reaches a combined performance that is significantly higher than the best obtainable from the individual nets, with a negligible effort after starting from a pool of networks produced in the development phase of an application. 
In particular for a real-world OCR application, the best accuracy increase is about half the increase in the rejection rate, so that accuracies of the order of 99.5% can be reached by rejecting less than 5% of the patterns. This performance is significant for real applications.
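
The combination rules named in the abstract (complete agreement, election by distribution of votes, and averaging of output vectors with a threshold) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `REJECT` marker, the tie-handling rule, and the example thresholds are all assumptions made for illustration, and each net is assumed to emit one output vector with one activation per class.

```python
from collections import Counter

REJECT = -1  # hypothetical marker for a rejected pattern (assumption)


def argmax(vector):
    """Index of the largest entry in one output vector."""
    return max(range(len(vector)), key=lambda i: vector[i])


def unanimity(outputs):
    """Accept only when every net selects the same class."""
    winners = {argmax(v) for v in outputs}
    return winners.pop() if len(winners) == 1 else REJECT


def plurality(outputs, min_votes):
    """Accept the most-voted class if it has at least min_votes votes
    and is not tied with the runner-up (tie rule is an assumption)."""
    counts = Counter(argmax(v) for v in outputs).most_common(2)
    top_class, top_votes = counts[0]
    tied = len(counts) > 1 and counts[1][1] == top_votes
    return top_class if top_votes >= min_votes and not tied else REJECT


def averaged(outputs, threshold):
    """Average the output vectors, then accept the top class only if
    its mean activation reaches the threshold."""
    n = len(outputs)
    mean = [sum(column) / n for column in zip(*outputs)]
    best = argmax(mean)
    return best if mean[best] >= threshold else REJECT


# Three hypothetical nets scoring one pattern over three classes.
outputs = [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1], [0.6, 0.3, 0.1]]
print(unanimity(outputs))       # nets disagree, so the pattern is rejected
print(plurality(outputs, 2))    # class 1 collects two of three votes
print(averaged(outputs, 0.5))   # mean activation of class 1 is about 0.6
```

Stricter rules (unanimity, a higher `min_votes`, a higher `threshold`) reject more patterns in exchange for fewer errors among the accepted ones, which is the accuracy/rejection trade-off the abstract refers to.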
Pages: 691-707
Number of pages: 17