Bayesian predictiveness, exchangeability and sufficientness in bacterial taxonomy

Cited by: 6
Authors
Gyllenberg, M [1]
Koski, T [2]
Affiliations
[1] Univ Turku, Dept Math, Turku 20014, Finland
[2] Linkoping Univ, Dept Math, S-58183 Linkoping, Sweden
Keywords
multivariate binary data; Bayesian risk consistency; Bahadur-Lazarsfeld expansions; supervised learning; multivariate Bernoulli distributions
DOI
10.1016/S0025-5564(01)00096-7
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
We present a theory of classification and predictive identification of bacteria. Bacterial strains are characterized by a binary vector and the taxonomy is specified by attaching a label to each vector. The theory is developed from only two basic assumptions, viz. that the sequence of pairs of feature vectors and the attached labels is judged (infinitely) exchangeable and predictively sufficient. We derive expressions for the training error and the probability of identification error and show that the latter is an affine function of the former. We prove the law of large numbers for identification matrices, which contain the fundamental information of bacterial data. We prove the Bayesian risk consistency of the predictive identification rule given by the theory and show that the training error is a consistent estimate of the generalization error. © 2002 Published by Elsevier Science Inc.
Pages: 161-184
Number of pages: 24