Sliding window discretization: a new method for multiple band matching of bacterial genotyping fingerprints

Cited by: 2
Authors
Austin, B
Dawyndt, P
Gyllenberg, M [1]
Koski, T
Lund, T
Swings, J
Thompson, FL
Affiliations
[1] Heriot Watt Univ, Sch Life Sci, Edinburgh EH14 4AS, Midlothian, Scotland
[2] State Univ Ghent, Microbiol Lab, B-9000 Ghent, Belgium
[3] Univ Turku, Dept Math, FIN-20014 Turku, Finland
[4] Linkoping Univ, Dept Math, S-58183 Linkoping, Sweden
[5] Nokia Mobile Phones, FIN-24100 Salo, Finland
Funding
Academy of Finland
Keywords
DOI
10.1016/j.bulm.2004.02.004
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Microbiologists have traditionally applied hierarchical clustering algorithms as their mathematical tool of choice to unravel the taxonomic relationships between micro-organisms. However, the interpretation of such hierarchical classifications suffers from being subjective, in that a variety of ad hoc choices must be made during their construction. On the other hand, the application of more profound and objective mathematical methods, such as the minimization of stochastic complexity, for the classification of bacterial genotyping fingerprints data is hampered by the prerequisite that such methods only act upon vectorized data. In this paper we introduce a new method, coined sliding window discretization, for the transformation of genotypic fingerprint patterns into binary vector format. In the context of an extensive amplified fragment length polymorphism (AFLP) data set of 507 strains from the Vibrionaceae family that has previously been analysed, we demonstrate by comparison with a number of other discretization methods that this new discretization method results in minimal loss of the original information content captured in the banding patterns. Finally, we investigate the implications of the different discretization methods on the classification of bacterial genotyping fingerprints by minimization of stochastic complexity, as it is implemented in the BinClass software package for probabilistic clustering of binary vectors. The new taxonomic insights learned from the resulting classification of the AFLP patterns will prove the value of combining sliding window discretization with minimization of stochastic complexity, as an alternative classification algorithm for bacterial genotyping fingerprints. © 2004 Society for Mathematical Biology. Published by Elsevier Ltd. All rights reserved.
Pages: 1575-1596
Number of pages: 22
Related papers
28 in total
[1]  
DAWYNDT P, UNPUB INT J SYST EVO
[2]   MEASURES OF THE AMOUNT OF ECOLOGIC ASSOCIATION BETWEEN SPECIES [J].
DICE, LR .
ECOLOGY, 1945, 26 (03) :297-302
[3]   RAPID METHODS IN BACTERIAL-DNA FINGERPRINTING [J].
FORBES, KJ ;
BRUCE, KD ;
JORDENS, JZ ;
BALL, A ;
PENNINGTON, TH .
JOURNAL OF GENERAL MICROBIOLOGY, 1991, 137 :2051-2058
[4]   Stochastic complexity as a taxonomic tool [J].
Gyllenberg, HG ;
Gyllenberg, M ;
Koski, T ;
Lund, T .
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 1998, 56 (01) :11-22
[5]   Classification of Enterobacteriaceae by minimization of stochastic complexity [J].
Gyllenberg, HG ;
Gyllenberg, M ;
Koski, T ;
Lund, T ;
Schindler, J ;
Verlaan, M .
MICROBIOLOGY-UK, 1997, 143 :721-732
[6]  
GYLLENBERG HG, 1999, QUANT MICROBIOL, V1, P157
[7]   Classification of binary vectors by stochastic complexity [J].
Gyllenberg, M ;
Koski, T ;
Verlaan, M .
JOURNAL OF MULTIVARIATE ANALYSIS, 1997, 63 (01) :47-72
[8]   New methods for the analysis of binarized BIOLOG GN data of Vibrio species: Minimization of stochastic complexity and cumulative classification [J].
Gyllenberg, M ;
Koski, T ;
Dawyndt, P ;
Lund, T ;
Thompson, F ;
Austin, B ;
Swings, J .
SYSTEMATIC AND APPLIED MICROBIOLOGY, 2002, 25 (03) :403-415
[9]   Numerical taxonomy and the principle of maximum entropy [J].
Gyllenberg, M ;
Koski, T .
JOURNAL OF CLASSIFICATION, 1996, 13 (02) :213-229
[10]  
Gyllenberg M, 2001, INT STAT REV, V69, P249