A step toward barcoding life: A model-based, decision-theoretic method to assign genes to preexisting species groups

Cited by: 73
Authors
Abdo, Zaid [1]
Golding, G. Brian
Affiliations
[1] Univ Idaho, Dept Math, Moscow, ID 83844 USA
[2] Univ Idaho, Dept Stat, Moscow, ID 83844 USA
[3] McMaster Univ, Dept Biol, Hamilton, ON L8S 4K1, Canada
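Funding
National Institutes of Health (USA); Natural Sciences and Engineering Research Council of Canada; Canada Foundation for Innovation
Keywords
assignment; barcoding; coalescent; decision theory
DOI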
10.1080/10635150601167005
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
A major part of the barcoding-of-life problem is assigning newly sequenced or sampled individuals to existing groups that have been identified externally (by a taxonomist, for example). This problem involves evaluating the statistical evidence for associating a sequence from a new individual with one group or another. The main concern of our current research is to perform this task quickly and accurately. To accomplish this, we have developed a model-based, decision-theoretic framework based on coalescent theory. Under this framework, we assign a newly sampled individual using both distance and the posterior probability of a group, given the sequences from members of that group and the sequence from the new individual. We believe that this approach makes efficient use of the information available in the data. Our preliminary results indicate that this approach is more accurate than assignment based on a simple measure of distance.
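The contrast the abstract draws, between assignment by distance alone and assignment by the posterior probability of a group, can be illustrated with a toy sketch. This is not the authors' implementation: the paper derives its likelihoods from a coalescent model, whereas here the per-group log-likelihoods are simply supplied by the caller. The function names (`hamming`, `nearest_group`, `posterior`, `assign`), the 0.95 threshold, and the example sequences and log-likelihood values are all hypothetical, chosen only for illustration.

```python
import math


def hamming(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nearest_group(query, groups):
    """Distance-only rule: pick the group with the smallest mean distance."""
    mean_dist = {
        name: sum(hamming(query, s) for s in seqs) / len(seqs)
        for name, seqs in groups.items()
    }
    return min(mean_dist, key=mean_dist.get)


def posterior(log_lik, prior):
    """Bayes' rule over groups, normalised in log space for stability.

    log_lik[g] stands in for log P(data | query belongs to group g).
    In the paper this would come from a coalescent model; here it is
    an assumed input.
    """
    log_joint = {g: log_lik[g] + math.log(prior[g]) for g in log_lik}
    m = max(log_joint.values())
    log_norm = m + math.log(sum(math.exp(v - m) for v in log_joint.values()))
    return {g: math.exp(v - log_norm) for g, v in log_joint.items()}


def assign(post, threshold=0.95):
    """Choose the maximum-posterior group (optimal under 0-1 loss).

    Returns None when the best posterior is below the (hypothetical)
    threshold, i.e. the evidence is treated as too weak to assign.
    """
    best = max(post, key=post.get)
    return best if post[best] >= threshold else None


# Hypothetical toy data: two pre-identified groups and one new sequence.
groups = {"A": ["ACGT", "ACGA"], "B": ["TTGT", "TTGA"]}
query = "ACGT"

by_distance = nearest_group(query, groups)

# Made-up log-likelihoods standing in for a model-based calculation.
post = posterior({"A": -2.0, "B": -5.0}, {"A": 0.5, "B": 0.5})
by_posterior = assign(post)
```

The distance rule always returns some group, while the posterior rule can decline to assign when no group is sufficiently supported; that option to abstain is one practical reason a decision-theoretic rule may behave differently from a nearest-neighbour rule.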
Pages: 44-56
Page count: 13