SYNTHESIS AND RECOGNITION OF SEQUENCES

Cited by: 11
Authors
CHAN, SC
WONG, AKC
Affiliation
[1] Department of Systems Design Engineering, University of Waterloo
Keywords
HIERARCHICAL CLUSTERING; MULTIPLE SEQUENCE ALIGNMENT; SEQUENCES; STRINGS; SUPERVISED CLASSIFICATION; SYNTHESIS; UNSUPERVISED CLASSIFICATION;
DOI
10.1109/34.106998
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A string or sequence is a linear array of symbols that come from an alphabet. Due to unknown substitutions, insertions, and deletions of symbols, a sequence cannot be treated like a vector or a tuple of a fixed number of variables. The synthesis of an ensemble of sequences is a "sequence" of random elements that specify the probabilities of occurrence of the different symbols at the corresponding sites of the sequences. The synthesis is determined by a hierarchical sequence synthesis procedure (HSSP), which returns not only the taxonomic hierarchy of the whole ensemble of sequences but also the alignment and the synthesis of a group (a subset of the ensemble) of the sequences at each level of the hierarchy. The HSSP does not require the ensemble of sequences to be presented in the form of a tabulated array of data, the hierarchical information of the data, or the assumption of a stochastic process. This correspondence presents the concept of sequence synthesis and the applicability of the HSSP as a supervised classification procedure as well as an unsupervised classification procedure.
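The idea of a synthesis as per-site symbol probabilities can be illustrated with a minimal sketch. It assumes the group of sequences has already been aligned to a common length and that gaps are marked with a `-` symbol. The function name `synthesize`, the gap symbol, and the sample sequences are all hypothetical and chosen only for illustration. The sketch does not implement the HSSP itself, which also produces the alignment and the taxonomic hierarchy.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol marking an insertion/deletion site


def synthesize(aligned):
    """Return, for each site, a dict mapping symbol -> relative frequency.

    `aligned` is a list of equal-length strings (an already-aligned group).
    This only illustrates the "sequence of random elements" idea from the
    abstract; it is not the authors' procedure.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    n = len(aligned)
    synthesis = []
    for site in range(length):
        # Count how often each symbol (including the gap) occurs at this site.
        counts = Counter(seq[site] for seq in aligned)
        synthesis.append({sym: c / n for sym, c in counts.items()})
    return synthesis


# Hypothetical aligned group of four sequences over the alphabet {A, C, G, T}.
group = ["ACG-T", "ACGAT", "A-GAT", "ACGTT"]
profile = synthesize(group)
for site, dist in enumerate(profile):
    print(site, dist)
```

Each entry of `profile` plays the role of one random element of the synthesis: a distribution over the symbols observed at that site, with the gap symbol treated as one more outcome.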
Pages: 1245-1255
Page count: 11