A learning method of hidden Markov models for sequence discrimination

Cited by: 13
Author
Mamitsuka, H
Affiliation
[1] C&C Research Labs, NEC Corporation, 4-1-1 Miyazaki, Miyamae-ku, Kawasaki, Kanagawa 216
Keywords
hidden Markov models; sequence discrimination; lipocalin family; gradient descent; stochastic models
DOI
10.1089/cmb.1996.3.361
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
We propose a learning method for hidden Markov models (HMMs) for sequence discrimination. Given an HMM, our method defines a function that corresponds to the product of a difference between the observed and the desired likelihoods for each training sequence and, using a gradient-descent algorithm, trains the HMM parameters so that this function is minimized. This method allows us to use not only examples belonging to the class that the HMM should represent, but also examples not belonging to the class, i.e., negative examples. We evaluated our method in a series of experiments based on a type of cross-validation and compared the results with those of two existing methods. Experimental results show that our method greatly reduces the discrimination errors made by the other two methods. We conclude that both the use of negative examples and our method of using negative examples are useful for training HMMs to discriminate unknown sequences.
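The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract states the objective only loosely, so this sketch assumes a plain sum of squared differences between a per-symbol normalized likelihood and a desired value (1.0 for positive examples, 0.0 for negative examples). It also uses a finite-difference gradient instead of an analytic one, a uniform initial state distribution, and softmax-parameterized probabilities. All function names and the toy data are hypothetical.

```python
import math

def softmax(logits):
    # Map unconstrained logits to a probability row.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def unpack(params, n_states, n_symbols):
    # First n_states*n_states logits: transitions; the rest: emissions.
    split = n_states * n_states
    trans = [softmax(params[i * n_states:(i + 1) * n_states])
             for i in range(n_states)]
    emit = [softmax(params[split + i * n_symbols:split + (i + 1) * n_symbols])
            for i in range(n_states)]
    return trans, emit

def score(params, seq, n_states, n_symbols):
    # Forward algorithm, then a per-symbol geometric mean so the value
    # lies in (0, 1] regardless of length (an assumption of this sketch).
    trans, emit = unpack(params, n_states, n_symbols)
    alpha = [emit[s][seq[0]] / n_states for s in range(n_states)]
    for sym in seq[1:]:
        alpha = [sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][sym]
                 for s in range(n_states)]
    return sum(alpha) ** (1.0 / len(seq))

def loss(params, examples, n_states, n_symbols):
    # Assumed objective: squared gap between observed and desired scores.
    return sum((score(params, seq, n_states, n_symbols) - target) ** 2
               for seq, target in examples)

def train(examples, n_states=2, n_symbols=2, lr=0.5, steps=200, eps=1e-5):
    # Plain gradient descent with a central finite-difference gradient.
    params = [0.0] * (n_states * n_states + n_states * n_symbols)
    for _ in range(steps):
        grad = []
        for i in range(len(params)):
            up = params[:i] + [params[i] + eps] + params[i + 1:]
            down = params[:i] + [params[i] - eps] + params[i + 1:]
            grad.append((loss(up, examples, n_states, n_symbols)
                         - loss(down, examples, n_states, n_symbols)) / (2 * eps))
        params = [p - lr * g for p, g in zip(params, grad)]
    return params
```

Called with a list of `(sequence, target)` pairs, where symbols are integer indices, `train` returns parameters under which positive-like sequences should score higher than negative-like ones. Because negative examples enter the loss with target 0.0, they push the model away from sequences outside the class, which is the point the abstract emphasizes.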
Pages: 361-373
Page count: 13