An HMM model for coiled-coil domains and a comparison with PSSM-based predictions

Cited by: 371
Authors
Delorenzi, M [1 ]
Speed, T [1 ]
Affiliations
[1] Walter & Eliza Hall Inst Med Res, Genet & Bioinformat Div, Melbourne, Vic 3050, Australia
DOI
10.1093/bioinformatics/18.4.617
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Motivation: Large-scale sequence data require methods for the automated annotation of protein domains. Many of the predictive methods are based either on a Position Specific Scoring Matrix (PSSM) of fixed length or on a windowless Hidden Markov Model (HMM). The performance of the two approaches is tested for Coiled-Coil Domains (CCDs). The prediction of CCDs is used frequently, and its optimization seems worthwhile.
Results: We have conceived MARCOIL, an HMM for the recognition of proteins with a CCD on a genomic scale. A cross-validated study suggests that MARCOIL improves predictions compared to the traditional PSSM algorithm, especially for some protein families and for short CCDs. The study was designed to reveal differences inherent in the two methods. Potential confounding factors such as differences in the dimension of parameter space and in the parameter values were avoided by using the same amino acid propensities and by keeping the transition probabilities of the HMM constant during cross-validation.
Availability: The prediction program and the databases are available at http://www.wehi.edu.au/bioweb/Mauro/Marcoil
Contact: delorenzi@wehi.edu.au
Pages: 617-625
Page count: 9
Related papers
40 in total
[1]   The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000 [J].
Bairoch, A ;
Apweiler, R .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :45-48
[2]   HIDDEN MARKOV-MODELS OF BIOLOGICAL PRIMARY SEQUENCE INFORMATION [J].
BALDI, P ;
CHAUVIN, Y ;
HUNKAPILLER, T ;
MCCLURE, MA .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1994, 91 (03) :1059-1063
[3]  
Baum L.E., 1972, Inequalities III: Proceedings of the Third Symposium on Inequalities, page, V3, P1
[4]  
Bengio Y., 1996, NEURAL NETWORKS SPEE
[5]   PREDICTING COILED COILS BY USE OF PAIRWISE RESIDUE CORRELATIONS [J].
BERGER, B ;
WILSON, DB ;
WOLF, E ;
TONCHEV, T ;
MILLA, M ;
KIM, PS .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1995, 92 (18) :8259-8263
[6]   An iterative method for improved protein structural motif recognition [J].
Berger, B ;
Singh, M .
JOURNAL OF COMPUTATIONAL BIOLOGY, 1997, 4 (03) :261-273
[7]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[8]   Computational approaches to identify leucine zippers [J].
Bornberg-Bauer, E ;
Rivals, E ;
Vingron, M .
NUCLEIC ACIDS RESEARCH, 1998, 26 (11) :2740-2746
[9]  
Brown JH, 1996, PROTEINS, V26, P134
[10]   THE PACKING OF ALPHA-HELICES - SIMPLE COILED-COILS [J].
CRICK, FHC .
ACTA CRYSTALLOGRAPHICA, 1953, 6 (8-9) :689-697