Linear programming optimization and a double statistical filter for protein threading protocols

Cited by: 95
Authors
Meller, J
Elber, R
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
[2] Nicholas Copernicus Univ, Dept Comp Methods, Torun, Poland
Source
PROTEINS-STRUCTURE FUNCTION AND GENETICS | 2001 / Vol. 45 / Issue 03
Keywords
linear programming; potential optimization; decoy structures; threading; gaps
DOI
10.1002/prot.1145
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
The design of scoring functions (or potentials) for threading that differentiate native-like from non-native structures at limited computational cost is an active field of research. We revisit two widely used families of threading potentials: the pairwise and profile models. To design optimal scoring functions we use linear programming (LP). The LP protocol makes it possible to measure the difficulty of a particular training set in conjunction with a specific form of the scoring function. Gapless threading demonstrates that pair potentials have a larger prediction capacity than profile energies. However, alignments with gaps are easier to compute with profile potentials. We therefore search for and propose a new profile model with prediction capacity comparable to that of contact potentials. A protocol to determine optimal energy parameters for gaps, using LP, is also presented. A statistical test, based on a combination of local and global Z-scores, is employed to filter out false positives. Extensive tests of the new protocol are presented. The new model provides an efficient alternative to threading with pair energies while maintaining comparable accuracy. The code, databases, and a prediction server are available at http://www.tc.cornell.edu/CBIO/loopp. © 2001 Wiley-Liss, Inc.
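The double statistical filter mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Z-score of the common form Z = (mean reference energy - candidate energy) / standard deviation, computed once for a global alignment and once for a local alignment, and it accepts a hit only when both scores clear a cutoff. The function names, the reference energy lists, and the cutoff values are all hypothetical.

```python
from statistics import mean, stdev


def z_score(candidate_energy, reference_energies):
    """Return (mean(reference) - candidate) / stdev(reference).

    Assumed convention: lower energy is better, so a larger Z-score
    means the candidate stands out more from the reference distribution.
    """
    sigma = stdev(reference_energies)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference energies have zero variance")
    return (mean(reference_energies) - candidate_energy) / sigma


def passes_double_filter(z_global, z_local, global_cutoff=3.0, local_cutoff=2.0):
    """Accept a hit only if both Z-scores reach their cutoffs.

    The cutoff values are illustrative placeholders, not values from the paper.
    """
    return z_global >= global_cutoff and z_local >= local_cutoff


if __name__ == "__main__":
    # Hypothetical energies for one candidate alignment and its reference sets.
    z_g = z_score(-10.0, [0.0, 2.0, 4.0])
    z_l = z_score(-3.0, [0.0, 1.0, 2.0])
    print(z_g, z_l, passes_double_filter(z_g, z_l))
```

Requiring both scores to pass reflects the idea of combining two tests: a hit that looks significant under only one of them is treated as a likely false positive.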
Pages: 241-261
Page count: 21