MACHINE LEARNING APPROACH FOR THE PREDICTION OF PROTEIN SECONDARY STRUCTURE

Cited by: 71
Authors
KING, RD
STERNBERG, MJE
Affiliations
[1] IMPERIAL CANC RES FUND,BIOMOLEC MODELLING LAB,POB 123,44 LINCOLNS INN FIELDS,LONDON WC2A 3PX,ENGLAND
[2] TURING INST,GLASGOW G1 2AD,SCOTLAND
[3] UNIV LONDON BIRKBECK COLL,DEPT CRISTALLOG,MOLEC BIOL LAB,LONDON WC1E 7HX,ENGLAND
DOI
10.1016/S0022-2836(05)80333-X
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
PROMIS (protein machine induction system), a program for machine learning, was used to generalize rules that characterize the relationship between primary and secondary structure in globular proteins. These rules can be used to predict an unknown secondary structure from a known primary structure. The symbolic induction method used by PROMIS was specifically designed to produce rules that are meaningful in terms of chemical properties of the residues. The rules found were compared with existing knowledge of protein structure: some features of the rules were already recognized (e.g. amphipathic nature of α-helices). Other features are not understood, and are under investigation. The rules produced a prediction accuracy for three states (α-helix, β-strand and coil) of 60% for all proteins, 73% for proteins of known α domain type, 62% for proteins of known β domain type and 59% for proteins of known α/β domain type. We conclude that machine learning is a useful tool in the examination of the large databases generated in molecular biology. © 1990 Academic Press Limited.
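The three-state accuracy figures quoted in the abstract can be illustrated with a minimal sketch, assuming that accuracy means the fraction of residues whose predicted state matches the observed state; this record does not give the exact scoring procedure. The function name `three_state_accuracy`, the one-letter codes H (α-helix), E (β-strand) and C (coil), and the toy strings are assumptions made for illustration and are not taken from the paper or from PROMIS.

```python
# Assumed one-letter state codes: H = alpha-helix, E = beta-strand, C = coil.
ALLOWED_STATES = set("HEC")


def three_state_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of residues whose predicted state equals the observed state.

    This is a per-residue score under the assumption stated above,
    not necessarily the scoring used in the paper.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    if not (set(predicted) <= ALLOWED_STATES and set(observed) <= ALLOWED_STATES):
        raise ValueError("states must be one of H, E, C")
    # Count positions where the two state strings agree.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical toy data, one state per residue.
observed = "CCHHHHHCCEEEECC"
predicted = "CCHHHHCCCEEECCC"
accuracy = three_state_accuracy(predicted, observed)
print(f"three-state accuracy: {accuracy:.1%}")
```

Scoring each domain class (α, β, α/β) separately with the same function would give per-class figures comparable in form to those in the abstract.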
Pages: 441-457
Page count: 17