The discovery of indicator variables for QSAR using inductive logic programming

Cited by: 19
Authors
King, RD [1]
Srinivasan, A
Affiliations
[1] Univ Wales, Dept Comp Sci, Aberystwyth SY23 3DB, Ceredigion, Wales
[2] Univ Oxford, Comp Lab, Oxford OX1 3QD, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
artificial intelligence; machine learning; regression;
DOI
10.1023/A:1007967728701
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
A central problem in forming accurate regression equations in QSAR studies is the selection of appropriate descriptors for the compounds under study. We describe a novel procedure for using inductive logic programming (ILP) to discover new indicator variables (attributes) for QSAR problems, and show that these improve the accuracy of the derived regression equations. ILP techniques have previously been shown to work well on drug design problems where there is a large structural component or where clear, comprehensible rules are required. However, ILP techniques have had the disadvantage of only being able to make qualitative predictions (e.g. active, inactive) and not to predict real numbers (regression). We unify ILP and linear regression techniques to give a QSAR method that has the strength of ILP at describing steric structure, with the familiarity and power of linear regression. We evaluated the utility of this new QSAR technique by examining the prediction of biological activity with and without the addition of new structural indicator variables formed by ILP. In three out of five datasets examined, the addition of ILP variables produced statistically significantly better results (P < 0.01) than the original description. The new ILP variables did not increase the overall complexity of the derived QSAR equations and added insight into possible mechanisms of action. We conclude that ILP can aid in the process of drug design.
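The core idea in the abstract, adding a binary structural indicator variable to an ordinary linear regression, can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the data are synthetic, the two numeric descriptors are hypothetical, and the 0/1 column `has_feature` stands in for a rule that an ILP system might learn. It does not reproduce the paper's ILP procedure, datasets, or evaluation.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept term; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def rmse(X, y, coef):
    """Root-mean-square error of the fitted linear model on (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    return float(np.sqrt(np.mean((A @ coef - y) ** 2)))


# Synthetic, invented data (not from the paper).
rng = np.random.default_rng(0)
n = 60
descriptors = rng.normal(size=(n, 2))      # two hypothetical numeric descriptors
has_feature = rng.integers(0, 2, size=n)   # hypothetical structural rule: 1 if it holds
activity = (
    1.5 * descriptors[:, 0]
    - 0.7 * descriptors[:, 1]
    + 2.0 * has_feature                    # structural effect the descriptors miss
    + rng.normal(scale=0.1, size=n)
)

# Regression on the original descriptors only.
base_coef = fit_linear(descriptors, activity)
base_err = rmse(descriptors, activity, base_coef)

# Regression with the indicator variable appended as an extra column.
X_aug = np.column_stack([descriptors, has_feature])
aug_coef = fit_linear(X_aug, activity)
aug_err = rmse(X_aug, activity, aug_coef)

print(f"RMSE without indicator: {base_err:.3f}")
print(f"RMSE with indicator:    {aug_err:.3f}")
```

In this sketch the comparison uses training error on made-up data, so it only shows the mechanics of augmenting the design matrix; a real comparison would need held-out data or cross-validation.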
Pages: 571-580
Page count: 10