Analysis of a large structure/biological activity data set using recursive partitioning

Cited by: 154
Authors
Rusinko, A [1]
Farmen, MW [1]
Lambert, CG [1]
Brown, PL [1]
Young, SS [1]
Affiliations
[1] Glaxo Wellcome Inc, Res Informat Syst, Res Triangle Pk, NC 27709 USA
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 1999 / Vol. 39 / No. 6
DOI
10.1021/ci9903049
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Combinatorial chemistry and high-throughput screening are revolutionizing the process of lead discovery in the pharmaceutical industry. Large numbers of structures and vast quantities of biological assay data are quickly being accumulated, overwhelming traditional structure/activity relationship (SAR) analysis technologies. Recursive partitioning is a method for statistically determining rules that classify objects into similar categories or, in this case, structures into groups of molecules with similar potencies. SCAM is a computer program implemented to make extremely efficient use of this methodology. Depending on the size of the data set, rules explaining biological data can be determined interactively. An example data set of 1650 monoamine oxidase inhibitors exemplifies the method, yielding substructural rules and leading to general classifications of these inhibitors. The method scales linearly with the number of descriptors, so hundreds of thousands of structures can be analyzed utilizing thousands to millions of molecular descriptors. There are currently no methods to deal with statistical analysis problems of this size. An important aspect of this analysis is the ability to deal with mixtures, i.e., identify SAR rules for classes of compounds in the same data set that might be binding in different ways. Most current quantitative structure/activity relationship methods require that the compounds follow a single mechanism. Advantages and limitations of this methodology are presented.
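The idea of recursive partitioning described in the abstract can be illustrated with a minimal sketch. It is not the SCAM implementation: the splitting criterion (reduction in the sum of squared errors), the function names, the `min_size` stopping rule, and the toy data are all assumptions made for illustration. Each compound is a tuple of binary substructure descriptors (1 = present), and each split asks whether one descriptor is present.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, potencies, min_size):
    """Return (descriptor index, SSE reduction) of the best split, or None.

    Only splits that leave at least min_size compounds on each side count.
    """
    parent = sse(potencies)
    best = None
    for j in range(len(rows[0])):
        present = [p for r, p in zip(rows, potencies) if r[j]]
        absent = [p for r, p in zip(rows, potencies) if not r[j]]
        if len(present) < min_size or len(absent) < min_size:
            continue
        gain = parent - sse(present) - sse(absent)
        if best is None or gain > best[1]:
            best = (j, gain)
    return best


def partition(rows, potencies, min_size=2, min_gain=1e-9):
    """Recursively split compounds into groups of similar potency."""
    node = {"n": len(rows), "mean": mean(potencies)}
    split = best_split(rows, potencies, min_size)
    if split is None or split[1] <= min_gain:
        return node  # leaf: no admissible split improves the fit
    j = split[0]
    yes = [(r, p) for r, p in zip(rows, potencies) if r[j]]
    no = [(r, p) for r, p in zip(rows, potencies) if not r[j]]
    node["feature"] = j
    node["present"] = partition([r for r, _ in yes], [p for _, p in yes],
                                min_size, min_gain)
    node["absent"] = partition([r for r, _ in no], [p for _, p in no],
                               min_size, min_gain)
    return node


# Hypothetical toy data: descriptor 0 marks one active class, descriptor 2
# marks a second, weaker class; descriptor 1 is uninformative.
rows = [(1, 0, 0), (1, 1, 0), (1, 0, 0), (0, 0, 1),
        (0, 1, 1), (0, 0, 0), (0, 1, 0), (0, 0, 0)]
potencies = [9.0, 8.8, 9.2, 6.0, 6.2, 2.0, 2.1, 1.9]

tree = partition(rows, potencies)
```

On this toy set the root splits on descriptor 0, and the compounds lacking it are split again on descriptor 2, so two distinct active classes in the same data set end up under separate rules. Because each candidate split scans every descriptor once, the cost per node grows linearly with the number of descriptors in this sketch.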
Pages: 1017-1026
Page count: 10