Combinatorial QSAR modeling of chemical toxicants tested against Tetrahymena pyriformis

Cited by: 219
Authors
Zhu, Hao [1,2]
Tropsha, Alexander [1,2]
Fourches, Denis [3 ]
Varnek, Alexandre [3 ]
Papa, Ester [4 ]
Gramatica, Paola [4 ]
Oberg, Tomas [5 ]
Dao, Phuong [6 ]
Cherkasov, Artem [6 ]
Tetko, Igor V. [7,8]
Affiliations
[1] Univ N Carolina, Lab Mol Modeling, Div Med Chem & Nat Prod, Chapel Hill, NC 27599 USA
[2] Univ N Carolina, Sch Pharm, Carolina Exploratory Ctr Cheminformat Res, Chapel Hill, NC 27599 USA
[3] Univ Strasbourg 1, Inst Chem, Labs Chemoinformat, Strasbourg, France
[4] Univ Insubria, Dept Struct & Funct Biol, QSAR Res Unit Environm Chem & Ecotoxicol, Varese, Italy
[5] Univ Kalmar, Sch Pure & Appl Nat Sci, SE-39182 Kalmar, Sweden
[6] Univ British Columbia, Fac Med, Div Infect Dis, Vancouver, BC V5Z 3J5, Canada
[7] German Res Ctr Environm Hlth, Inst Bioinformat, Helmholtz Ctr Munich, D-85764 Neuherberg, Germany
[8] Inst Bioorgan & Petrochem, UA-02660 Kiev, Ukraine
DOI
10.1021/ci700443v
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
Selecting the most rigorous quantitative structure-activity relationship (QSAR) approaches is of great importance in the development of robust and predictive models of chemical toxicity. To address this issue in a systematic way, we have formed an international virtual collaboratory consisting of six independent groups with shared interests in computational chemical toxicology. We have compiled an aqueous toxicity data set containing 983 unique compounds tested in the same laboratory over a decade against Tetrahymena pyriformis. A modeling set of 644 compounds was selected randomly from the original set and distributed to all groups, each of which used its own QSAR tools for model development. The remaining 339 compounds in the original set (external set I) as well as 110 additional compounds (external set II) published recently by the same laboratory (after this computational study was already in progress) were used as two independent validation sets to assess the external predictive power of individual models. In total, our virtual collaboratory has developed 15 different types of QSAR models of aquatic toxicity for the training set. The internal prediction accuracy for the modeling set ranged from 0.76 to 0.93 as measured by the leave-one-out cross-validation correlation coefficient ($Q^2_{\mathrm{abs}}$). The prediction accuracy for the external validation sets I and II ranged from 0.71 to 0.85 (linear regression coefficient $R^2_{\mathrm{absI}}$) and from 0.38 to 0.83 (linear regression coefficient $R^2_{\mathrm{absII}}$), respectively. The use of an applicability domain threshold implemented in most models generally improved the external prediction accuracy but at the same time led to a decrease in chemical space coverage. Finally, several consensus models were developed by averaging the predicted aquatic toxicity for every compound using all 15 models, with or without taking into account their respective applicability domains. We find that consensus models afford higher prediction accuracy for the external validation data sets, with the highest chemical space coverage, as compared to individual constituent models. Our studies prove the power of a collaborative and consensual approach to QSAR model development. The best validated models of aquatic toxicity developed by our collaboratory (both individual and consensus) can be used as reliable computational predictors of aquatic toxicity and are available from any of the participating laboratories.
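The consensus step described in the abstract (averaging per-compound predictions across models, with or without applicability domains) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `consensus_predict` and `coverage`, the data layout, and the choice to leave a compound unpredicted when no model covers it are all assumptions made for illustration.

```python
from statistics import mean
from typing import List, Optional, Sequence


def consensus_predict(
    predictions: Sequence[Sequence[float]],
    in_domain: Optional[Sequence[Sequence[bool]]] = None,
) -> List[Optional[float]]:
    """Average per-compound predictions across models.

    predictions[m][i] is the toxicity predicted by model m for compound i.
    in_domain[m][i] (hypothetical layout) says whether compound i falls
    inside model m's applicability domain; if omitted, every model votes.
    """
    n_compounds = len(predictions[0])
    consensus: List[Optional[float]] = []
    for i in range(n_compounds):
        votes = [
            model_preds[i]
            for m, model_preds in enumerate(predictions)
            if in_domain is None or in_domain[m][i]
        ]
        # Assumption: a compound outside every domain gets no prediction.
        consensus.append(mean(votes) if votes else None)
    return consensus


def coverage(consensus: Sequence[Optional[float]]) -> float:
    """Fraction of compounds that received a consensus prediction."""
    return sum(v is not None for v in consensus) / len(consensus)


# Toy values, not data from the study.
preds = [[1.0, 2.0, 3.0], [3.0, 2.0, 5.0]]
domains = [[True, False, False], [True, True, False]]
all_models = consensus_predict(preds)
with_domains = consensus_predict(preds, domains)
```

In this toy case, restricting votes to each model's applicability domain leaves the third compound without a prediction, which mirrors the trade-off between accuracy and chemical space coverage noted in the abstract.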
Pages: 766-784
Page count: 19