Predicting the genotoxicity of secondary and aromatic amines using data subsetting to generate a model ensemble

Cited by: 42
Authors
Mattioni, BE
Kauffman, GW
Jurs, PC
Custer, LL
Durham, SK
Pearl, GM
Affiliations
[1] Penn State Univ, Dept Chem, University Pk, PA 16802 USA
[2] Bristol Myers Squibb Co, Princeton, NJ 08453 USA
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 2003 / Vol. 43 / Issue 03
DOI
10.1021/ci034013i
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Binary quantitative structure-activity relationship (QSAR) models are developed to classify a data set of 334 aromatic and secondary amine compounds as genotoxic or nongenotoxic based on information calculated solely from chemical structure. Genotoxic endpoints for each compound were determined using the SOS Chromotest in both the presence and absence of an S9 rat liver homogenate. Compounds were considered genotoxic if assay results indicated a positive genotoxicity hit for either the S9-inactivated or S9-activated assay. Each compound in the data set was encoded through the calculation of numerical descriptors that describe various aspects of chemical structure (e.g., topological, geometric, electronic, polar surface area). Furthermore, five additional descriptors that focused on the secondary and aromatic nitrogen atoms in each molecule were calculated specifically for this study. Descriptor subsets were examined using a genetic algorithm search engine interfaced with a k-Nearest Neighbor fitness evaluator to find the most information-rich subsets, which ultimately served as the final predictive models. Models were chosen for their ability to minimize the total number of misclassifications, with special attention given to those models that possessed fewer occurrences of positive toxicity hits being misclassified as nontoxic (false negatives). In addition, a subsetting procedure was used to form an ensemble of models using different combinations of compounds in the training and prediction sets. This was done to ensure that consistent results could be obtained regardless of training set composition. The procedure also allowed for each compound to be externally validated three times by different training set data, with the resultant predictions being used in a "majority rules" voting scheme to produce a consensus prediction for each member of the data set.
The individual models produced an average training set classification rate of 71.6% and an average prediction set classification rate of 67.7%. However, the model ensemble was able to correctly classify the genotoxicity of 72.2% of all prediction set compounds.
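The "majority rules" voting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the compound identifiers, the label strings, and the function name `consensus_prediction` are hypothetical, and the sketch only assumes that each compound receives three binary predictions from models whose training sets excluded it.

```python
from collections import Counter


def consensus_prediction(votes):
    """Return the majority label among an odd number of binary votes."""
    if len(votes) % 2 == 0:
        raise ValueError("need an odd number of votes to avoid ties")
    # most_common(1) yields the single (label, count) pair with the highest count.
    label, _count = Counter(votes).most_common(1)[0]
    return label


# Hypothetical data: compound id -> labels from three models, each trained
# on a training set that did not contain that compound.
predictions = {
    "amine_001": ["genotoxic", "genotoxic", "nongenotoxic"],
    "amine_002": ["nongenotoxic", "nongenotoxic", "nongenotoxic"],
    "amine_003": ["nongenotoxic", "genotoxic", "genotoxic"],
}

# One consensus label per compound.
consensus = {cid: consensus_prediction(v) for cid, v in predictions.items()}
```

With two classes and three votes, one label always holds a strict majority, so every compound gets an unambiguous consensus label.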
Pages: 949-963
Page count: 15