Virtual screening system for finding structurally diverse hits by active learning

Cited by: 31
Authors
Fujiwara, Yukiko [1]
Yamashita, Yoshiko [2]
Osoda, Tsutomu [3]
Asogawa, Minoru [4]
Fukushima, Chiaki [5]
Asao, Masaaki [6]
Shimadzu, Hideshi [5]
Nakao, Kazuya [6]
Shimizu, Ryo [6]
Affiliations
[1] NEC Corp Ltd, Serv Platform Labs, Minato Ku, Tokyo 1088557, Japan
[2] NEC Corp Ltd, Business Innovat Ctr, Tsukuba, Ibaraki 3058501, Japan
[3] NEC Corp Ltd, Business Innovat Ctr, Tokyo 1088001, Japan
[4] NEC Corp Ltd, Nanoelect Labs, Tsukuba, Ibaraki 3058501, Japan
[5] Tanabe Seiyaku Co Ltd, Saitama 3358505, Japan
[6] Tanabe Seiyaku Co Ltd, Yodogawa Ku, Osaka 5328505, Japan
DOI
10.1021/ci700085q
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
Two virtual screening strategies based on active learning were devised: "query by bagging" (QBag) and "query by bagging with descriptor-sampling" (QBagDS). The QBag strategy generates multiple structure-activity relationship rules by bagging and selects compounds to improve the rules. To find many structurally diverse hits, the QBagDS strategy generates rules by bagging with descriptor sampling. Both strategies can also use prior knowledge about hits to improve efficiency at the beginning of screening. We performed simulation experiments and clustering analysis for several G-protein-coupled receptors and showed that the QBag and QBagDS strategies outperform the conventional similarity-based strategy and that using both descriptor sampling and prior knowledge is effective for finding many hits. We applied the bagging with descriptor sampling strategy to novel hit finding, and 4 of the 10 selected compounds showed high inhibition.
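The abstract gives only the outline of the two strategies, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes that each "rule" is a one-descriptor threshold classifier, that descriptor sampling means giving each rule a random subset of descriptor columns, and that compounds are chosen where the committee's votes are closest to a tie. All function and parameter names (`train_stump`, `build_committee`, `select_queries`, `n_rules`, `n_desc`) are hypothetical.

```python
import random


def train_stump(X, y, features):
    """Fit a one-descriptor threshold rule using only the given descriptors.

    Assumption: a decision stump stands in for whatever rule learner
    the paper actually uses.
    """
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for sign in (1, -1):
                preds = [1 if sign * (row[f] - t) >= 0 else 0 for row in X]
                err = sum(p != label for p, label in zip(preds, y))
                if best is None or err < best[0]:
                    best = (err, f, t, sign)
    _, f, t, sign = best
    return lambda row: 1 if sign * (row[f] - t) >= 0 else 0


def build_committee(X, y, n_rules=15, n_desc=None, rng=None):
    """Train n_rules rules, each on a bootstrap resample of the labeled set.

    n_desc=None uses every descriptor (plain bagging); an integer gives
    each rule a random subset of that many descriptors.
    """
    rng = rng or random.Random(0)
    n, d = len(X), len(X[0])
    committee = []
    for _ in range(n_rules):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = list(range(d)) if n_desc is None else rng.sample(range(d), n_desc)
        committee.append(
            train_stump([X[i] for i in idx], [y[i] for i in idx], feats)
        )
    return committee


def select_queries(committee, pool, k):
    """Return indices of the k pool compounds whose votes are closest to a tie.

    Assumption: disagreement-based selection; the paper's actual
    selection criterion is not stated in the abstract.
    """
    def margin(row):
        votes = sum(rule(row) for rule in committee)
        return abs(2 * votes - len(committee))

    return sorted(range(len(pool)), key=lambda i: margin(pool[i]))[:k]
```

In a screening loop under these assumptions, the selected compounds would be assayed, added to the labeled set, and the committee rebuilt before the next round. Passing `n_desc` makes the rules depend on different descriptors, which is one plausible way to push the selection toward structurally different compounds.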
Pages: 930-940
Page count: 11