MAP: An iterative experimental design methodology for the optimization of catalytic search space structure modeling

Cited by: 22
Author
Baumes, LA
Affiliations
[1] Max Planck Inst Kohlenforsch, D-45470 Mulheim, Germany
[2] Inst Rech Catalyse, CNRS, F-69626 Villeurbanne, France
Source
JOURNAL OF COMBINATORIAL CHEMISTRY | 2006 / Vol. 8 / Issue 03
Keywords
DOI
10.1021/cc050130+
Chinese Library Classification number
O69 [Applied Chemistry];
Discipline classification code
081704 ;
Abstract
One of the main problems in high-throughput materials research remains the design of experiments. At the early stages of discovery programs, purely exploratory methodologies coupled with fast screening tools should be employed. This should create opportunities to find unexpected catalytic results and to identify the "groups" of catalyst outputs, providing well-defined boundaries for future optimizations. However, very few recent papers deal with strategies that guide exploratory studies. Mostly, traditional designs, homogeneous covering, or simple random sampling are used. Typical catalytic output distributions yield unbalanced datasets on which efficient learning is difficult, and interesting but rare classes usually go unrecognized. A new iterative algorithm is proposed here for characterizing the search space structure; it works independently of the learning process. It enhances recognition rates by transferring catalysts to be screened from "performance-stable" zones of the space to "unsteady" ones, which require more experiments to be modeled well. Because past proofs of the efficiency of such algorithms are lacking, evaluating new algorithms through benchmarks is compulsory. The method is detailed and thoroughly tested on mathematical functions of differing complexity. The strategy is not only evaluated empirically; the effect of the sampling on subsequent machine learning performance is also quantified. The minimum sample size the algorithm requires to be statistically distinguishable from simple random sampling is also investigated.
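The idea of moving experiments from "performance-stable" zones to "unsteady" ones can be illustrated with a minimal sketch. This is not the MAP algorithm itself, whose details are not given in this record; it is a generic adaptive-sampling loop under stated assumptions. The benchmark function `label`, the fixed equal-width zones, the use of Shannon entropy as the "unsteadiness" score, and all parameter names are hypothetical choices made for illustration only.

```python
import math
import random
from collections import Counter


def label(x):
    """Hypothetical benchmark: the class alternates only on [0.6, 0.8)."""
    if 0.6 <= x < 0.8:
        return int(x * 50) % 2  # alternating classes: an "unsteady" zone
    return 0  # single class elsewhere: "performance-stable" zones


def zone_entropy(labels):
    """Shannon entropy (in bits) of the class labels observed in one zone."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def adaptive_sample(n_zones=5, n_init=6, n_rounds=5, batch=20, seed=0):
    """Return the number of experiments spent in each zone of [0, 1)."""
    rng = random.Random(seed)
    width = 1.0 / n_zones
    observed = [[] for _ in range(n_zones)]  # labels seen so far, per zone

    def run_experiment(zone):
        x = (zone + rng.random()) * width  # uniform point inside the zone
        observed[zone].append(label(x))

    # Initial homogeneous covering: the same budget in every zone.
    for zone in range(n_zones):
        for _ in range(n_init):
            run_experiment(zone)

    # Each round, spend the batch in proportion to how mixed a zone looks.
    for _ in range(n_rounds):
        # The small floor (an assumption) keeps stable zones from being
        # abandoned entirely.
        weights = [zone_entropy(obs) + 0.05 for obs in observed]
        for zone in rng.choices(range(n_zones), weights=weights, k=batch):
            run_experiment(zone)

    return [len(obs) for obs in observed]


if __name__ == "__main__":
    print(adaptive_sample())
```

With these assumed settings, most of the later batches land in the zone covering [0.6, 0.8), since it is the only one where more than one class is observed. A simple random sampling baseline would instead spread the same budget evenly across all zones.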
Pages: 304-314
Page count: 11
Related papers
30 in total
[1]  
Aha D. W., 1992, P 9 INT C MACH LEARN, P1
[2]  
[Anonymous], 1983, Statistical methods
[3]   Using Artificial Neural Networks to boost high-throughput discovery in heterogeneous catalysis [J].
Baumes, L ;
Farrusseng, D ;
Lengliz, M ;
Mirodatos, C .
QSAR & COMBINATORIAL SCIENCE, 2004, 23 (09) :767-778
[4]  
BAUMES LA, UNPUB J COMB CHEM
[5]  
BAUMES LA, 2003, LECT NOTES AI LNCS L
[6]  
Bem D. S., 2003, EXPT DESIGN HIGH THR, P89
[7]  
BLICKLE T, 1995, 6 INT C GEN ALG SAN
[8]  
Chakravarti IM., 1967, HDB METHODS APPL STA, P392, DOI DOI 10.1080/01621459.1968.11009335
[9]  
CHRISTENSEN LB, 1994, EXPT METHODOLOGY
[10]   NEAREST NEIGHBOR PATTERN CLASSIFICATION [J].
COVER, TM ;
HART, PE .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1967, 13 (01) :21-+