Choosing the Sample Size of a Computer Experiment: A Practical Guide

Cited by: 479
Authors
Loeppky, Jason L. [1]
Sacks, Jerome [2]
Welch, William J. [3]
Affiliations
[1] Univ British Columbia, Kelowna, BC V1V 1V7, Canada
[2] Natl Inst Stat Sci, Res Triangle Pk, NC 27709 USA
[3] Univ British Columbia, Dept Stat, Vancouver, BC V6T 1Z2, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Curse of dimensionality; Effect sparsity; Gaussian process; Latin hypercube design; Prediction accuracy; Random function; Gaussian process models; Calibration; Designs
DOI
10.1198/TECH.2009.08040
Chinese Library Classification
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
We provide reasons and evidence supporting the informal rule that the number of runs for an effective initial computer experiment should be about 10 times the input dimension. Our arguments quantify two key characteristics of computer codes that affect the sample size required for a desired level of accuracy when approximating the code via a Gaussian process (GP). The first characteristic is the total sensitivity of a code output variable to all input variables; the second corresponds to the way this total sensitivity is distributed across the input variables, specifically the possible presence of a few prominent input factors and many impotent ones (i.e., effect sparsity). Both measures relate directly to the correlation structure in the GP approximation of the code. In this way, the article moves toward a more formal treatment of sample size for a computer experiment. The evidence supporting these arguments stems primarily from a simulation study and via specific codes modeling climate and ligand activation of G-protein.
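As a rough illustration of the rule stated in the abstract, the sketch below sets the number of runs to 10 times the input dimension and draws a random Latin hypercube design of that size. The function names (`latin_hypercube`, `initial_design`), the unit-cube scaling, and the choice of a plain random Latin hypercube are assumptions made for this example; they are not taken from the article.

```python
import random


def latin_hypercube(n_runs, n_inputs, seed=None):
    """Return a random Latin hypercube design on [0, 1)^n_inputs.

    Hypothetical helper for illustration: each input's range is cut into
    n_runs equal-width strata and every stratum is sampled exactly once.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_inputs):
        # One uniform draw inside each stratum, then shuffle the stratum order.
        column = [(k + rng.random()) / n_runs for k in range(n_runs)]
        rng.shuffle(column)
        columns.append(column)
    # Transpose so that each row is one run of the computer code.
    return [list(run) for run in zip(*columns)]


def initial_design(n_inputs, runs_per_input=10, seed=None):
    """Size the initial experiment by the informal n = 10 * d rule."""
    n_runs = runs_per_input * n_inputs
    return latin_hypercube(n_runs, n_inputs, seed)


design = initial_design(n_inputs=4, seed=0)
print(len(design), len(design[0]))  # 40 4
```

The `runs_per_input` parameter is left adjustable because the abstract ties the required sample size to the code's total sensitivity and to effect sparsity, so 10 is a starting point rather than a fixed constant.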
Pages: 366-376
Page count: 11