Extended Bayesian information criteria for model selection with large model spaces

Cited by: 1324
Authors
Chen, Jiahua [1]
Chen, Zehua [2]
Affiliations
[1] Univ British Columbia, Dept Stat, Vancouver, BC V6T 1Z2, Canada
[2] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore 117546, Singapore
Keywords
Bayesian paradigm; consistency; genome-wide association study; tournament approach; variable selection
DOI
10.1093/biomet/asn034
Chinese Library Classification number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
The ordinary Bayesian information criterion is too liberal for model selection when the model space is large. In this paper, we re-examine the Bayesian paradigm for model selection and propose an extended family of Bayesian information criteria, which take into account both the number of unknown parameters and the complexity of the model space. Their consistency is established, in particular allowing the number of covariates to increase to infinity with the sample size. Their performance in various situations is evaluated by simulation studies. It is demonstrated that the extended Bayesian information criteria incur a small loss in the positive selection rate but tightly control the false discovery rate, a desirable property in many applications. The extended Bayesian information criteria are extremely useful for variable selection in problems with a moderate sample size but with a huge number of covariates, especially in genome-wide association studies, which are now an active area in genetics research.
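The criterion summarized in the abstract can be illustrated with a minimal sketch. The abstract itself gives no formula; the form assumed here is the one usually quoted for this family, EBIC_gamma = -2 log L + k log n + 2 gamma log C(p, k), where k is the number of selected covariates, n the sample size, p the number of candidate covariates, and C(p, k) the number of models of size k. The function name `ebic` and its parameter names are hypothetical, chosen only for illustration.

```python
import math


def ebic(log_likelihood, n_params, n_samples, n_covariates, gamma):
    """Assumed form: -2 log L + k log n + 2 gamma log C(p, k).

    gamma = 0 gives the ordinary BIC; larger gamma penalizes models
    more heavily when the space of candidate models is large.
    """
    if not 0 <= n_params <= n_covariates:
        raise ValueError("n_params must lie between 0 and n_covariates")
    # log C(p, k) via log-gamma, which stays finite for very large p
    log_model_count = (
        math.lgamma(n_covariates + 1)
        - math.lgamma(n_params + 1)
        - math.lgamma(n_covariates - n_params + 1)
    )
    return (
        -2.0 * log_likelihood
        + n_params * math.log(n_samples)
        + 2.0 * gamma * log_model_count
    )
```

Under this assumed form, the extra term grows with the number of candidate models of the same size, which is one way to read the abstract's statement that the criteria account for the complexity of the model space as well as the number of unknown parameters.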
Pages: 759-771
Page count: 13