Approximately unbiased tests of regions using multistep-multiscale bootstrap resampling

Cited by: 256
Author
Shimodaira, H [1]
Affiliation
[1] Tokyo Inst Technol, Dept Math & Comp Sci, Meguro Ku, Tokyo 1528552, Japan
Keywords
problem of regions; approximately unbiased tests; third-order accuracy; bootstrap probability; curvature; bias correction
DOI
10.1214/009053604000000823
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification code
020208 ; 070103 ; 0714 ;
Abstract
Approximately unbiased tests based on bootstrap probabilities are considered for the exponential family of distributions with unknown expectation parameter vector, where the null hypothesis is represented as an arbitrary-shaped region with smooth boundaries. This problem has been discussed previously in Efron and Tibshirani [Ann. Statist. 26 (1998) 1687-1718], and a corrected p-value with second-order asymptotic accuracy is calculated by the two-level bootstrap of Efron, Halloran and Holmes [Proc. Natl. Acad. Sci. U.S.A. 93 (1996) 13429-13434] based on the ABC bias correction of Efron [J. Amer. Statist. Assoc. 82 (1987) 171-185]. Our argument is an extension of their asymptotic theory, where the geometry, such as the signed distance and the curvature of the boundary, plays an important role. We give another calculation of the corrected p-value without finding the "nearest point" on the boundary to the observation, which is required in the two-level bootstrap and is an implementational burden in complicated problems. The key idea is to alter the sample size of the replicated dataset from that of the observed dataset. The frequency of the replicates falling in the region is counted for several sample sizes, and then the p-value is calculated by looking at the change in the frequencies along the changing sample sizes. This is the multiscale bootstrap of Shimodaira [Systematic Biology 51 (2002) 492-508], which is third-order accurate for the multivariate normal model. Here we introduce a newly devised multistep-multiscale bootstrap, calculating a third-order accurate p-value for the exponential family of distributions. In fact, our p-value is asymptotically equivalent to those obtained by the double bootstrap of Hall [The Bootstrap and Edgeworth Expansion (1992) Springer, New York] and the modified signed likelihood ratio of Barndorff-Nielsen [Biometrika 73 (1986) 307-322] ignoring O(n^{-3/2}) terms, yet the computation is less demanding and free from model specification. The algorithm is remarkably simple despite the complexity of the theory behind it. The differences of the p-values are illustrated in simple examples, and the accuracies of the bootstrap methods are shown in a systematic way.
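The counting-and-extrapolating idea described in the abstract can be illustrated with a minimal sketch of the earlier single-step multiscale bootstrap for the multivariate normal model. This is not the multistep-multiscale procedure introduced in the paper. The function name `multiscale_bootstrap_pvalue`, the scale grid, the unweighted least-squares fit, and the half-space example are all assumptions made for illustration. A replicate sample size n' corresponds to a noise scale sigma^2 = n/n'. The sketch assumes the commonly used scaling law sigma * Phi^{-1}(1 - BP_sigma) ≈ v + c * sigma^2 and extrapolates to sigma^2 = -1, giving p = 1 - Phi(v - c).

```python
import numpy as np
from statistics import NormalDist


def multiscale_bootstrap_pvalue(y, in_region, scales, n_boot=20000, seed=0):
    """Illustrative multiscale bootstrap for y ~ N(mu, I) (hypothetical helper).

    y         : observed vector
    in_region : function mapping an (n_boot, dim) array to a boolean array
    scales    : values of sigma^2 = n / n' (ratio of original to replicate size)
    """
    rng = np.random.default_rng(seed)
    nd = NormalDist()
    sigma2 = np.asarray(scales, dtype=float)
    psi = []
    for s2 in sigma2:
        # Replicates drawn around the observation with variance sigma^2.
        reps = y + np.sqrt(s2) * rng.standard_normal((n_boot, y.size))
        # Bootstrap probability: frequency of replicates falling in the region.
        bp = float(np.mean(in_region(reps)))
        # Keep bp strictly inside (0, 1) so the normal quantile is finite.
        bp = min(max(bp, 0.5 / n_boot), 1.0 - 0.5 / n_boot)
        psi.append(np.sqrt(s2) * nd.inv_cdf(1.0 - bp))
    # Unweighted linear fit psi = v + c * sigma^2 (a simplifying assumption).
    c, v = np.polyfit(sigma2, psi, 1)
    # Extrapolate to sigma^2 = -1 to obtain the corrected p-value.
    return 1.0 - nd.cdf(v - c), v, c


# Hypothetical example: flat-boundary region {mu : mu_1 <= 0}, observation at distance 1.5.
y_obs = np.array([1.5, 0.0])
p_au, v_hat, c_hat = multiscale_bootstrap_pvalue(
    y_obs, lambda reps: reps[:, 0] <= 0.0, scales=[0.5, 0.75, 1.0, 1.5, 2.0]
)
exact = 1.0 - NormalDist().cdf(1.5)  # exact p-value for a half-space
```

For a half-space the boundary has no curvature, so the fitted `c_hat` should be close to zero and `p_au` close to `exact`, up to Monte Carlo error. A curved region would give a nonzero slope, which is the information the change in frequencies across sample sizes is meant to capture.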
Pages: 2616-2641
Number of pages: 26
Related papers
21 in total
[1] [Anonymous], 1998, Applied regression analysis, DOI 10.1002/9781118625590
[2] Barndorff-Nielsen OE., 1994, INFERENCE ASYMPTOTIC
[3] BARNDORFFNIELSEN OE, 1986, BIOMETRIKA, V73, P307
[4] MORE ACCURATE CONFIDENCE-INTERVALS IN EXPONENTIAL-FAMILIES [J]. DICICCIO, T; EFRON, B. BIOMETRIKA, 1992, 79 (02): 231-245
[5] Bootstrap confidence levels for phylogenetic trees (vol 93, pg 7085, 1996) [J]. Efron, B; Halloran, E; Holmes, S. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1996, 93 (23): 13429-13434
[6] EFRON B, 1987, J AM STAT ASSOC, V82, P171, DOI 10.2307/2289144
[7] BOOTSTRAP CONFIDENCE-INTERVALS FOR A CLASS OF PARAMETRIC PROBLEMS [J]. EFRON, B. BIOMETRIKA, 1985, 72 (01): 45-58
[8] Efron B, 1998, ANN STAT, V26, P1687
[9] FELSENSTEIN J, 1985, EVOLUTION, V39, P783, DOI 10.1111/j.1558-5646.1985.tb00420.x
[10] Hall P., 1992, BOOTSTRAP EDGEWORTH