Estimating Box-Cox power transformation parameter via goodness-of-fit tests

Cited by: 43
Authors
Asar, Ozgur [1]
Ilk, Ozlem [2]
Dag, Osman [3]
Affiliations
[1] Univ Lancaster, Lancaster Med Sch, CHICAS, Lancaster, England
[2] Middle East Tech Univ, Dept Stat, Ankara, Turkey
[3] Hacettepe Univ, Dept Biostat, Fac Med, Ankara, Turkey
Keywords
Artificial covariate; Data transformation; Normality tests; Searching algorithms; Statistical software; FALSE DISCOVERY RATE; VARIANCE TEST; APPROXIMATE ANALYSIS; NORMALITY;
DOI
10.1080/03610918.2014.957839
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
The Box-Cox power transformation is a commonly used methodology for transforming the distribution of data into a normal distribution. The methodology relies on a single transformation parameter. In this study, we focus on the estimation of this parameter. For this purpose, we employ seven popular goodness-of-fit tests for normality, namely the Shapiro-Wilk, Anderson-Darling, Cramér-von Mises, Pearson Chi-square, Shapiro-Francia, Lilliefors and Jarque-Bera tests, together with a searching algorithm. The searching algorithm finds the argument of the minimum or maximum of the test statistic, depending on the test: the maximum for Shapiro-Wilk and Shapiro-Francia, and the minimum for the rest. The artificial covariate method of Dag et al. (2014) is also included for comparison purposes. Simulation studies are implemented to compare the performances of the methods. Results show that Shapiro-Wilk and the artificial covariate method are more effective than the others, and that Pearson Chi-square is the worst-performing method. The methods are also applied to two real-life datasets. The R package AID is proposed for implementation of the aforementioned methods.
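The search idea in the abstract can be illustrated with a minimal sketch. This is not the AID package's code: it is a standard-library Python illustration that assumes a plain grid search over candidate values of the transformation parameter and uses the Jarque-Bera statistic, one of the tests for which the abstract says the minimum is taken. The function names (`box_cox`, `jarque_bera`, `estimate_lambda`), the grid range of -3 to 3 and the step of 0.01 are all assumptions made for the example.

```python
import math
import random


def box_cox(y, lam):
    """Box-Cox transform of positive data: (y**lam - 1) / lam, or log(y) at lam = 0."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]


def jarque_bera(x):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4), from sample skewness S and kurtosis K."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def estimate_lambda(y, lo=-3.0, hi=3.0, step=0.01):
    """Grid search (assumed range and step): return the lambda minimising the statistic."""
    steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(steps + 1)]
    return min(grid, key=lambda lam: jarque_bera(box_cox(y, lam)))


# Simulated log-normal data, for which the log transform (lambda = 0) gives normality.
rng = random.Random(0)
y = [math.exp(rng.gauss(0.0, 0.5)) for _ in range(500)]
lam_hat = estimate_lambda(y)
print(f"estimated lambda: {lam_hat:.2f}")
```

For a test such as Shapiro-Wilk or Shapiro-Francia, where larger statistics indicate a better fit, the same sketch would use `max` instead of `min` with the corresponding statistic.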
Pages: 91-105
Page count: 15
Related papers
27 in total
[1]   A TEST OF GOODNESS OF FIT [J].
ANDERSON, TW ;
DARLING, DA .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1954, 49 (268) :765-769
[2]  
[Anonymous], 2013, R LANG ENV STAT COMP
[3]  
[Anonymous], 2010, NEW PALGRAVE DICT EC
[4]  
[Anonymous], 2010, MATLAB LANG TECHN CO
[5]  
Barrios E, 2012, BHH2 USEFUL FUNCTION
[6]  
Benjamini Y, 2001, ANN STAT, V29, P1165
[7]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[8]   AN ANALYSIS OF TRANSFORMATIONS [J].
BOX, GEP ;
COX, DR .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1964, 26 (02) :211-252
[9]   The design of simulation studies in medical statistics [J].
Burton, Andrea ;
Altman, Douglas G. ;
Royston, Patrick ;
Holder, Roger L. .
STATISTICS IN MEDICINE, 2006, 25 (24) :4279-4292
[10]  
Cramér H, 1928, SKAND AKTUARIETIDSKR, V11, P141