UNDERSTANDING ELONGATION - THE SCALE CONTAMINATED NORMAL FAMILY

Cited by: 24
Authors
GLEASON, JR [1]
Affiliations
[1] SYRACUSE UNIV, DEPT PSYCHOL, SYRACUSE, NY 13244 USA
Keywords
GRAPHICAL METHODS; HEAVY-TAILED DISTRIBUTIONS; QUANTILES; SIMULATION METHODS; TAIL WEIGHT; TUKEY-H FAMILY
DOI
10.2307/2290728
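Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract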
Scale-contaminated normal distributions have been widely used in numerical studies of robustness requiring distributions that are elongated; that is, stretched relative to Gaussian behavior. The contaminated normal family has much appeal, in part because the mechanism used to generate such random variables seems a realistic model for creating outliers. But this family also has serious deficiencies as a test bed for studying the effects of elongation. Specifically, there are no truly heavy-tailed contaminated normal distributions, and the parameters of the family control elongation in a way that is readily misunderstood. Thus, when sampling from the contaminated normal family, it is difficult to systematically manipulate elongation and easy to misconstrue results. A symmetric distribution is elongated to the extent that its quantiles depart from the center of symmetry more rapidly than do those of a Gaussian distribution. It is essential to distinguish between elongation or stretching and scale or dispersion. We argue that failure to make this distinction leads to misconceptions about the manner in which contaminated normals are non-Gaussian and about the way in which their shape is controlled by the two parameters: contamination rate and contaminant scale. The methods used here are largely graphical in nature and rely heavily on a scale-free diagnostic plot that shows both the magnitude and location of elongation in a symmetric distribution. This plot is supplemented by a simple approximation, in terms of standard normal quantiles, for the quantiles of the contaminated normal family. We also examine some published research based on the contaminated normal family and show that the diagnostic plot can aid in understanding trends in such results.
Finally, we briefly consider some alternatives to the contaminated normal family, distributions that may be useful in designing numerical studies because they permit a more clear-cut manipulation of elongation with easy generation of random variates.
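As an illustration of the ideas in the abstract, the following is a minimal sketch, not the paper's own diagnostic plot or approximation. It assumes the usual mixture form (1 - rate) N(0, 1) + rate N(0, scale^2), with scale >= 1, for the scale-contaminated normal. The function names and the `stretch` measure (the ratio of mixture to Gaussian quantiles, normalized by the same ratio at a reference probability of 0.75 so that it is scale-free) are hypothetical choices made for this example.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian: mean 0, standard deviation 1


def sample_contaminated(rng, rate, scale):
    """Draw one variate: N(0, scale^2) with probability `rate`, else N(0, 1)."""
    sd = scale if rng.random() < rate else 1.0
    return rng.gauss(0.0, sd)


def contaminated_cdf(x, rate, scale):
    """CDF of the assumed mixture (1 - rate) N(0, 1) + rate N(0, scale^2)."""
    return (1.0 - rate) * STD_NORMAL.cdf(x) + rate * STD_NORMAL.cdf(x / scale)


def contaminated_quantile(u, rate, scale, tol=1e-10):
    """Invert the mixture CDF by bisection; assumes 0.5 <= u < 1 and scale >= 1."""
    # For scale >= 1 the mixture quantile cannot exceed scale * z_u,
    # so this interval brackets the root.
    lo, hi = 0.0, scale * STD_NORMAL.inv_cdf(u) + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if contaminated_cdf(mid, rate, scale) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stretch(u, rate, scale, ref=0.75):
    """Hypothetical scale-free stretch measure; equals 1 for a Gaussian."""
    ratio_u = contaminated_quantile(u, rate, scale) / STD_NORMAL.inv_cdf(u)
    ratio_ref = contaminated_quantile(ref, rate, scale) / STD_NORMAL.inv_cdf(ref)
    return ratio_u / ratio_ref


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_contaminated(rng, rate=0.1, scale=3.0) for _ in range(5)]
    print("sample draws:", draws)
    for u in (0.90, 0.99, 0.999):
        print(u, stretch(u, rate=0.1, scale=3.0))
```

Because `stretch` divides by the ratio at a reference quantile, multiplying the whole distribution by a constant leaves it unchanged, which is one simple way to separate elongation from scale in the sense the abstract describes.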
Pages: 327-337
Number of pages: 11
Related papers
19 in total
  • [1] Abramowitz M., 1965, HDB MATH FUNCTIONS
  • [2] Andrews D F, 1972, ROBUST ESTIMATES LOC
  • [3] BRENT RP, 1973, ALGORITHMS MINIMIZAT, P58
  • [4] MALLOWS-TYPE BOUNDED-INFLUENCE-REGRESSION TRIMMED MEANS
    DEJONGH, PJ
    DEWET, T
    WELSH, AH
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1988, 83 (403) : 805 - 810
  • [5] A RETRIEVABLE RECIPE FOR INVERSE-T
    GAVER, DP
    KAFADAR, K
    [J]. AMERICAN STATISTICIAN, 1984, 38 (04) : 308 - 311
  • [6] GLEASON JR, 1991, COMPUTING SCI STATIS, P348
  • [7] HAMPEL FR, 1986, ROBUST STATISTICS AP
  • [8] HILL RW, 1977, J AM STAT ASSOC, V72, P828
  • [9] Hoaglin D. C., 1985, EXPLORING DATA TABLE, P461, DOI 10.1002/9781118150702.CHL1
  • [10] HOAGLIN DC, 1985, EXPLORING DATA TABLE, P417