Detection of outliers in reference distributions: Performance of Horn's algorithm

Cited by: 96
Authors
Solberg, HE
Lahti, A
Affiliations
[1] Radium Hosp HF, Rikshosp, Dept Biochem Med, Oslo, Norway
[2] Univ Hosp No Norway, Dept Gen Psychiat, Tromso, Norway
DOI
10.1373/clinchem.2005.058339
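Chinese Library Classification number
R446 [Laboratory diagnosis]; R-33 [Experimental medicine, medical experiments];
Discipline classification code
1001 [Basic medicine];
Abstract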
Background: Medical laboratory reference data may be contaminated with outliers that should be eliminated before estimation of the reference interval. A statistical test for outliers has been proposed by Paul S. Horn and coworkers (Clin Chem 2001;47:2137-45). The algorithm operates in 2 steps: (a) mathematically transform the original data to approximate a gaussian distribution; and (b) establish detection limits (Tukey fences) based on the central part of the transformed distribution. Methods: We studied the specificity of Horn's test algorithm (probability of false detection of outliers), using Monte Carlo computer simulations performed on 13 types of probability distributions covering a wide range of positive and negative skewness. Distributions with 3% of the original observations replaced by random outliers were used to also examine the sensitivity of the test (probability of detection of true outliers). Three data transformations were used: the Box and Cox function (used in the original Horn's test), the Manly exponential function, and the John and Draper modulus function. Results: For many of the probability distributions, the specificity of Horn's algorithm was rather poor compared with the theoretical expectation. The cause for such poor performance was at least partially related to remaining nongaussian kurtosis (peakedness). The sensitivity showed great variation, dependent on both the type of underlying distribution and the location of the outliers (upper and/or lower tail). Conclusion: Although Horn's algorithm undoubtedly is an improvement compared with older methods for outlier detection, reliable statistical identification of outliers in reference data remains a challenge. © 2005 American Association for Clinical Chemistry.
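The two steps described in the abstract (transform toward a gaussian shape, then apply Tukey fences) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation or the RefVal program: it assumes strictly positive, non-constant data, picks the Box and Cox parameter by a coarse grid search over the profile log-likelihood, and places the fences at the conventional 1.5 times the interquartile range. All function names are hypothetical.

```python
import math
import statistics


def box_cox(x, lam):
    """Box and Cox power transform of a strictly positive value x."""
    return math.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam


def profile_loglik(data, lam):
    """Box-Cox profile log-likelihood, up to an additive constant."""
    n = len(data)
    y = [box_cox(x, lam) for x in data]
    var = statistics.pvariance(y)  # must be > 0, so data must not be constant
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(x) for x in data)


def estimate_lambda(data, grid=None):
    """Assumption: a coarse grid search stands in for a proper optimizer."""
    if grid is None:
        grid = [i / 20.0 for i in range(-60, 61)]  # -3.0 to 3.0, step 0.05
    return max(grid, key=lambda lam: profile_loglik(data, lam))


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def horn_style_outliers(data, k=1.5, lam=None):
    """Step (a): transform; step (b): flag values outside the fences."""
    if lam is None:
        lam = estimate_lambda(data)
    y = [box_cox(x, lam) for x in data]
    lower, upper = tukey_fences(y, k)
    return [x for x, t in zip(data, y) if t < lower or t > upper]


# Usage: five log-symmetric values plus one gross high value.
sample = [math.exp(v) for v in (-1.0, -0.5, 0.0, 0.5, 1.0)] + [math.exp(10.0)]
flagged = horn_style_outliers(sample, lam=0.0)  # lam=0 is the log transform
```

Passing `lam` explicitly, as in the usage lines, keeps the example deterministic; leaving it as `None` lets the sketch estimate the parameter from the data, outliers included, which is one place where a real implementation would need more care.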
Pages: 2326-2332
Page count: 7
Related papers
13 in total
[1] Barnett V. Outliers in Statistical Data. 2nd ed. 1984.
[2] Box GEP, Cox DR. An analysis of transformations. Journal of the Royal Statistical Society Series B-Statistical Methodology, 1964, 26(2): 211-252.
[3] Dixon WJ. Processing data for outliers. Biometrics, 1953, 9(1): 74-89.
[4] Hawkins DM. IDENTIFICATION OUTLI, 1980, 11. DOI 10.1007/978-94-015-3994-4
[5] Horn PS. Clin Chem, 2001, 47: 2137.
[6] John JA. Applied Statistics, 1980, 29: 190. DOI 10.2307/2986305
[7] Manly BFJ. Exponential data transformations. Journal of the Royal Statistical Society Series D-The Statistician, 1976, 25(1): 37-42.
[8] Solberg HE. The IFCC recommendation on estimation of reference intervals. The RefVal Program. Clinical Chemistry and Laboratory Medicine, 2004, 42(7): 710-714.
[9] Solberg HE. J Clin Chem Clin Bio, 1987, 25: 645.
[10] Solberg HE. RefVal: A program implementing the recommendations of the International Federation of Clinical Chemistry on the statistical treatment of reference values. Computer Methods and Programs in Biomedicine, 1995, 48(3): 247-256.