Analyzing bivariate continuous data grouped into categories defined by empirical quantiles of marginal distributions

Cited by: 16
Authors
Borkowf, CB
Gail, MH
Carroll, RJ
Gill, RD
Affiliations
[1] NCI, DIV CANC EPIDEMIOL & GENET, BIOSTAT BRANCH, BETHESDA, MD 20892 USA
[2] TEXAS A&M UNIV, DEPT STAT, COLLEGE STN, TX 77843 USA
[3] UNIV UTRECHT, DEPT MATH, NL-3508 TA UTRECHT, NETHERLANDS
Keywords
agreement; contingency table; empirical bivariate quantile-partitioned distribution; kappa statistic; quantile
DOI
10.2307/2533563
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Epidemiologists sometimes study the association between two measurements of exposure on the same subjects by grouping the original bivariate continuous data into categories that are defined by the empirical quantiles of the two marginal distributions. Although such grouped data are presented in a two-way contingency table, the cell counts in this table do not have a multinomial distribution. We describe the joint distribution of counts in such a table by the term empirical bivariate quantile-partitioned (EBQP) distribution. Blomqvist (1950, Annals of Mathematical Statistics 21, 593-600) gave an asymptotic EBQP theory for bivariate data partitioned by the sample medians. We demonstrate that his asymptotic theory is not correct, however, except in special cases. We present a general asymptotic theory for tables of arbitrary dimensions and apply this theory to construct confidence intervals for the kappa statistic. We show by simulations that the confidence interval procedures we propose have near nominal coverage for sample sizes exceeding 60 for both 2 x 2 and 3 x 3 tables. These simulations also illustrate that the asymptotic theory of Blomqvist (1950) and the methods that Fleiss, Cohen, and Everitt (1969, Psychological Bulletin 72, 323-327) give for multinomial tables can yield subnominal coverage for kappa calculated from EBQP tables, although in some cases the coverage for these procedures is near nominal levels.
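The grouping step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code (the reference list suggests they worked in GAUSS), and it does not implement their asymptotic theory or confidence intervals. The function names `quantile_table` and `cohen_kappa`, the use of NumPy's default quantile interpolation, and the simulated data are all assumptions made for illustration. The sketch only shows how paired continuous measurements are cross-classified by their own empirical quantiles and how the unweighted kappa point estimate is then computed from the resulting table.

```python
import numpy as np


def quantile_table(x, y, k):
    """Cross-classify paired measurements into a k x k table, cutting each
    variable at its own empirical quantiles (1/k, 2/k, ..., (k-1)/k)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    probs = np.arange(1, k) / k
    # Cut points come from the sample itself, not from known population values.
    x_cuts = np.quantile(x, probs)
    y_cuts = np.quantile(y, probs)
    # Category index in 0..k-1; a value equal to a cut point falls in the lower category.
    rows = np.searchsorted(x_cuts, x, side="left")
    cols = np.searchsorted(y_cuts, y, side="left")
    table = np.zeros((k, k), dtype=int)
    np.add.at(table, (rows, cols), 1)  # accumulate counts cell by cell
    return table


def cohen_kappa(table):
    """Unweighted kappa point estimate from a square table of counts."""
    p = table / table.sum()
    p_obs = np.trace(p)                     # observed agreement (diagonal mass)
    p_exp = p.sum(axis=1) @ p.sum(axis=0)   # agreement expected from the margins
    return (p_obs - p_exp) / (1.0 - p_exp)


if __name__ == "__main__":
    # Hypothetical data: two correlated exposure measurements on 90 subjects.
    rng = np.random.default_rng(0)
    x = rng.normal(size=90)
    y = 0.7 * x + rng.normal(scale=0.7, size=90)
    table = quantile_table(x, y, 3)
    print(table)
    print(table.sum(axis=1), table.sum(axis=0))  # margins are set by the cut points
    print(cohen_kappa(table))
```

Because the cut points are computed from the same sample, the row and column totals are determined in advance by the sample size and the number of categories rather than being random. This is one way to see why the cell counts are not multinomial, and why standard errors derived for multinomial tables need not apply to kappa computed this way.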
Pages: 1054-1069
Number of pages: 16
Related papers
22 in total
[1] Agresti A., 1990, CATEGORICAL DATA ANA
[2] *APT SYST INC, 1992, GAUSS SYST VERS 3 0
[3] Bishop Y.M., 2007, DISCRETE MULTIVARIAT
[4] ON A MEASURE OF DEPENDENCE BETWEEN 2 RANDOM VARIABLES [J]. BLOMQVIST, N. ANNALS OF MATHEMATICAL STATISTICS, 1950, 21 (04): 593-600
[5] BORKOWF CB, 1997, THESIS CORNELL U ITH
[6] A COEFFICIENT OF AGREEMENT FOR NOMINAL SCALES [J]. COHEN, J. EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, 1960, 20 (01): 37-46
[7] Cox D. R., 1970, The analysis of binary data
[8] CSORGO M, 1983, CBMS REG C SER APPL, V42
[9] Efron B., 1986, Statistical science, V1, P54, DOI 10.1214/ss/1177013815
[10] LARGE SAMPLE STANDARD ERRORS OF KAPPA AND WEIGHTED KAPPA [J]. FLEISS, JL; COHEN, J; EVERITT, BS. PSYCHOLOGICAL BULLETIN, 1969, 72 (05): 323-327