The limits of agreement and the intraclass correlation coefficient may be inconsistent in the interpretation of agreement

Cited by: 83
Authors
Costa-Santos, Cristina [1 ,2 ]
Bernardes, Joao [3 ,4 ,5 ]
Ayres-de-Campos, Diogo [3 ,4 ,5 ]
Costa, Antonia [3 ,4 ,5 ]
Costa, Celia [3 ,4 ]
Affiliations
[1] Univ Porto, Fac Med, Dept Biostat & Med Informat, P-4200319 Oporto, Portugal
[2] Univ Porto, Fac Med, Ctr Res Hlth Technol & Informat Syst, P-4200319 Oporto, Portugal
[3] Univ Porto, Fac Med, Dept Obstet & Gynaecol, P-4200319 Oporto, Portugal
[4] Inst Biomed Engn, Oporto, Portugal
[5] Hosp Sao Joao, Dept Obstet & Gynaecol, Oporto, Portugal
Keywords
Agreement; Reproducibility of results; Observer variation; Apgar score; Umbilical artery blood pH; Statistical data interpretation; OBSERVER AGREEMENT; RELIABILITY
DOI
10.1016/j.jclinepi.2009.11.010
Chinese Library Classification (CLC) number
R19 [Health organization and administration (health service management)]
Discipline classification code
100404 [Child, adolescent, and maternal and child health care]
Abstract
Objective: To compare the interpretation of agreement in the prediction of neonatal outcome variables, using the limits of agreement (LA) and the intraclass correlation coefficient (ICC). Study Design and Setting: Three obstetricians were asked to predict neonatal outcomes independently, based on the evaluation of intrapartum cardiotocographic tracings. Interobserver agreement was assessed with the LA and the ICC, and the results obtained were interpreted by six clinicians and six statisticians on a scale that classified agreement as very poor, poor, fair, good, or very good. Results: Interpretation of the LA results was less consensual than that of the ICC results, with proportions of agreement of 0.36 (95% confidence interval [CI]: 0.28-0.44) vs. 0.63 (95% CI: 0.54-0.73), respectively. The LA results suggested fair to good agreement among obstetricians, whereas interpretation of the ICC results suggested poor to fair agreement. The LA results were also more consistent with reality, suggesting that obstetricians predicted neonatal outcomes better than randomly generated values, whereas this was not always the case with the ICC. Conclusions: The LA and the ICC can provide inconsistent results in agreement studies. Accordingly, in the absence of better strategies to assess agreement, both should be used for this purpose, but their results need to be interpreted with caution, keeping their respective limitations in mind. (C) 2011 Elsevier Inc. All rights reserved.
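The two statistics compared in the abstract can be computed directly. The following is a minimal sketch, not taken from the paper: `limits_of_agreement` implements the standard Bland-Altman 95% limits (mean difference ± 1.96 × SD of the differences) for two raters, and `icc_oneway` implements the one-way random-effects ICC(1,1) from the between- and within-subject mean squares. The function names and the toy data are illustrative assumptions, not the authors' code.

```python
import numpy as np

def limits_of_agreement(a, b):
    """Bland-Altman 95% limits of agreement for paired scores
    from two raters (illustrative sketch, not the paper's code)."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bias = d.mean()                    # mean difference between raters
    half_width = 1.96 * d.std(ddof=1)  # 1.96 x sample SD of differences
    return bias - half_width, bias + half_width

def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) for a subjects x raters matrix."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    row_means = x.mean(axis=1)
    grand_mean = x.mean()
    # Between-subjects and within-subjects mean squares
    msb = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)
    msw = ((x - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical example: two raters scoring four subjects
rater1 = [7.20, 7.05, 7.31, 6.98]
rater2 = [7.18, 7.10, 7.29, 7.02]
print(limits_of_agreement(rater1, rater2))
print(icc_oneway(np.column_stack([rater1, rater2])))
```

The key point of the paper is visible even in such a sketch: the LA describe the expected range of disagreement on the original measurement scale, while the ICC is a unitless ratio of variance components, so the two can lead readers to different qualitative verdicts about the same data.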
Pages: 264-269
Page count: 6