Inter-rater and test-retest reliability of quality assessments by novice student raters using the Jadad and Newcastle-Ottawa Scales

Cited by: 157
Authors
Oremus, Mark [1, 2]
Oremus, Carolina [3, 4]
Hall, Geoffrey B. C. [3, 4]
McKinnon, Margaret C. [3, 4]
Affiliations
[1] McMaster Univ, McMaster Evidence Based Practice Ctr, Hamilton, ON, Canada
[2] McMaster Univ, Dept Clin Epidemiol & Biostat, Hamilton, ON, Canada
[3] McMaster Integrat Neurosci Discovery & Study MIND, Hamilton, ON, Canada
[4] Dept Psychiat & Behav Neurosci, Hamilton, ON, Canada
Source
BMJ OPEN | 2012 / Vol. 2 / Issue 04
Keywords
RANDOMIZED CONTROLLED-TRIALS; TOOLS; RISK; BIAS;
DOI
10.1136/bmjopen-2012-001368
Chinese Library Classification (CLC) number
R5 [Internal Medicine];
Discipline classification code
1002; 100201;
Abstract
Introduction: Quality assessment of included studies is an important component of systematic reviews.
Objective: The authors investigated inter-rater and test-retest reliability for quality assessments conducted by inexperienced student raters.
Design: Student raters received a training session on quality assessment using the Jadad Scale for randomised controlled trials and the Newcastle-Ottawa Scale (NOS) for observational studies. Raters were randomly assigned into five pairs and they each independently rated the quality of 13-20 articles. These articles were drawn from a pool of 78 papers examining cognitive impairment following electroconvulsive therapy to treat major depressive disorder. The articles were randomly distributed to the raters. Two months later, each rater re-assessed the quality of half of their assigned articles.
Setting: McMaster Integrative Neuroscience Discovery and Study Program.
Participants: 10 students taking McMaster Integrative Neuroscience Discovery and Study Program courses.
Main outcome measures: The authors measured inter-rater reliability using kappa and the intraclass correlation coefficient type 2,1 or ICC(2,1). The authors measured test-retest reliability using ICC(2,1).
Results: Inter-rater reliability varied by scale question. For the six-item Jadad Scale, question-specific kappas ranged from 0.13 (95% CI -0.11 to 0.37) to 0.56 (95% CI 0.29 to 0.83). The ranges were -0.14 (95% CI -0.28 to 0.00) to 0.39 (95% CI -0.02 to 0.81) for the NOS cohort and -0.20 (95% CI -0.49 to 0.09) to 1.00 (95% CI 1.00 to 1.00) for the NOS case-control. For overall scores on the six-item Jadad Scale, ICC(2,1)s for inter-rater and test-retest reliability (accounting for systematic differences between raters) were 0.32 (95% CI 0.08 to 0.52) and 0.55 (95% CI 0.41 to 0.67), respectively. Corresponding ICC(2,1)s for the NOS cohort were -0.19 (95% CI -0.67 to 0.35) and 0.62 (95% CI 0.25 to 0.83), and for the NOS case-control, the ICC(2,1)s were 0.46 (95% CI -0.13 to 0.92) and 0.83 (95% CI 0.48 to 0.95).
Conclusions: Inter-rater reliability was generally poor to fair and test-retest reliability was fair to excellent. A pilot rating phase following rater training may be one way to improve agreement.
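Illustrative note: the two outcome measures named in the abstract can be sketched in a few lines of Python. This is a minimal sketch, not the authors' analysis. It assumes unweighted Cohen's kappa for two raters and the Shrout and Fleiss two-way random-effects, absolute-agreement, single-rater form of ICC(2,1); the abstract does not state which kappa variant or software was used. The function names `cohen_kappa` and `icc_2_1` and all ratings below are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    p_exp = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    # Undefined (division by zero) when p_exp == 1, i.e. both raters
    # used a single identical category for every item.
    return (p_obs - p_exp) / (1 - p_exp)


def icc_2_1(scores):
    """ICC(2,1) from a table with one row per article, one column per rater."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares: articles (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    # The rater mean square in the denominator penalises systematic
    # differences between raters (absolute agreement).
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical yes/no answers (1 = yes, 0 = no) to one scale question.
question_a = [1, 1, 0, 0, 1, 0, 1, 1]
question_b = [1, 0, 0, 0, 1, 1, 1, 1]
print("kappa:", round(cohen_kappa(question_a, question_b), 2))

# Hypothetical overall scale scores: rows are articles, columns are two raters.
overall = [[3, 2], [5, 4], [1, 2], [4, 4], [2, 1], [6, 4]]
print("ICC(2,1):", round(icc_2_1(overall), 2))
```

Kappa is computed per scale question on categorical answers, while ICC(2,1) is computed on overall scores; negative values of either statistic, as reported for some NOS items above, indicate agreement below what chance alone would produce.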
Pages: 6