Undesired variance due to examiner stringency/leniency effect in communication skill scores assessed in OSCEs

Cited by: 86
Authors
Harasym, Peter H. [2 ]
Woloschuk, Wayne [1 ]
Cunning, Leslie [3 ]
Affiliations
[1] Univ Calgary, Fac Med, Calgary, AB T2N 4N1, Canada
[2] Univ Calgary, Dept Community Hlth Sci, Calgary, AB T2N 4N1, Canada
[3] Univ Calgary, Dept Family Med, Calgary, AB T2N 4N1, Canada
Keywords
Communication skills; Error variance; Item response theory; Leniency; OSCE; Rasch; Stringency; True score
DOI
10.1007/s10459-007-9068-0
Chinese Library Classification
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Physician-patient communication is a clinical skill that can be learned and has a positive impact on patient satisfaction and health outcomes. A concerted effort at all medical schools is now directed at teaching and evaluating this core skill. Student communication skills are often assessed by an Objective Structured Clinical Examination (OSCE). However, it is unknown what sources of error variance are introduced into examinee communication scores by various OSCE components. This study primarily examined the effect different examiners had on the evaluation of students' communication skills assessed at the end of a family medicine clerkship rotation. The communication performance of clinical clerks from the Classes of 2005 and 2006 was assessed using six OSCE stations. Performance was rated at each station using the 28-item Calgary-Cambridge guide. Item Response Theory analysis with a Multifaceted Rasch model was used to partition the various sources of error variance and generate a "true" communication score from which the effects of examiner, case, and items are removed. Variance and reliability of scores were as follows: communication scores (.20 and .87), examiner stringency/leniency (.86 and .91), case (.03 and .96), and item (.86 and .99), respectively. All facet scores were reliable (.87-.99). Examiner variance (.86) was more than four times the examinee variance (.20). About 11% of the clerks' outcome status shifted when "true" rather than observed/raw scores were used. There was large variability in examinee scores due to variation in examiner stringency/leniency behaviors, which may affect pass/fail decisions. Exploring the benefits of examiner training and employing "true" scores generated using Item Response Theory analyses prior to making pass/fail decisions are recommended.
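To illustrate why examiner stringency matters in a Rasch framework, the following is a minimal sketch, assuming a simplified dichotomous many-facet Rasch form in which the log-odds of credit equal examinee ability minus examiner severity, case difficulty, and item difficulty. It is not the authors' model specification; the function name and the logit values are hypothetical and chosen only for illustration. The variance figures (.86 and .20) are the ones reported in the abstract.

```python
import math


def rasch_probability(ability, examiner_severity, case_difficulty, item_difficulty):
    """Probability of credit on one dichotomous item (all inputs in logits).

    Simplified many-facet Rasch form (an assumption for illustration):
    logit = ability - examiner severity - case difficulty - item difficulty.
    """
    logit = ability - examiner_severity - case_difficulty - item_difficulty
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical examinee rated on the same case and item by two examiners.
ability = 1.0
lenient = rasch_probability(ability, examiner_severity=-1.0,
                            case_difficulty=0.0, item_difficulty=0.0)
severe = rasch_probability(ability, examiner_severity=1.0,
                           case_difficulty=0.0, item_difficulty=0.0)

# Ratio of examiner variance to examinee variance, using the abstract's figures.
variance_ratio = 0.86 / 0.20

print(f"lenient examiner: {lenient:.3f}")
print(f"severe examiner:  {severe:.3f}")
print(f"examiner/examinee variance ratio: {variance_ratio:.1f}")
```

Under these assumed values, the same examinee has a noticeably lower chance of credit with the severe examiner. A "true" score in this framework corresponds to the ability estimate after the examiner, case, and item terms have been accounted for.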
Pages: 617-632
Page count: 16