Validity of observer-based aggregate scoring systems as descriptors of elbow pain, function, and disability

Cited by: 190
Authors
Turchin, DC
Beaton, DE
Richards, RR
Affiliations
[1] St Michaels Hosp, Upper Extrem Reconstruct Serv, Toronto, ON M5C 1R6, Canada
[2] Univ Toronto, Toronto, ON, Canada
Keywords
DOI
10.2106/00004623-199802000-00002
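Chinese Library Classification (CLC) number
R826.8 [Plastic surgery]; R782.2 [Oral and maxillofacial plastic surgery]; R726.2 [Pediatric plastic surgery]; R62 [Plastic surgery (reconstructive surgery)];
Discipline classification code
Abstract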
Current elbow-scoring systems are based on the observer-derived assessment of a variety of clinical and functional criteria, which are scored separately and then aggregated. The aggregate score then is assigned a categorical ranking that ranges from excellent to poor. The developers of different elbow-scoring systems have chosen different outcome criteria, assigned different weights to each criterion, and accorded different ranges of values to each categorical ranking. Five different elbow-scoring systems (the Mayo elbow-performance index and the systems of Broberg and Morrey, Ewald et al., The Hospital for Special Surgery, and Pritchard) were used to evaluate the same group of patients. The validity of the scoring systems was determined with use of visual-analog scales for the assessment of pain and function; patient and physician-derived ratings of the severity of impairment of the elbow; and two functional questionnaires completed by the patient (the Disabilities of the Arm, Shoulder and Hand questionnaire and the Modified American Shoulder and Elbow Surgeons patient self-evaluation form).
The study sample consisted of sixty-nine patients who had sought treatment at one of two tertiary referral clinics because of problems related to the elbow. Pearson product-moment correlation coefficients were used to compare the raw aggregate scores, and kappa statistics were used to determine the level of agreement among the categorical rankings (excellent, good, fair, and poor). Examination of the five scoring systems revealed a remarkable lack of concordance with regard to the aspects of elbow function that were assessed. Good correlation was observed when the systems were compared on the basis of raw scores (Pearson product-moment correlation coefficients, 0.79 to 0.90), but only slight-to-moderate correlation was noted when the systems were compared on the basis of categorical rankings (quadratic weighted kappa coefficients, 0.18 to 0.49). Validity testing showed the system of Ewald et al. and the Mayo elbow-performance index to be the most discriminating, the system of Pritchard to be the least discriminating, and the system of The Hospital for Special Surgery and the system of Broberg and Morrey to be intermediate. The scores determined with the elbow-scoring systems demonstrated only moderate correlation with the score for function on the visual analog scale (Pearson product-moment correlation coefficients, 0.44 to 0.66), whereas those derived from the functional questionnaires completed by the patient demonstrated moderate-to-good correlation with the score for function (Pearson product-moment correlation coefficients, 0.72 and 0.80). CLINICAL RELEVANCE: We observed a remarkable lack of agreement when five different elbow-scoring systems were used to determine categorical rankings for the same cohort of patients. The correlations between the raw aggregate scores were better. On the basis of these findings, we believe that outcomes should be expressed as raw scores rather than as categorical rankings. We also found that scores derived from patient-completed functional questionnaires correlated more closely with perceived functional loss than did those determined with aggregate elbow-scoring systems. It must be recognized that comparisons between studies that are based on different scoring systems are not valid and that the categorical rankings of different systems are not interchangeable. The outcome of therapies designed for the treatment of the elbow should be determined on the basis of a patient-derived assessment of function, a clinical examination, and an assessment of pain.
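The contrast the abstract reports, good Pearson correlation between raw scores but weaker quadratic weighted kappa between categorical rankings, can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration only: the scores, the cut-off values, and the function names are hypothetical and are not taken from the study or from any of the five scoring systems.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def quadratic_weighted_kappa(r1, r2, k):
    """Cohen's kappa with quadratic disagreement weights.

    r1 and r2 are lists of category indices in 0..k-1.
    """
    n = len(r1)
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        observed[a][b] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2  # 0 on the diagonal, 1 at the extremes
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j] / n
    return 1.0 - weighted_observed / weighted_expected


def categorize(score, cutoffs):
    """Map a raw score to 0=excellent, 1=good, 2=fair, 3=poor.

    cutoffs holds the (hypothetical) lower bounds for excellent, good, fair.
    """
    for index, lower_bound in enumerate(cutoffs):
        if score >= lower_bound:
            return index
    return len(cutoffs)


# Hypothetical raw scores for eight patients under two scoring systems.
# System B is a shifted copy of system A, so the raw scores are perfectly correlated.
scores_a = [95, 88, 82, 76, 70, 64, 58, 50]
scores_b = [s - 10 for s in scores_a]

# Hypothetical category cut-offs; each system uses its own ranges.
cutoffs_a = (90, 75, 60)
cutoffs_b = (90, 80, 70)

ranks_a = [categorize(s, cutoffs_a) for s in scores_a]
ranks_b = [categorize(s, cutoffs_b) for s in scores_b]

r = pearson(scores_a, scores_b)
kappa = quadratic_weighted_kappa(ranks_a, ranks_b, k=4)
print(f"Pearson r on raw scores: {r:.2f}")
print(f"Quadratic weighted kappa on rankings: {kappa:.2f}")
```

In this toy data the raw scores move together exactly, yet the two sets of cut-offs place most patients in different categories, so agreement on the rankings is much lower than the correlation of the raw scores. This is only one possible mechanism for the discrepancy described in the abstract.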
Pages: 154-162
Page count: 9