Common problems in using, modifying, and reporting on classic measurement instruments

Cited by: 4
Author
Daltroy, LH
Affiliations
[1] Brigham & Womens Hosp, Robert B Brigham Multipurpose Arthrit & Musculosk, Boston, MA 02115 USA
[2] Brigham & Womens Hosp, Dept Rheumatol Immunol, Boston, MA 02115 USA
[3] Harvard Univ, Sch Med, Dept Med, Boston, MA USA
[4] Harvard Univ, Sch Publ Hlth, Dept Hlth & Social Behav, Boston, MA 02115 USA
Keywords
DOI
10.1002/art.1790100612
Chinese Library Classification number
R5 [Internal Medicine];
Subject classification code
1002 ; 100201 ;
Abstract
Users of classic summated scales can clear up some common quandaries by considering the underlying assumptions that a scale's items represent a random sample drawn from an infinite pool of items representing a unidimensional domain. Scale reliability is higher with more items and a higher average correlation among items. Standards of reliability need to be higher when decisions about treatment are to be based on scores, because confidence intervals around individual scores expand quickly below α = 0.95. Scales that are longer than needed may sometimes be shortened, using a formula to determine the approximate number of questions that will yield the desired reliability. Reliability may be improved by addition of new items or expansion of response sets for existing items. Such alterations of scales should always be corroborated with new data. The use of coefficient alpha for scales with severely limited domains, such as self-care knowledge, is rarely useful or appropriate psychometrically, and can be misleading. Standard item selection procedures in classic scales maximize reliability at the scale center and overestimate change at the center and underestimate it at the ends; there are both classic and modern techniques to improve scale construction in this regard. The omission of specific items (such as symptoms) on many scales is often immaterial to the reliability of the scale and to the usefulness of the summary score; this is because each item is well correlated with other scale items, so that information about the underlying domain from any single item is redundant. Finally, despite the attractiveness of disease-specific scales, generic scales often do as well, even in arthritis populations, and have the added benefit of allowing comparisons with a wider range of diseases and cultural groups. 
The development, modification, and use of classic summated scales can be complex in practice, calling on both statistical skills and content area judgment, but at heart the principles are fairly simple. Keeping the principles in mind is half the battle when solving many common problems. The interested reader is referred to the texts referenced in this article and the one by DeVellis (1) for further discourse on this topic.
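The reliability points in the abstract can be illustrated numerically. The sketch below is not taken from the article. It assumes the standardized form of coefficient alpha (item count and average inter-item correlation), assumes the Spearman-Brown prophecy formula is the unnamed "formula" for shortening a scale, and uses the classical standard error of measurement for the confidence interval around an individual score. All function names and example numbers (item counts, correlations, a score SD of 10) are hypothetical.

```python
import math


def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized coefficient alpha for k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def items_needed(current_items: int, current_rel: float, target_rel: float) -> int:
    """Approximate item count for a target reliability (Spearman-Brown, assumed here).

    The lengthening/shortening factor is n = R_t (1 - R_c) / (R_c (1 - R_t)).
    """
    factor = target_rel * (1 - current_rel) / (current_rel * (1 - target_rel))
    return math.ceil(current_items * factor)


def ci_half_width(sd: float, reliability: float, z: float = 1.96) -> float:
    """Half-width of an approximate 95% interval around an individual score.

    Uses the classical standard error of measurement: SEM = SD * sqrt(1 - reliability).
    """
    return z * sd * math.sqrt(1 - reliability)


if __name__ == "__main__":
    # More items and a higher average correlation both raise alpha.
    for k, r in [(5, 0.3), (10, 0.3), (10, 0.5)]:
        print(f"k={k:2d}, mean r={r}: alpha={standardized_alpha(k, r):.3f}")

    # Shortening a hypothetical 20-item scale with reliability 0.90 to a target of 0.80.
    print("items needed:", items_needed(20, 0.90, 0.80))

    # Intervals around an individual score widen as reliability falls (SD = 10 assumed).
    for rel in (0.95, 0.90, 0.80, 0.70):
        print(f"reliability={rel}: +/- {ci_half_width(10, rel):.2f}")
```

As the abstract cautions, a shortened or lengthened scale obtained this way is only an approximation and should be corroborated with new data.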
Pages: 441-447
Page count: 7
Related papers
14 in total
[1]  
Andrich D., 1988, RASCH MODELS MEASURE, DOI 10.4135/9781412985598
[2]  
Daltroy L H, 1992, Arthritis Care Res, V5, P146, DOI 10.1002/art.1790050306
[3]   The North American spine society lumbar spine outcome assessment instrument - Reliability and validity tests [J].
Daltroy, LH ;
CatsBaril, WL ;
Katz, JN ;
Fossel, AH ;
Liang, MH .
SPINE, 1996, 21 (06) :741-748
[4]  
DeVellis R F, 1996, Arthritis Care Res, V9, P239, DOI 10.1002/1529-0131(199606)9:3<239::AID-ANR1790090313>3.0.CO;2-O
[5]  
DeVellis RF, 2017, Scale Development: Theory and Applications, 4th ed.
[6]   MEASUREMENT OF PATIENT OUTCOME IN ARTHRITIS [J].
FRIES, JF ;
SPITZ, P ;
KRAINES, RG ;
HOLMAN, HR .
ARTHRITIS AND RHEUMATISM, 1980, 23 (02) :137-145
[7]  
McDowell I., 1996, MEASURING HLTH GUIDE
[8]   AIMS2 - THE CONTENT AND PROPERTIES OF A REVISED AND EXPANDED ARTHRITIS IMPACT MEASUREMENT SCALES HEALTH-STATUS QUESTIONNAIRE [J].
MEENAN, RF ;
MASON, JH ;
ANDERSON, JJ ;
GUCCIONE, AA ;
KAZIS, LE .
ARTHRITIS AND RHEUMATISM, 1992, 35 (01) :1-10
[9]  
Nunnally J., 1994, PSYCHOMETRIC THEORY