Performance of SIBTEST when the percentage of DIF items is large

Cited by: 25
Authors
Gierl, MJ
Gotzmann, A
Boughton, KA
Affiliations
[1] Univ Alberta, Dept Educ Psychol, Edmonton, AB T6G 2M7, Canada
[2] CTB McGraw Hill, Monterey, CA USA
[3] Educ Testing Serv, Princeton, NJ 08541 USA
DOI
10.1207/s15324818ame1703_2
Chinese Library Classification
G40 [Education]
Subject classification codes
040101; 120403
Abstract
Differential item functioning (DIF) analyses are used to identify items that operate differently between two groups after controlling for ability. The Simultaneous Item Bias Test (SIBTEST) is a popular DIF detection method that matches examinees on a true score estimate of ability. However, in some testing situations, such as test translation and adaptation, the percentage of DIF items can be large, and the effectiveness of SIBTEST in these situations has not been thoroughly evaluated. This study addresses that problem. Four variables were manipulated in a simulation study: the amount of DIF on a 40-item test (20%, 40%, and 60% of the items on the test had moderate and large DIF), the direction of DIF (balanced and unbalanced DIF items), sample size (500, 1,000, 1,500, and 2,000 examinees in each group), and ability distribution differences between groups (equal and unequal). Each condition was replicated 100 times to facilitate the computation of the DIF detection rates. The results indicated that SIBTEST yielded adequate DIF detection rates, even when 60% of the items contained DIF, provided that DIF was balanced between the reference and focal groups and sample sizes were at least 1,000 examinees per group. SIBTEST also had adequate detection rates in the 20% unbalanced DIF conditions with samples of 1,000 examinees per group. However, SIBTEST had poor detection rates across all 40% and 60% unbalanced DIF conditions. Implications for practice and future directions for research are discussed.
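The simulation design described in the abstract can be sketched as a grid of conditions. The sketch below is illustrative only: it assumes the four manipulated variables were fully crossed, which the abstract does not state explicitly, and every name in it (`design_conditions`, `detection_rate`, the dictionary keys) is hypothetical rather than taken from the study. It does not implement SIBTEST itself.

```python
from itertools import product

# Factor levels as listed in the abstract.
DIF_PERCENTAGES = (20, 40, 60)                # percent of items with DIF
DIF_DIRECTIONS = ("balanced", "unbalanced")
SAMPLE_SIZES = (500, 1000, 1500, 2000)        # examinees per group
ABILITY_DISTRIBUTIONS = ("equal", "unequal")

N_ITEMS = 40          # test length
N_REPLICATIONS = 100  # replications per condition


def design_conditions():
    """Enumerate conditions, ASSUMING a fully crossed design."""
    return [
        {
            "dif_percent": pct,
            "direction": direction,
            "n_per_group": n,
            "ability": ability,
            # Number of DIF items implied by the percentage on a 40-item test.
            "n_dif_items": N_ITEMS * pct // 100,
        }
        for pct, direction, n, ability in product(
            DIF_PERCENTAGES, DIF_DIRECTIONS, SAMPLE_SIZES, ABILITY_DISTRIBUTIONS
        )
    ]


def detection_rate(flags):
    """Proportion of replications in which an item was flagged as DIF.

    `flags` holds one boolean per replication (hypothetical input format).
    """
    return sum(flags) / len(flags)


conditions = design_conditions()
print(len(conditions))                     # number of crossed conditions
print(len(conditions) * N_REPLICATIONS)    # total simulated data sets
```

Under the fully crossed assumption, 3 x 2 x 4 x 2 levels give 48 conditions, and the 20%, 40%, and 60% levels correspond to 8, 16, and 24 DIF items on the 40-item test.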
Pages: 241-264
Page count: 24