A comparison of two scoring methods for an automated speech scoring system

Cited by: 22

Authors:

Xi, Xiaoming [1]

Higgins, Derrick [1]

Zechner, Klaus [1]

Williamson, David [1]

Affiliation:

[1] Educ Testing Serv, Ctr Valid Res, Res & Dev, Princeton, NJ 08541 USA

Source:

LANGUAGE TESTING | 2012 / Vol. 29 / Issue 03

Keywords:

automated speech scoring; validity; classification trees; TOEFL; practice test; NETWORKS; SCORES;

DOI:

10.1177/0265532211425673

Chinese Library Classification (CLC) number:

H0 [Linguistics];

Discipline classification codes:

030303 ; 0501 ; 050102 ;

Abstract:

This paper compares two alternative scoring methods - multiple regression and classification trees - for an automated speech scoring system used in a practice environment. The two methods were evaluated on two criteria: construct representation and empirical performance in predicting human scores. The empirical performance of the two scoring models is reported in Zechner, Higgins, Xi, & Williamson (2009), which discusses the development of the entire automated speech scoring system; the current paper shifts the focus to the comparison of the two scoring methods, elaborating both technical and substantive considerations and providing a reasoned argument for the trade-off between them. We concluded that a multiple regression model with expert weights was superior to the classification tree model. In addition to comparing the relative performance of the two models, we also evaluated the adequacy of the regression model for the intended use. In particular, the construct representation of the model was sufficiently broad to justify its use in a low-stakes application. The correlation of the model-predicted total test scores with human scores (r = 0.7) was also deemed acceptable for practice purposes.


Pages: 371-394

Page count: 24
