How close is close enough? Evaluating propensity score matching using data from a class size reduction experiment

Cited by: 61
Authors
Wilde, Elizabeth Ty [1]
Hollister, Robinson [2]
Affiliations
[1] Princeton Univ, Dept Econ, Princeton, NJ 08544 USA
[2] Swarthmore Coll, Swarthmore, PA 19081 USA
DOI
10.1002/pam.20262
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
In recent years, propensity score matching (PSM) has gained attention as a potential method for estimating the impact of public policy programs in the absence of experimental evaluations. In this study, we evaluate the usefulness of PSM for estimating the impact of a program change in an educational context (Tennessee Student Teacher Achievement Ratio Project [Project STAR]). Because the Tennessee Project STAR experiment involved an effective random assignment procedure, the experimental results from this policy intervention can be used as a benchmark, to which we compare the impact estimates produced using propensity score matching methods. We use several different methods to assess these nonexperimental estimates of the impact of the program. We try to determine "how close is close enough," putting greatest emphasis on the question: Would the nonexperimental estimate have led to the wrong decision when compared to the experimental estimate of the program? We find that propensity score methods perform poorly with respect to measuring the impact of a reduction in class size on achievement test scores. We conclude that further research is needed before policymakers rely on PSM as an evaluation tool. © 2007 by the Association for Public Policy Analysis and Management.
Pages: 455-477
Page count: 23
Related papers
22 in total
[1]   Implementing matching estimators for average treatment effects in Stata [J].
Abadie, Alberto ;
Drukker, David ;
Herr, Jane Leber ;
Imbens, Guido W. .
STATA JOURNAL, 2004, 4 (03) :290-311
[2]   Are experiments the only option? A look at dropout prevention programs [J].
Agodini, R ;
Dynarski, M .
REVIEW OF ECONOMICS AND STATISTICS, 2004, 86 (01) :180-194
[3]  
[Anonymous], NBER TECHNICAL WORKI
[4]   Causal effects in nonexperimental studies: Reevaluating the evaluation of training programs [J].
Dehejia, RH ;
Wahba, S .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1999, 94 (448) :1053-1062
[5]   Propensity score-matching methods for nonexperimental causal studies [J].
Dehejia, RH ;
Wahba, S .
REVIEW OF ECONOMICS AND STATISTICS, 2002, 84 (01) :151-161
[6]   An assessment of propensity score matching as a nonexperimental impact estimator - Evidence from Mexico's PROGRESA program [J].
Diaz, JJ ;
Handa, S .
JOURNAL OF HUMAN RESOURCES, 2006, 41 (02) :319-345
[7]   THE ADEQUACY OF COMPARISON GROUP DESIGNS FOR EVALUATIONS OF EMPLOYMENT-RELATED PROGRAMS [J].
FRAKER, T ;
MAYNARD, R .
JOURNAL OF HUMAN RESOURCES, 1987, 22 (02) :194-227
[8]  
FRIEDLANDER D, 1995, AM ECON REV, V85, P923
[9]   Nonexperimental versus experimental estimates of earnings impacts [J].
Glazerman, S ;
Levy, DM ;
Myers, D .
ANNALS OF THE AMERICAN ACADEMY OF POLITICAL AND SOCIAL SCIENCE, 2003, 589 :63-93
[10]   Characterizing selection bias using experimental data [J].
Heckman, J ;
Ichimura, H ;
Smith, J ;
Todd, P .
ECONOMETRICA, 1998, 66 (05) :1017-1098