Matrix Completion Methods for Causal Panel Data Models

Citations: 161
Authors
Athey, Susan [1, 2]
Bayati, Mohsen [3]
Doudchenko, Nikolay [3]
Imbens, Guido [1, 2, 4]
Khosravi, Khashayar [5]
Affiliations
[1] Stanford Univ, SIEPR, Grad Sch Business, Stanford, CA 94305 USA
[2] NBER, Stanford, CA 94305 USA
[3] Stanford Univ, Grad Sch Business, Stanford, CA 94305 USA
[4] Stanford Univ, Dept Econ, SIEPR, Stanford, CA 94305 USA
[5] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
Keywords
Causality; Interactive fixed effects; Low-rank matrix estimation; Synthetic controls; Unconfoundedness; LOW-RANK MATRICES; REGRESSION; INFERENCE; NUMBER; RATES;
DOI
10.1080/01621459.2021.1891924
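Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208; 070103; 0714;
Abstract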
In this article, we study methods for estimating causal effects in settings with panel data, where some units are exposed to a treatment during some periods and the goal is estimating counterfactual (untreated) outcomes for the treated unit/period combinations. We propose a class of matrix completion estimators that uses the observed elements of the matrix of control outcomes corresponding to untreated unit/periods to impute the "missing" elements of the control outcome matrix, corresponding to treated units/periods. This leads to a matrix that well-approximates the original (incomplete) matrix, but has lower complexity according to the nuclear norm for matrices. We generalize results from the matrix completion literature by allowing the patterns of missing data to have a time series dependency structure that is common in social science applications. We present novel insights concerning the connections between the matrix completion literature, the literature on interactive fixed effects models and the literatures on program evaluation under unconfoundedness and synthetic control methods. We show that all these estimators can be viewed as focusing on the same objective function. They differ solely in the way they deal with identification, in some cases solely through regularization (our proposed nuclear norm matrix completion estimator) and in other cases primarily through imposing hard restrictions (the unconfoundedness and synthetic control approaches). The proposed method outperforms unconfoundedness-based or synthetic control estimators in simulations based on real data.
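The nuclear-norm imputation idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic iterative singular-value soft-thresholding scheme (in the style of soft-impute), it omits refinements such as unit and time fixed effects, and the names `soft_impute`, `observed`, and `tau` are hypothetical. In practice the threshold `tau` would need to be tuned, for example by some form of cross-validation.

```python
import numpy as np

def soft_impute(Y, observed, tau, n_iter=500, tol=1e-8):
    """Sketch of nuclear-norm-regularized matrix completion.

    Y        : (N, T) array of outcomes; cells where `observed` is False are ignored
    observed : (N, T) boolean mask, True for untreated (control) unit/period cells
    tau      : soft-threshold applied to the singular values (hypothetical tuning knob)
    """
    Y_obs = np.where(observed, Y, 0.0)
    L = np.zeros_like(Y_obs, dtype=float)
    for _ in range(n_iter):
        # Keep observed outcomes, fill the "missing" cells with the current estimate.
        filled = np.where(observed, Y_obs, L)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Shrink singular values toward zero; this is what lowers the nuclear norm.
        L_new = (U * np.maximum(s - tau, 0.0)) @ Vt
        converged = np.linalg.norm(L_new - L) <= tol * max(1.0, np.linalg.norm(L))
        L = L_new
        if converged:
            break
    return L

# Illustrative synthetic panel (assumed for this sketch, not data from the article):
# a noisy rank-2 outcome matrix where the last 5 units are treated in the last 5 periods.
rng = np.random.default_rng(0)
N, T, rank = 30, 20, 2
Y = rng.normal(size=(N, rank)) @ rng.normal(size=(rank, T))
Y = Y + 0.1 * rng.normal(size=(N, T))

observed = np.ones((N, T), dtype=bool)
observed[-5:, -5:] = False          # treated unit/period cells count as missing

L_hat = soft_impute(Y, observed, tau=0.5)
imputed_block = L_hat[-5:, -5:]     # imputed untreated outcomes for treated cells
```

The block-shaped mask mimics a setting where a group of units adopts the treatment late in the panel; other missingness patterns only require changing `observed`.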
Pages: 1716-1730
Number of pages: 15