Estimation and Inference of Heterogeneous Treatment Effects using Random Forests

Cited by: 800
Authors
Wager, Stefan [1]
Athey, Susan [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
Keywords
Adaptive nearest neighbors matching; Asymptotic normality; Potential outcomes; Unconfoundedness; Propensity score estimation; Valid post-selection; Subgroup analysis; Confidence intervals; Nonparametric tests; Regression; Jackknife; Support
DOI
10.1080/01621459.2017.1319839
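Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract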
Many scientific and engineering challenges, ranging from personalized medicine to customized marketing recommendations, require an understanding of treatment effect heterogeneity. In this article, we develop a nonparametric causal forest for estimating heterogeneous treatment effects that extends Breiman's widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates.
Pages: 1228-1242
Page count: 15