Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies

Cited by: 3103
Authors
Austin, Peter C. [1, 2, 3]
Stuart, Elizabeth A. [4, 5, 6]
Affiliations
[1] Inst Clin Evaluat Sci, Toronto, ON M4N 3M5, Canada
[2] Univ Toronto, Inst Hlth Policy Management & Evaluat, Toronto, ON, Canada
[3] Sunnybrook Res Inst, Schulich Heart Res Program, Toronto, ON, Canada
[4] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Mental Hlth, Baltimore, MD USA
[5] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Biostat, Baltimore, MD USA
[6] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Hlth Policy & Management, Baltimore, MD USA
Funding
Canadian Institutes of Health Research
Keywords
observational study; propensity score; inverse probability of treatment weighting; IPTW; causal inference; MARGINAL STRUCTURAL MODELS; CRITICAL-APPRAISAL; SURVIVAL; ZIDOVUDINE; VARIABLES; THERAPY; REPAIR; BIAS; CARE;
DOI
10.1002/sim.6607
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
The propensity score is defined as a subject's probability of treatment selection, conditional on observed baseline covariates. Weighting subjects by the inverse probability of treatment received creates a synthetic sample in which treatment assignment is independent of measured baseline covariates. Inverse probability of treatment weighting (IPTW) using the propensity score allows one to obtain unbiased estimates of average treatment effects. However, these estimates are only valid if there are no residual systematic differences in observed baseline characteristics between treated and control subjects in the sample weighted by the estimated inverse probability of treatment. We report on a systematic literature review, in which we found that the use of IPTW has increased rapidly in recent years, but that in the most recent year, a majority of studies did not formally examine whether weighting balanced measured covariates between treatment groups. We then proceed to describe a suite of quantitative and qualitative methods that allow one to assess whether measured baseline covariates are balanced between treatment groups in the weighted sample. The quantitative methods use the weighted standardized difference to compare means, prevalences, higher-order moments, and interactions. The qualitative methods employ graphical methods to compare the distribution of continuous baseline covariates between treated and control subjects in the weighted sample. Finally, we illustrate the application of these methods in an empirical case study. We propose a formal set of balance diagnostics that contribute towards an evolving concept of 'best practice' when using IPTW to estimate causal treatment effects using observational data. © 2015 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
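The weighting and balance check described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes a single simulated continuous covariate, uses the true (simulated) propensity score rather than an estimated one, and uses one common form of the weighted variance inside the standardized difference; the abstract itself does not state the formula. All function and variable names are hypothetical.

```python
import numpy as np

def iptw_weights(treated, ps):
    """ATE-type weights: 1/e for treated subjects, 1/(1 - e) for controls."""
    treated = np.asarray(treated, dtype=float)
    ps = np.asarray(ps, dtype=float)
    return treated / ps + (1.0 - treated) / (1.0 - ps)

def weighted_mean_var(x, w):
    """Weighted mean and a weighted variance (assumed form; reduces to the
    usual sample variance with n - 1 in the denominator when all w are 1)."""
    mean = np.sum(w * x) / np.sum(w)
    scale = np.sum(w) / (np.sum(w) ** 2 - np.sum(w ** 2))
    var = scale * np.sum(w * (x - mean) ** 2)
    return mean, var

def weighted_std_diff(x, treated, w):
    """Weighted standardized difference of x between treatment groups."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    is_treated = np.asarray(treated).astype(bool)
    m1, v1 = weighted_mean_var(x[is_treated], w[is_treated])
    m0, v0 = weighted_mean_var(x[~is_treated], w[~is_treated])
    return (m1 - m0) / np.sqrt((v1 + v0) / 2.0)

# Simulated data (assumption): one confounder drives treatment selection.
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
ps = 1.0 / (1.0 + np.exp(-0.8 * x))   # true propensity score
treated = rng.binomial(1, ps)

w = iptw_weights(treated, ps)
raw_diff = weighted_std_diff(x, treated, np.ones(n))  # unweighted sample
iptw_diff = weighted_std_diff(x, treated, w)          # weighted sample

print(f"standardized difference, unweighted: {raw_diff:.3f}")
print(f"standardized difference, IPTW:       {iptw_diff:.3f}")
```

In an applied analysis the propensity score would be estimated from the data, and the same check would be repeated for every baseline covariate, and for squares and interactions where higher-order balance matters.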
Pages: 3661-3679
Page count: 19