Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples

Cited by: 4476

Authors:

Austin, Peter C. [1, 2, 3]

Affiliations:

[1] Inst Clin Evaluat Sci, Toronto, ON M4N 3M5, Canada

[2] Univ Toronto, Dalla Lana Sch Publ Hlth, Toronto, ON, Canada

[3] Univ Toronto, Dept Hlth Policy Management & Evaluat, Toronto, ON M5S 1A1, Canada

Source:

STATISTICS IN MEDICINE | 2009 / Vol. 28 / Issue 25

Funding:

Canadian Institutes of Health Research;

Keywords:

balance; goodness-of-fit; observational study; propensity score; matching; propensity-score matching; standardized difference; bias; ACUTE MYOCARDIAL-INFARCTION; HEART-FAILURE; ODDS RATIO; MODELS; PRINCIPLES; REGRESSION;

DOI:

10.1002/sim.3697

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Subject classification codes:

07 ; 0710 ; 09 ;

Abstract:

The propensity score is a subject's probability of treatment, conditional on observed baseline covariates. Conditional on the true propensity score, treated and untreated subjects have similar distributions of observed baseline covariates. Propensity-score matching is a popular method of using the propensity score in the medical literature. Using this approach, matched sets of treated and untreated subjects with similar values of the propensity score are formed. Inferences about treatment effect made using propensity-score matching are valid only if, in the matched sample, treated and untreated subjects have similar distributions of measured baseline covariates. In this paper we discuss the following methods for assessing whether the propensity score model has been correctly specified: comparing means and prevalences of baseline characteristics using standardized differences; ratios comparing the variance of continuous covariates between treated and untreated subjects; comparison of higher order moments and interactions; five-number summaries; and graphical methods such as quantile-quantile plots, side-by-side boxplots, and non-parametric density plots for comparing the distribution of baseline covariates between treatment groups. We describe methods to determine the sampling distribution of the standardized difference when the true standardized difference is equal to zero, thereby allowing one to determine the range of standardized differences that are plausible when the propensity score model has been correctly specified. We highlight the limitations of some previously used methods for assessing the adequacy of the specification of the propensity-score model. In particular, methods based on comparing the distribution of the estimated propensity score between treated and untreated subjects are uninformative. Copyright (C) 2009 John Wiley & Sons, Ltd.
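The abstract names standardized differences and variance ratios as balance diagnostics but does not state the formulas. The following is a minimal Python sketch that assumes the commonly used definitions: the difference in means (or prevalences) divided by the square root of the average of the two group variances, and the ratio of sample variances. The function names and the example values are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, variance


def standardized_difference(treated, untreated):
    """Standardized difference in means for a continuous covariate.

    Assumed definition: (mean_t - mean_u) / sqrt((s_t^2 + s_u^2) / 2),
    where s^2 is the sample variance within each group.
    """
    pooled_sd = math.sqrt((variance(treated) + variance(untreated)) / 2)
    return (mean(treated) - mean(untreated)) / pooled_sd


def standardized_difference_binary(p_treated, p_untreated):
    """Standardized difference for a binary covariate, given the
    prevalence (proportion with the characteristic) in each group.

    Assumed definition: (p_t - p_u) / sqrt((p_t(1-p_t) + p_u(1-p_u)) / 2).
    """
    pooled_sd = math.sqrt(
        (p_treated * (1 - p_treated) + p_untreated * (1 - p_untreated)) / 2
    )
    return (p_treated - p_untreated) / pooled_sd


def variance_ratio(treated, untreated):
    """Ratio of sample variances of a continuous covariate
    (treated over untreated); values near 1 suggest similar spread."""
    return variance(treated) / variance(untreated)


if __name__ == "__main__":
    # Hypothetical ages in a small matched sample (illustration only).
    age_treated = [60.0, 62.0, 64.0, 66.0, 68.0]
    age_untreated = [59.0, 61.0, 63.0, 65.0, 67.0]
    print(standardized_difference(age_treated, age_untreated))
    print(variance_ratio(age_treated, age_untreated))
    # Hypothetical prevalence of a binary characteristic in each group.
    print(standardized_difference_binary(0.6, 0.4))
```

Because the standardized difference is expressed in units of the pooled standard deviation, it can be compared across covariates measured on different scales, which is one reason it suits a per-covariate balance table for a matched sample.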

Citation

Pages: 3083-3107

Page count: 25
