Imputation of missing values of tumour stage in population-based cancer registration

Cited by: 84
Authors
Eisemann, Nora [1 ]
Waldmann, Annika [2 ]
Katalinic, Alexander [1 ,2 ]
Affiliations
[1] Univ Lubeck, Inst Canc Epidemiol, D-23562 Lubeck, Germany
[2] Univ Hosp Schleswig Holstein, Inst Clin Epidemiol, D-23562 Lubeck, Germany
Keywords
BREAST-CANCER; MULTIPLE IMPUTATION; SURVIVAL; RATES;
DOI
10.1186/1471-2288-11-129
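Chinese Library Classification number
R19 [Health care organization and services (health services administration)];
Discipline classification code
100404 [Child and adolescent health and maternal and child health care];
Abstract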
Background: Missing data on tumour stage information is a common problem in population-based cancer registries. Statistical analyses on the level of tumour stage may be biased, if no adequate method for handling of missing data is applied. In order to determine a useful way to treat missing data on tumour stage, we examined different imputation models for multiple imputation with chained equations for analysing the stage-specific numbers of cases of malignant melanoma and female breast cancer.
Methods: This analysis was based on the malignant melanoma data set and the female breast cancer data set of the cancer registry Schleswig-Holstein, Germany. The cases with complete tumour stage information were extracted and their stage information partly removed according to a MAR missingness-pattern, resulting in five simulated data sets for each cancer entity. The missing tumour stage values were then treated with multiple imputation with chained equations, using polytomous regression, predictive mean matching, random forests and proportional sampling as imputation models. The estimated tumour stages, stage-specific numbers of cases and survival curves after multiple imputation were compared to the observed ones.
Results: The amount of missing values for malignant melanoma was too high to estimate a reasonable number of cases for each UICC stage. However, multiple imputation of missing stage values led to stage-specific numbers of cases of T-stage for malignant melanoma as well as T- and UICC-stage for breast cancer close to the observed numbers of cases. The observed tumour stages on the individual level, the stage-specific numbers of cases and the observed survival curves were best met with polytomous regression or predictive mean matching but not with random forest or proportional sampling as imputation models.
Conclusions: This limited simulation study indicates that multiple imputation with chained equations is an appropriate technique for dealing with missing information on tumour stage in population-based cancer registries, if the amount of unstaged cases is on a reasonable level.
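The simulation design summarised in the abstract (remove known stage values under a MAR mechanism, impute them several times, compare the stage-specific numbers of cases with the observed ones) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the toy records, the age-dependent missingness probabilities and all function names are assumptions made for the example. Only the simplest of the four imputation models is shown, and "proportional sampling" is interpreted here as drawing each missing stage at random from the observed stage distribution.

```python
import random
from collections import Counter


def make_mar_missing(records, p_missing_by_age_group, rng):
    """Return a copy of records with some stages set to None (MAR).

    Missingness depends only on the always-observed age group,
    never on the stage value itself.
    """
    out = []
    for rec in records:
        drop = rng.random() < p_missing_by_age_group[rec["age_group"]]
        out.append({**rec, "stage": None if drop else rec["stage"]})
    return out


def impute_proportional(records, rng):
    """Fill each missing stage with a random draw from the observed stages."""
    observed = [r["stage"] for r in records if r["stage"] is not None]
    return [
        {**r, "stage": r["stage"] if r["stage"] is not None else rng.choice(observed)}
        for r in records
    ]


def pooled_stage_counts(records, m, rng):
    """Average the stage-specific numbers of cases over m imputed data sets."""
    totals = Counter()
    for _ in range(m):
        totals.update(r["stage"] for r in impute_proportional(records, rng))
    return {stage: totals[stage] / m for stage in sorted(totals)}


rng = random.Random(0)

# Hypothetical toy registry: the stage distribution differs by age group.
complete = (
    [{"age_group": "<60", "stage": s} for s in ["T1"] * 300 + ["T2"] * 150 + ["T3"] * 50]
    + [{"age_group": "60+", "stage": s} for s in ["T1"] * 200 + ["T2"] * 200 + ["T3"] * 100]
)

# Assumed missingness probabilities: older cases are unstaged more often.
incomplete = make_mar_missing(complete, {"<60": 0.1, "60+": 0.4}, rng)

observed_counts = Counter(r["stage"] for r in complete)
pooled = pooled_stage_counts(incomplete, m=5, rng=rng)

print("true counts:  ", dict(observed_counts))
print("pooled counts:", pooled)
```

Because the draw in `impute_proportional` ignores the age group, it cannot use the covariate that drives the missingness in this toy setup. Models that condition on covariates, such as polytomous regression or predictive mean matching, are the ones the abstract reports as matching the observed data best.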
Pages: 13