A comparative study of survival models for breast cancer prognostication based on microarray data: does a single gene beat them all?

Cited by: 175
Authors
Haibe-Kains, B. [1, 2]
Desmedt, C. [2]
Sotiriou, C. [2]
Bontempi, G. [1]
Affiliations
[1] Univ Libre Bruxelles, Machine Learning Grp, Dept Comp Sci, Inst Jules Bordet, Brussels, Belgium
[2] Univ Libre Bruxelles, Dept Med Oncol, Funct Genom Unit, Inst Jules Bordet, Brussels, Belgium
DOI
10.1093/bioinformatics/btn374
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Survival prediction of breast cancer (BC) patients independently of treatment, also known as prognostication, is a complex task since clinically similar breast tumors, in addition to being molecularly heterogeneous, may exhibit different clinical outcomes. In recent years, the analysis of gene expression profiles by means of sophisticated data mining tools has emerged as a promising technology to bring additional insights into BC biology and to improve the quality of prognostication. The aim of this work is to assess quantitatively the accuracy of prediction obtained with state-of-the-art data analysis techniques for BC microarray data through an independent and thorough framework. Results: Due to the large number of variables, the small number of samples and the high degree of noise, complex prediction methods are highly exposed to performance degradation despite the use of cross-validation techniques. Our analysis shows that the most complex methods are not significantly better than the simplest one, a univariate model relying on a single proliferation gene. This result suggests that proliferation might be the most relevant biological process for BC prognostication and that the loss of interpretability deriving from the use of overcomplex methods may not be sufficiently counterbalanced by an improvement in the quality of prediction.
Pages: 2200-2208
Number of pages: 9