Prognostic gene signatures for patient stratification in breast cancer - accuracy, stability and interpretability of gene selection approaches using prior knowledge on protein-protein interactions

Times cited: 40
Authors
Cun, Yupeng [1]
Froehlich, Holger [1]
Affiliations
[1] Bonn Aachen Int Ctr IT, D-53113 Bonn, Germany
Keywords
SUPPORT VECTOR MACHINES; PATHWAY KNOWLEDGE; HISTOLOGIC GRADE; R-PACKAGE; CLASSIFICATION; NETWORK; INFORMATION; INTEGRATION; REGRESSION; SYSTEM;
DOI
10.1186/1471-2105-13-69
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
070307 [Chemical Biology];
Abstract
Background: Stratifying patients according to their clinical prognosis is a desirable goal in cancer treatment, because it enables more personalized medicine. Reliable predictions based on gene signatures could support physicians in selecting the right therapeutic strategy. In recent years, however, the low reproducibility of many published gene signatures has been criticized. It has been suggested that incorporating network or pathway information into prognostic biomarker discovery could improve prediction performance. Meanwhile, a large number of different approaches have been proposed for this purpose.
Results: We found that, on average, incorporating pathway information or protein interaction data did not significantly enhance prediction performance, but it did greatly improve the interpretability of gene signatures. Some methods (specifically network-based SVMs) greatly enhanced gene selection stability but showed only comparatively low prediction accuracy, whereas Reweighted Recursive Feature Elimination (RRFE) and average pathway expression led to very clearly interpretable signatures. In addition, average pathway expression, together with elastic net SVMs, showed the highest prediction performance here.
Conclusions: The results indicate that no single algorithm performed best with respect to all three categories in our study (accuracy, stability and interpretability). In general, incorporating network-based prior knowledge into gene selection methods did not significantly improve classification accuracy, but it greatly improved the interpretability of gene signatures compared with classical algorithms.
Pages: 13