Constrained mixture estimation for analysis and robust classification of clinical time series

Cited by: 27
Authors
Costa, Ivan G. [1]
Schoenhuth, Alexander [2]
Hafemeister, Christoph [3]
Schliep, Alexander [3]
Affiliations
[1] Univ Fed Pernambuco, Ctr Informat, Recife, PE, Brazil
[2] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC V5A 1S6, Canada
[3] Max Planck Inst Mol Genet, Dept Computat Mol Biol, Berlin, Germany
Keywords
EXPRESSION; MICROARRAY; DISCOVERY; CELLS; MODEL
DOI
10.1093/bioinformatics/btp222
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Personalized medicine based on molecular aspects of diseases, such as gene expression profiling, has become increasingly popular. However, analyzing clinical gene expression data poses multiple challenges: most of the well-known theoretical issues apply, such as high-dimensional feature spaces with few examples, noise and missing data. Special care is needed when designing classification procedures that support personalized diagnosis and choice of treatment. Here, we focus on classifying interferon-beta (IFNβ) treatment response in multiple sclerosis (MS) patients, which has attracted substantial attention in the recent past. Half of the patients remain unaffected by IFNβ treatment, which is still the standard. For them, treatment should be ceased in a timely manner to mitigate the side effects. Results: We propose constrained estimation of mixtures of hidden Markov models as a methodology to classify patient response to IFNβ treatment. The advantages of our approach are that it takes the temporal nature of the data into account and that it is robust to noise, missing data and mislabeled samples. Moreover, mixture estimation makes it possible to explore the presence of response sub-groups of patients at the transcriptional level. We clearly outperformed all prior approaches in terms of prediction accuracy, raising it, for the first time, above 90%. Additionally, we were able to identify potentially mislabeled samples and to sub-divide the good responders into two sub-groups that exhibited different transcriptional response programs. This is supported by recent findings on MS pathology and therefore may raise interesting clinical follow-up questions.
Pages: I6-I14
Number of pages: 9