Multi-way clustering of microarray data using probabilistic sparse matrix factorization

Cited by: 41
Authors
Dueck, D [1]
Morris, QD [1]
Frey, BJ [1]
Affiliations
[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON M5S 3G4, Canada
Funding
Canadian Institutes of Health Research; Natural Sciences and Engineering Research Council of Canada
DOI
10.1093/bioinformatics/bti1041
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: We address the problem of multi-way clustering of microarray data using a generative model. Our algorithm, probabilistic sparse matrix factorization (PSMF), is a probabilistic extension of a previous hard-decision algorithm for this problem. PSMF allows for varying levels of sensor noise in the data, uncertainty in the hidden prototypes used to explain the data, and uncertainty as to the prototypes selected to explain each data vector.
Results: We present experimental results demonstrating that our method can better recover functionally relevant clusterings in mRNA expression data than standard clustering techniques, including hierarchical agglomerative clustering, and we show that by computing probabilities instead of point estimates, our method avoids converging to poor solutions.
Pages: I144-I151
Page count: 8