Query-driven module discovery in microarray data

Cited by: 17
Authors
Dhollander, Thomas
Sheng, Qizheng
Lemmens, Karen
De Moor, Bart
Marchal, Kathleen
Moreau, Yves
Affiliations
[1] Katholieke Univ Leuven, ESAT SCD, Dept Elect Engn, B-3001 Louvain, Belgium
[2] Katholieke Univ Leuven, Dept Microbiol & Mol Syst, CMPG, B-3001 Louvain, Belgium
DOI
10.1093/bioinformatics/btm387
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Existing (bi)clustering methods for microarray data analysis often do not answer the specific questions of interest to a biologist. Such questions can be derived from other information sources, including expert prior knowledge. More specifically, given a set of seed genes that are believed to share a common function, we would like to recruit genes whose expression profiles are similar to those of the seed genes in a significant subset of experimental conditions.
Results: We introduce QDB, a novel Bayesian query-driven biclustering framework in which the prior distributions allow knowledge from a set of seed genes (the query) to guide the pattern search. In two well-known yeast compendia, we grow highly functionally enriched biclusters from small sets of seed genes using a resolution sweep approach. In addition, relevant conditions are identified and the modularity of the biclusters is demonstrated, including the discovery of overlapping modules. Finally, our method handles missing values naturally, performs well on artificial data from a recent biclustering benchmark study, and has a number of conceptual advantages over existing approaches for focused module search.
Pages: 2573-2580
Number of pages: 8