Growing Bayesian network models of gene networks from seed genes

Cited by: 50
Authors
Peña, JM
Björkegren, J
Tegnér, J
Affiliations
[1] Linkoping Univ, Dept Phys & Measurement Technol, S-58183 Linkoping, Sweden
[2] Karolinska Inst, Ctr Genom & Bioinformat, S-17177 Stockholm, Sweden
DOI
10.1093/bioinformatics/bti1137
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: In recent years, Bayesian networks (BNs) have received increasing attention from the computational biology community as models of gene networks, though learning them from gene-expression data is problematic. Most gene-expression databases contain measurements for thousands of genes, but existing algorithms for learning BNs from data do not scale to such high-dimensional databases. The user therefore has to decide in advance which genes to include in the learning process, typically no more than a few hundred, and which to exclude. This is not a trivial decision. We propose an alternative approach to overcome this problem.

Results: We propose a new algorithm for learning BN models of gene networks from gene-expression data. Our algorithm receives a seed gene S and a positive integer R from the user, and returns a BN for the genes that depend on S such that fewer than R other genes mediate the dependency. Our algorithm grows the BN, which initially contains only S, by repeating the following step R + 1 times and then pruning some genes: find the parents and children of all the genes in the BN and add them to it. Intuitively, our algorithm provides the user with a window of radius R around S through which to look at the BN model of a gene network, without having to exclude any gene in advance. We prove that our algorithm is correct under the faithfulness assumption. We evaluate our algorithm on simulated and biological data (Rosetta compendium) with satisfactory results.
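The growing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical, user-supplied function `parents_and_children(gene)` that returns the exact set of parents and children of a gene (in practice this would be estimated from expression data with independence tests), and it assumes that the pruning step keeps the genes within graph distance R of the seed, which is one reading of "fewer than R other genes mediate the dependency". The names `grow_from_seed` and `radius` are illustrative only.

```python
from collections import deque
from typing import Callable, Dict, Hashable, Iterable, Set

Gene = Hashable


def grow_from_seed(
    seed: Gene,
    radius: int,
    parents_and_children: Callable[[Gene], Iterable[Gene]],
) -> Dict[Gene, Set[Gene]]:
    """Grow an undirected skeleton around `seed`, then prune it to `radius`.

    `parents_and_children` is a hypothetical oracle (an assumption of this
    sketch) that is taken to be exact and symmetric.
    """
    if radius < 1:
        raise ValueError("radius must be a positive integer")

    adjacency: Dict[Gene, Set[Gene]] = {seed: set()}
    frontier = {seed}

    # Repeat the growing step R + 1 times. Only newly added genes are
    # queried, since earlier genes already have their neighbours recorded.
    for _ in range(radius + 1):
        next_frontier = set()
        for gene in frontier:
            for neighbour in parents_and_children(gene):
                if neighbour not in adjacency:
                    adjacency[neighbour] = set()
                    next_frontier.add(neighbour)
                adjacency[gene].add(neighbour)
                adjacency[neighbour].add(gene)
        frontier = next_frontier

    # Pruning (assumed rule): breadth-first search from the seed, keeping
    # only genes whose distance to the seed is at most `radius`.
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        gene = queue.popleft()
        for neighbour in adjacency[gene]:
            if neighbour not in distance:
                distance[neighbour] = distance[gene] + 1
                queue.append(neighbour)

    kept = {gene for gene, d in distance.items() if d <= radius}
    return {gene: adjacency[gene] & kept for gene in kept}
```

For a toy chain of genes A, B, C, D, E with seed A and R = 1, this sketch visits A, B and C during growth and keeps only A and B after pruning. Orienting the edges and fitting the BN parameters are outside the scope of the sketch.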
Pages: 224-229
Page count: 6
Related papers
27 in total
[1]  
[Anonymous], THESIS STANFORD U ST
[2]  
BADEA L, 2004, P 16 EUR C ART INT, P263
[3]  
BADEA L, 2003, P WORKSH LEARN GRAPH
[4]  
Bernard A, 2005, PACIFIC SYMPOSIUM ON BIOCOMPUTING 2005, P459
[5]   Using Bayesian networks to analyze expression data [J].
Friedman, N ;
Linial, M ;
Nachman, I ;
Pe'er, D .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2000, 7 (3-4) :601-620
[6]  
Friedman N, 1999, UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, P206
[7]  
GEIGER D, 1994, P 10 C UNC ART INT, P235
[8]  
Hartemink Alexander J, 2002, Pac Symp Biocomput, P437
[9]   Growing genetic regulatory networks from seed genes [J].
Hashimoto, RF ;
Kim, S ;
Shmulevich, I ;
Zhang, W ;
Bittner, ML ;
Dougherty, ER .
BIOINFORMATICS, 2004, 20 (08) :1241-1247
[10]   Functional discovery via a compendium of expression profiles [J].
Hughes, TR ;
Marton, MJ ;
Jones, AR ;
Roberts, CJ ;
Stoughton, R ;
Armour, CD ;
Bennett, HA ;
Coffey, E ;
Dai, HY ;
He, YDD ;
Kidd, MJ ;
King, AM ;
Meyer, MR ;
Slade, D ;
Lum, PY ;
Stepaniants, SB ;
Shoemaker, DD ;
Gachotte, D ;
Chakraburtty, K ;
Simon, J ;
Bard, M ;
Friend, SH .
CELL, 2000, 102 (01) :109-126