Efficient identification of independence networks using mutual information

Cited by: 12
Authors
Bacciu, Davide [1]
Etchells, Terence A. [2]
Lisboa, Paulo J. G. [2]
Whittaker, Joe [3]
Affiliations
[1] Univ Pisa, Dipartimento Informat, Pisa, Italy
[2] Liverpool John Moores Univ, Sch Comp & Math Sci, Liverpool L3 5UX, Merseyside, England
[3] Univ Lancaster, Dept Math & Stat, Lancaster, England
Keywords
Bayesian networks; Constraint based search; Dense networks; False discovery rate; False negative reduction; Graphical models; Mutual information; PC algorithm; Skeleton
DOI
10.1007/s00180-012-0320-6
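Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
070103 [Probability theory and mathematical statistics]; 140311 [Social design and social innovation]
Abstract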
Conditional independence graphs are now widely applied in science and industry to display interactions between large numbers of variables. However, the computational load of structure identification grows with the number of nodes in the network and the sample size. A tailored version of the PC algorithm is proposed which is based on mutual information tests with a specified testing order, combined with false negative reduction and false positive control. It is found to be competitive with current structure identification methodologies for both estimation accuracy and computational speed and outperforms these in large scale scenarios. The methodology is also shown to approximate dense networks. The comparisons are made on standard benchmarking data sets and an anonymized large scale real life example.
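The abstract names two ingredients, mutual information tests and false positive control, that can be illustrated with a minimal sketch. The code below is not the authors' algorithm: it only shows an empirical mutual information estimate for discrete data, the associated G statistic (G = 2·N·MI, asymptotically chi-squared under independence), and the Benjamini-Hochberg step-up procedure for false discovery rate control. The function names are hypothetical, and the p-value helper assumes both variables are binary (one degree of freedom).

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), n_ab in joint.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with counts
        mi += (n_ab / n) * math.log(n_ab * n / (px[a] * py[b]))
    return mi


def g_test_pvalue_binary(x, y):
    """p-value for independence of two BINARY variables (assumption: df = 1).

    G = 2 * N * MI is compared against a chi-squared distribution with one
    degree of freedom, whose survival function is erfc(sqrt(g / 2)).
    """
    g = 2.0 * len(x) * mutual_information(x, y)
    return math.erfc(math.sqrt(max(g, 0.0) / 2.0))


def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans: True where the null is rejected at FDR q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under its threshold
        if pvalues[i] <= q * rank / m:
            k_max = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected


# Toy data: y copies x exactly, z is balanced against x (empirically independent).
x = [0, 1] * 50
y = list(x)
z = [0, 0, 1, 1] * 25
pvals = [g_test_pvalue_binary(x, y), g_test_pvalue_binary(x, z)]
edges = benjamini_hochberg(pvals, q=0.05)
```

In a constraint-based search, a rejected null (`True`) would correspond to keeping an edge between the two variables; the paper's specified testing order and false negative reduction steps are not represented here.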
Pages: 621-646
Page count: 26