Disease candidate gene identification and prioritization using protein interaction networks

Cited by: 263
Authors
Chen, Jing [1 ,2 ]
Aronow, Bruce J. [1 ,2 ,3 ]
Jegga, Anil G. [1 ,3 ]
Affiliations
[1] Cincinnati Childrens Hosp, Med Ctr, Div Biomed Informat, Cincinnati, OH USA
[2] Univ Cincinnati, Dept Biomed Engn, Cincinnati, OH USA
[3] Univ Cincinnati, Coll Med, Dept Pediat, Cincinnati, OH USA
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
CENTRALITY; SEQUENCE; DISCOVERY; MUTATION; ERBB2; LEADS; MICE;
DOI
10.1186/1471-2105-10-73
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: Although most current disease candidate gene identification and prioritization methods depend on functional annotations, the coverage of gene functional annotations is a limiting factor. In the current study, we describe a candidate gene prioritization method that is based entirely on protein-protein interaction network (PPIN) analyses.
Results: For the first time, extended versions of the PageRank and HITS algorithms and the K-Step Markov method are applied to prioritize disease candidate genes in a training-test schema. Using a list of known disease-related genes from our earlier study as a training set ("seeds"), and the rest of the known genes as a test list, we perform large-scale cross validation to rank the candidate genes and to evaluate and compare the performance of our approach. Under appropriate settings - for example, a back probability of 0.3 for PageRank with Priors and HITS with Priors, and a step size of 6 for the K-Step Markov method - the three methods achieved comparable AUC values, suggesting similar performance.
Conclusion: Even though network-based methods are generally not as effective as integrated functional annotation-based methods for disease candidate gene prioritization, in a one-to-one comparison, PPIN-based candidate gene prioritization performs better than all other gene features or annotations. Additionally, we demonstrate that methods used for studying both social and Web networks can be successfully used for disease candidate gene prioritization.
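To make the "back probability" setting mentioned in the abstract concrete, the following is a minimal sketch of the general PageRank with Priors idea on a toy undirected network. It is not the authors' implementation: the function name `pagerank_with_priors`, the gene labels `G1` to `G6`, and the edge list are all hypothetical, and only the value 0.3 is taken from the abstract. At each step a random walker either follows an interaction edge or, with probability `back_prob`, jumps back to one of the seed genes.

```python
def pagerank_with_priors(edges, seeds, back_prob=0.3, iters=100):
    """Score nodes of an undirected network relative to a seed set.

    Assumes every seed appears in at least one edge (illustrative sketch only).
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    nodes = sorted(neighbors)

    # Prior: uniform over the seed genes, zero elsewhere.
    prior = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(prior)

    for _ in range(iters):
        new_score = {}
        for v in nodes:
            # Mass arriving at v from neighbors that step along an edge.
            walk = sum(score[u] / len(neighbors[u]) for u in neighbors[v])
            # Mix the walk with a jump back to the seeds.
            new_score[v] = (1.0 - back_prob) * walk + back_prob * prior[v]
        score = new_score
    return score


# Hypothetical toy network: G1 is the only seed ("known disease gene").
edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5"), ("G2", "G6")]
seeds = {"G1"}
scores = pagerank_with_priors(edges, seeds, back_prob=0.3)

# Rank the non-seed genes (the candidates) by descending score.
ranked = sorted((g for g in scores if g not in seeds),
                key=lambda g: scores[g], reverse=True)
print(ranked)
```

In this sketch, candidates that sit close to the seed in the network receive higher scores, which is the intuition behind using such a walk for prioritization; a larger `back_prob` concentrates the scores more tightly around the seeds.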
Pages: 14