An unbiased evaluation of gene prioritization tools

Cited by: 68
Authors
Bornigen, Daniela [1 ,2 ,3 ]
Tranchevent, Leon-Charles [1 ,2 ]
Bonachela-Capdevila, Francisco [4 ]
Devriendt, Koenraad [5 ]
De Moor, Bart [1 ,2 ]
De Causmaecker, Patrick [4 ]
Moreau, Yves [1 ,2 ]
Affiliations
[1] Katholieke Univ Leuven, ESAT SCD, Dept Elect Engn, Louvain, Belgium
[2] Katholieke Univ Leuven, IBBT KULeuven Future Hlth Dept, Louvain, Belgium
[3] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[4] Katholieke Univ Leuven, ITEC IBBT KULEUVEN, CODeS Grp, Kortrijk, Belgium
[5] Katholieke Univ Leuven, Ctr Human Genet, Louvain, Belgium
Keywords
GENOME-WIDE ASSOCIATION; IDENTIFIES SUSCEPTIBILITY LOCI; CANDIDATE GENES; CONGENITAL-ANOMALIES; RECEPTOR GENE; DISEASE GENES; MUTATIONS; VARIANTS; DUPLICATION; HAPLOINSUFFICIENCY;
DOI
10.1093/bioinformatics/bts581
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Motivation: Gene prioritization aims at identifying the most promising candidate genes among a large pool of candidates, so as to maximize the yield and biological relevance of further downstream validation experiments and functional studies. During the past few years, several gene prioritization tools have been defined, and some of them have been implemented and made freely available as web tools. In this study, we aim at comparing the predictive performance of eight publicly available prioritization tools on novel data. We have performed an analysis in which 42 recently reported disease-gene associations from the literature are used to benchmark these tools before the underlying databases are updated.
Results: Cross-validation on retrospective data provides performance estimates that are likely to be overoptimistic, because some of the data sources are contaminated with knowledge from disease-gene associations. Our approach mimics a novel discovery more closely and thus provides more realistic performance estimates. There are, however, marked differences, and tools that rely on more advanced data integration schemes appear more powerful.
Pages: 3081-3088
Page count: 8
Related papers
77 in total
[61]   Comparison of automated candidate gene prediction systems using genes implicated in type 2 diabetes by genome-wide association studies [J].
Teber, Erdahl T. ;
Liu, Jason Y. ;
Ballouz, Sara ;
Fatkin, Diane ;
Wouters, Merridee A. .
BMC BIOINFORMATICS, 2009, 10
[62]   Association Analysis of PALB2 and BRCA2 in Bipolar Disorder and Schizophrenia in a Scandinavian Case-Control Sample [J].
Tesli, Martin ;
Athanasiu, Lavinia ;
Mattingsdal, Morten ;
Kahler, Anna K. ;
Gustafsson, Omar ;
Andreassen, Bettina K. ;
Werge, Thomas ;
Hansen, Thomas ;
Mors, Ole ;
Mellerup, Erling ;
Koefoed, Pernille ;
Jonsson, Erik G. ;
Agartz, Ingrid ;
Melle, Ingrid ;
Morken, Gunnar ;
Djurovic, Srdjan ;
Andreassen, Ole A. .
AMERICAN JOURNAL OF MEDICAL GENETICS PART B-NEUROPSYCHIATRIC GENETICS, 2010, 153B (07) :1276-1282
[63]   Haploinsufficiency of TAB2 Causes Congenital Heart Defects in Humans [J].
Thienpont, Bernard ;
Zhang, Litu ;
Postma, Alex V. ;
Breckpot, Jeroen ;
Tranchevent, Leon-Charles ;
Van Loo, Peter ;
Mollgard, Kjeld ;
Tommerup, Niels ;
Bache, Iben ;
Tumer, Zeynep ;
van Engelen, Klaartje ;
Menten, Bjorn ;
Mortier, Geert ;
Waggoner, Darrel ;
Gewillig, Marc ;
Moreau, Yves ;
Devriendt, Koen ;
Larsen, Lars Allan .
AMERICAN JOURNAL OF HUMAN GENETICS, 2010, 86 (06) :839-849
[64]   Prioritization of positional candidate genes using multiple web-based software tools [J].
Thornblad, Tobias A. ;
Elliott, Kate S. ;
Jowett, Jeremy ;
Visscher, Peter M. .
TWIN RESEARCH AND HUMAN GENETICS, 2007, 10 (06) :861-870
[65]   Computational disease gene identification: a concert of methods prioritizes type 2 diabetes and obesity candidate genes [J].
Tiffin, Nicki ;
Adie, Euan ;
Turner, Frances ;
Brunner, Han G. ;
van Driel, Marc A. ;
Oti, Martin ;
Lopez-Bigas, Nuria ;
Ouzounis, Christos ;
Perez-Iratxeta, Carolina ;
Andrade-Navarro, Miguel A. ;
Adeyemo, Adebowale ;
Patti, Mary Elizabeth ;
Semple, Colin A. M. ;
Hide, Winston .
NUCLEIC ACIDS RESEARCH, 2006, 34 (10) :3067-3081
[66]   Tiffin N, 2011, METHODS MOL BIOL, V760, P175, DOI 10.1007/978-1-61779-176-5_11
[67]   Linking genes to diseases: it's all in the data [J].
Tiffin, Nicki ;
Andrade-Navarro, Miguel A. ;
Perez-Iratxeta, Carolina .
GENOME MEDICINE, 2009, 1
[68]   A guide to web tools to prioritize candidate genes [J].
Tranchevent, Leon-Charles ;
Capdevila, Francisco Bonachela ;
Nitsch, Daniela ;
De Moor, Bart ;
De Causmaecker, Patrick ;
Moreau, Yves .
BRIEFINGS IN BIOINFORMATICS, 2011, 12 (01) :22-32
[69]   Variants near DMRT1, TERT and ATF7IP are associated with testicular germ cell cancer [J].
Turnbull, Clare ;
Rapley, Elizabeth A. ;
Seal, Sheila ;
Pernet, David ;
Renwick, Anthony ;
Hughes, Deborah ;
Ricketts, Michelle ;
Linger, Rachel ;
Nsengimana, Jeremie ;
Deloukas, Panagiotis ;
Huddart, Robert A. ;
Bishop, D. Timothy ;
Easton, Douglas F. ;
Stratton, Michael R. ;
Rahman, Nazneen .
NATURE GENETICS, 2010, 42 (07) :604-U178
[70]   A text-mining analysis of the human phenome [J].
van Driel, MA ;
Bruggeman, J ;
Vriend, G ;
Brunner, HG ;
Leunissen, JA .
EUROPEAN JOURNAL OF HUMAN GENETICS, 2006, 14 (05) :535-542