An unbiased evaluation of gene prioritization tools

Cited by: 68
Authors
Bornigen, Daniela [1, 2, 3]
Tranchevent, Leon-Charles [1, 2]
Bonachela-Capdevila, Francisco [4]
Devriendt, Koenraad [5]
De Moor, Bart [1, 2]
De Causmaecker, Patrick [4]
Moreau, Yves [1, 2]
Affiliations
[1] Katholieke Univ Leuven, ESAT SCD, Dept Elect Engn, Louvain, Belgium
[2] Katholieke Univ Leuven, IBBT KULeuven Future Hlth Dept, Louvain, Belgium
[3] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[4] Katholieke Univ Leuven, ITEC IBBT KULEUVEN, CODeS Grp, Kortrijk, Belgium
[5] Katholieke Univ Leuven, Ctr Human Genet, Louvain, Belgium
Keywords
GENOME-WIDE ASSOCIATION; IDENTIFIES SUSCEPTIBILITY LOCI; CANDIDATE GENES; CONGENITAL-ANOMALIES; RECEPTOR GENE; DISEASE GENES; MUTATIONS; VARIANTS; DUPLICATION; HAPLOINSUFFICIENCY;
DOI
10.1093/bioinformatics/bts581
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Gene prioritization aims at identifying the most promising candidate genes among a large pool of candidates, so as to maximize the yield and biological relevance of further downstream validation experiments and functional studies. During the past few years, several gene prioritization tools have been defined, and some of them have been implemented and made freely available as web tools. In this study, we aim at comparing the predictive performance of eight publicly available prioritization tools on novel data. We have performed an analysis in which 42 recently reported disease-gene associations from the literature are used to benchmark these tools before the underlying databases are updated. Results: Cross-validation on retrospective data provides performance estimates that are likely to be overoptimistic because some of the data sources are contaminated with knowledge from disease-gene associations. Our approach mimics a novel discovery more closely and thus provides more realistic performance estimates. There are, however, marked differences, and tools that rely on more advanced data integration schemes appear more powerful.
Pages: 3081-3088
Page count: 8
Related papers
77 in total
[1]   A shared susceptibility locus in PLCE1 at 10q23 for gastric adenocarcinoma and esophageal squamous cell carcinoma [J].
Abnet, Christian C. ;
Freedman, Neal D. ;
Hu, Nan ;
Wang, Zhaoming ;
Yu, Kai ;
Shu, Xiao-Ou ;
Yuan, Jian-Min ;
Zheng, Wei ;
Dawsey, Sanford M. ;
Dong, Linda M. ;
Lee, Maxwell P. ;
Ding, Ti ;
Qiao, You-Lin ;
Gao, Yu-Tang ;
Koh, Woon-Puay ;
Xiang, Yong-Bing ;
Tang, Ze-Zhong ;
Fan, Jin-Hu ;
Wang, Chaoyu ;
Wheeler, William ;
Gail, Mitchell H. ;
Yeager, Meredith ;
Yuenger, Jeff ;
Hutchinson, Amy ;
Jacobs, Kevin B. ;
Giffen, Carol A. ;
Burdett, Laurie ;
Fraumeni, Joseph F., Jr. ;
Tucker, Margaret A. ;
Chow, Wong-Ho ;
Goldstein, Alisa M. ;
Chanock, Stephen J. ;
Taylor, Philip R. .
NATURE GENETICS, 2010, 42 (09) :764-U51
[2]   SUSPECTS: enabling fast and effective prioritization of positional candidates [J].
Adie, EA ;
Adams, RR ;
Evans, KL ;
Porteous, DJ ;
Pickard, BS .
BIOINFORMATICS, 2006, 22 (06) :773-774
[3]   Gene prioritization through genomic data fusion [J].
Aerts, S ;
Lambrechts, D ;
Maity, S ;
Van Loo, P ;
Coessens, B ;
De Smet, F ;
Tranchevent, LC ;
De Moor, B ;
Marynen, P ;
Hassan, B ;
Carmeliet, P ;
Moreau, Y .
NATURE BIOTECHNOLOGY, 2006, 24 (05) :537-544
[4]   Integrating Computational Biology and Forward Genetics in Drosophila [J].
Aerts, Stein ;
Vilain, Sven ;
Hu, Shu ;
Tranchevent, Leon-Charles ;
Barriot, Roland ;
Yan, Jiekun ;
Moreau, Yves ;
Hassan, Bassem A. ;
Quan, Xiao-Jiang .
PLOS GENETICS, 2009, 5 (01)
[5]   Haploinsufficiency of the LIM Domain Containing Preferred Translocation Partner in Lipoma (LPP) Gene in Patients With Tetralogy of Fallot and VACTERL Association [J].
Arrington, Cammon B. ;
Patel, Ankita ;
Bacino, Carlos A. ;
Bowles, Neil E. .
AMERICAN JOURNAL OF MEDICAL GENETICS PART A, 2010, 152A (11) :2919-2923
[6]   Mutations in the G6PC3 Gene Cause Dursun Syndrome [J].
Banka, Siddharth ;
Newman, William G. ;
Ozgul, R. Koksal ;
Dursun, Ali .
AMERICAN JOURNAL OF MEDICAL GENETICS PART A, 2010, 152A (10) :2609-2611
[7]  
Becker KG, 2004, NAT GENET, V36, P431, DOI 10.1038/ng0504-431
[8]   A genome-wide association study of nasopharyngeal carcinoma identifies three new susceptibility loci [J].
Bei, Jin-Xin ;
Li, Yi ;
Jia, Wei-Hua ;
Feng, Bing-Jian ;
Zhou, Gangqiao ;
Chen, Li-Zhen ;
Feng, Qi-Sheng ;
Low, Hui-Qi ;
Zhang, Hongxing ;
He, Fuchu ;
Tai, E. Shyong ;
Kang, Tiebang ;
Liu, Edison T. ;
Liu, Jianjun ;
Zeng, Yi-Xin .
NATURE GENETICS, 2010, 42 (07) :599-U173
[9]   Evidence for CRHR1 in multiple sclerosis using supervised machine learning and meta-analysis in 12 566 individuals [J].
Briggs, Farren B. S. ;
Bartlett, Selena E. ;
Goldstein, Benjamin A. ;
Wang, Joanne ;
McCauley, Jacob L. ;
Zuvich, Rebecca L. ;
De Jager, Philip L. ;
Rioux, John D. ;
Ivinson, Adrian J. ;
Compston, Alastair ;
Hafler, David A. ;
Hauser, Stephen L. ;
Oksenberg, Jorge R. ;
Sawcer, Stephen J. ;
Pericak-Vance, Margaret A. ;
Haines, Jonathan L. ;
Barcellos, Lisa F. .
HUMAN MOLECULAR GENETICS, 2010, 19 (21) :4286-4295
[10]   Systematic identification of human mitochondrial disease genes through integrative genomics [J].
Calvo, S ;
Jain, M ;
Xie, XH ;
Sheth, SA ;
Chang, B ;
Goldberger, OA ;
Spinazzola, A ;
Zeviani, M ;
Carr, SA ;
Mootha, VK .
NATURE GENETICS, 2006, 38 (05) :576-582