An integrated approach to inferring gene-disease associations in humans

Cited by: 121
Authors
Radivojac, Predrag [1]
Peng, Kang [1]
Clark, Wyatt T. [1]
Peters, Brandon J. [2]
Mohan, Amrita [1]
Boyle, Sean M. [1]
Mooney, Sean D. [2, 3]
Affiliations
[1] Indiana Univ, Sch Informat, Bloomington, IN 47408 USA
[2] Indiana Univ Sch Med, Ctr Computat Biol & Bioinformat, Indianapolis, IN 46202 USA
[3] Indiana Univ Sch Med, Dept Med & Mol Genet, Indianapolis, IN 46202 USA
Keywords
gene prioritization; gene-disease associations; protein-disease associations; protein function prediction
DOI
10.1002/prot.21989
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
One of the most important tasks of modern bioinformatics is the development of computational tools that can be used to understand and treat human disease. To date, a variety of methods have been explored, and algorithms for candidate gene prioritization are gaining in usefulness. Here, we propose an algorithm for detecting gene-disease associations based on the human protein-protein interaction network, known gene-disease associations, protein sequence, and protein functional information at the molecular level. Our method, PhenoPred, is supervised: first, we mapped each gene/protein onto the spaces of disease and functional terms based on distance to all annotated proteins in the protein interaction network. We also encoded sequence, function, physicochemical, and predicted structural properties, such as secondary structure and flexibility. We then trained support vector machines to detect gene-disease associations for a number of terms in Disease Ontology and provided evidence that, despite the noise/incompleteness of experimental data and unfinished ontology of diseases, identification of candidate genes can be successful even when a large number of candidate disease terms are predicted on simultaneously.
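The network-distance mapping mentioned in the abstract can be illustrated with a small sketch. This is not the PhenoPred implementation: the abstract does not say how distances are aggregated, so the example assumes one feature per term, taken as the minimum shortest-path hop count from a protein to any other protein annotated with that term. The function names, the toy network, the term labels, and the `unreachable` placeholder value are all hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def distance_features(graph, annotations, protein, terms, unreachable=99):
    """Build one feature per term: the minimum hop distance from `protein`
    to any OTHER protein annotated with that term.

    Assumption: minimum distance is used as the aggregate; `unreachable`
    is an arbitrary placeholder for terms with no reachable annotated protein.
    """
    dist = bfs_distances(graph, protein)
    features = []
    for term in terms:
        hops = [
            dist[p]
            for p in annotations.get(term, ())
            if p != protein and p in dist
        ]
        features.append(min(hops) if hops else unreachable)
    return features


# Hypothetical toy interaction network (undirected, as adjacency lists).
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": [],  # isolated protein
}

# Hypothetical term annotations: term -> set of annotated proteins.
annotations = {
    "term1": {"D"},
    "term2": {"A", "B"},
    "term3": {"E"},
}
terms = ["term1", "term2", "term3"]

features_A = distance_features(graph, annotations, "A", terms)
print(features_A)
```

Feature vectors of this kind, one per protein, could then be passed to a standard classifier such as a support vector machine (for example `sklearn.svm.SVC` in scikit-learn), with one binary classifier per disease term.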
Pages: 1030-1037
Page count: 8