Analysis of protein sequence and interaction data for candidate disease gene prediction

Citations: 105
Authors
George, Richard A.
Liu, Jason Y.
Feng, Lina L.
Bryson-Richardson, Robert J.
Fatkin, Diane
Wouters, Merridee A. [1]
Affiliations
[1] Victor Chang Cardiac Res Inst, Computat Biol & Bioinformat Program, Sydney, NSW, Australia
[2] Victor Chang Cardiac Res Inst, Dev Biol Program, Sydney, NSW, Australia
[3] Victor Chang Cardiac Res Inst, Sr Bernice Res Program Inherited Heart Dis, Sydney, NSW, Australia
[4] Univ New S Wales, Sch Biotechnol & Biomol Sci, Sydney, NSW, Australia
[5] Univ New S Wales, Sch Med, Sydney, NSW, Australia
[6] St Vincents Hosp, Dept Cardiol, Sydney, NSW 2010, Australia
DOI
10.1093/nar/gkl707
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Linkage analysis is a successful procedure for associating diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which makes the experimental methods used to identify the disease gene arduous and expensive. We present two methods to prioritize candidates for further experimental study: Common Pathway Scanning (CPS) and Common Module Profiling (CMP). CPS is based on the assumption that common phenotypes are associated with dysfunction in proteins that participate in the same complex or pathway. CPS applies network data derived from protein-protein interaction (PPI) and pathway databases to identify relationships between genes. CMP identifies likely candidates using a domain-dependent sequence similarity approach, based on the hypothesis that disruption of genes of similar function will lead to the same phenotype. Both algorithms accept either of two forms of input data: known disease genes or multiple disease loci. When using known disease genes as input, our combined methods have a sensitivity of 0.52 and a specificity of 0.97 and reduce the candidate list 13-fold. Using multiple loci, our methods successfully identify disease genes for all benchmark diseases with a sensitivity of 0.84 and a specificity of 0.63. Our combined approach prioritizes good candidates and will accelerate the disease gene discovery process.
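The following is a minimal sketch of the idea behind CPS with known disease genes as input, not the authors' implementation. It assumes a deliberately simplified rule (a locus gene is shortlisted if it interacts directly with a known disease gene), uses made-up gene labels (`K1`, `C1`, and so on) and a toy interaction list, and assumes that the fold reduction is the locus size divided by the number of shortlisted genes. The abstract does not specify the actual scoring or data sources beyond PPI and pathway databases.

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected adjacency map from (gene_a, gene_b) pairs."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours


def scan_common_pathway(locus_genes, known_disease_genes, neighbours):
    """Shortlist locus genes that directly interact with a known disease gene.

    Assumption: direct-neighbour matching stands in for the paper's
    complex/pathway membership test.
    """
    known = set(known_disease_genes)
    return {g for g in locus_genes if neighbours.get(g, set()) & known}


def sensitivity_specificity(predicted, true_genes, all_candidates):
    """Return (sensitivity, specificity) over the candidate set.

    Assumes at least one true gene and one non-disease gene among candidates.
    """
    predicted = set(predicted)
    true_genes = set(true_genes)
    negatives = set(all_candidates) - true_genes
    tp = len(predicted & true_genes)
    fn = len(true_genes - predicted)
    fp = len(predicted & negatives)
    tn = len(negatives - predicted)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): K* are known disease genes, C* lie in the locus.
edges = [("K1", "C1"), ("K2", "C3"), ("C2", "X1")]
locus = ["C1", "C2", "C3", "C4"]
known = ["K1", "K2"]

network = build_network(edges)
hits = scan_common_pathway(locus, known, network)

# Suppose C1 is the true disease gene in this locus.
sens, spec = sensitivity_specificity(hits, {"C1"}, locus)
fold = len(locus) / len(hits)

print(sorted(hits), sens, round(spec, 2), fold)
```

In this toy case the shortlist contains `C1` and `C3`, so the true gene is recovered while one false positive remains, which illustrates the trade-off between sensitivity and specificity that the abstract reports.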
Pages: 10