Bayesian approach to discovering pathogenic SNPs in conserved protein domains

Cited by: 33
Authors
Cai, ZH
Tsung, EF
Marinescu, VD
Ramoni, MF
Riva, A
Kohane, IS
Affiliations
[1] Harvard Univ, Sch Med, Childrens Hosp Boston, Informat Program, Boston, MA 02115 USA
[2] Harvard Partners Ctr Genet & Genom, Boston, MA USA
[3] Harvard Univ, Div Hlth Sci & Technol, Cambridge, MA USA
[4] MIT, Cambridge, MA 02139 USA
Keywords
SNP; Bayesian networks; phylogenetic features; phenotype; association studies
DOI
10.1002/humu.20063
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
The success rate of association studies can be improved by selecting better genetic markers for genotyping or by providing better leads for identifying pathogenic single nucleotide polymorphisms (SNPs) in the regions of linkage disequilibrium with positive disease associations. We have developed a novel algorithm to predict pathogenic single amino acid changes, either nonsynonymous SNPs (nsSNPs) or missense mutations, in conserved protein domains. Using a Bayesian framework, we found that the probability of a microbial missense mutation causing a significant change in phenotype depended on how much difference it made in several phylogenetic, biochemical, and structural features related to the single amino acid substitution. We tested our model on pathogenic allelic variants (missense mutations or nsSNPs) included in OMIM, and on the other nsSNPs in the same genes (from dbSNP) as the nonpathogenic variants. As a result, our model predicted pathogenic variants with a 10% false-positive rate. The high specificity of our prediction algorithm should make it valuable in genetic association studies aimed at identifying pathogenic SNPs. (C) 2004 Wiley-Liss, Inc.
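The abstract equates a 10% false-positive rate with high specificity. As a minimal sketch of that relationship (not the authors' evaluation code), the snippet below computes sensitivity, specificity, and false-positive rate from confusion-matrix counts. The function name `rates` and all counts are hypothetical, chosen only so that the false-positive rate comes out at 10%; they are not results from the paper.

```python
def rates(tp, fp, tn, fn):
    """Return (sensitivity, specificity, false_positive_rate) from confusion counts."""
    sensitivity = tp / (tp + fn)   # fraction of pathogenic variants called pathogenic
    specificity = tn / (tn + fp)   # fraction of nonpathogenic variants called nonpathogenic
    fpr = fp / (fp + tn)           # fraction of nonpathogenic variants called pathogenic
    return sensitivity, specificity, fpr


# Hypothetical counts for illustration only (not data from the paper).
sens, spec, fpr = rates(tp=70, fp=10, tn=90, fn=30)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} FPR={fpr:.2f}")
```

Specificity and false-positive rate always sum to 1, so a 10% false-positive rate corresponds to 90% specificity regardless of the underlying counts.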
Pages: 178-184
Number of pages: 7