Prediction of Missense Mutation Functionality Depends on Both the Algorithm and Sequence Alignment Employed

Cited by: 168
Authors
Hicks, Stephanie [1]
Wheeler, David A. [2]
Plon, Sharon E. [2,3]
Kimmel, Marek [1]
Affiliations
[1] Rice Univ, Dept Stat, Houston, TX 77005 USA
[2] Human Genome Sequencing Ctr, Houston, TX USA
[3] Baylor Coll Med, Dept Pediat, Texas Childrens Canc Ctr, Houston, TX 77030 USA
Keywords
multiple sequence alignment; SIFT; PolyPhen-2; Align-GVGD; Xvar; BRCA1; MSH2; MLH1; TP53; PROTEIN FUNCTION; VARIANTS; DISEASE; RECOMMENDATIONS; CLASSIFICATION; SUBSTITUTIONS; DATABASE; SERVER; P53
DOI
10.1002/humu.21490
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Multiple algorithms are used to predict the impact of missense mutations on protein structure and function, using either algorithm-generated sequence alignments or manually curated alignments. We compared the accuracy of SIFT, Align-GVGD, PolyPhen-2, and Xvar, each run with its native alignment, in predicting the functionality of well-characterized missense mutations (n = 267) within the BRCA1, MSH2, MLH1, and TP53 genes. We also evaluated how the choice of alignment affects predictions from these algorithms (except Xvar) by supplying each with the same four alignments: those automatically generated by (1) SIFT, (2) PolyPhen-2, and (3) UniProt, and (4) a manually curated alignment tuned for Align-GVGD. The alignments differ in sequence composition and evolutionary depth. Data-based receiver operating characteristic curves using the native alignment for each algorithm give an area under the curve of 78-79% for all four algorithms. Predictions from the PolyPhen-2 algorithm were least dependent on the alignment employed. In contrast, Align-GVGD predicts all variants as neutral when provided with alignments containing a large number of sequences. Of note, algorithms make different predictions for the same variants even when provided the same alignment, and they do not necessarily perform best using their own alignment. Thus, researchers should consider optimizing both the algorithm and the sequence alignment employed in missense prediction. Hum Mutat 32:661-668, 2011. © 2011 Wiley-Liss, Inc.
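The abstract summarizes each algorithm's accuracy as the area under a receiver operating characteristic (ROC) curve. As a rough illustration of that metric only, the sketch below computes AUC with the pairwise (Mann-Whitney) formulation: the fraction of deleterious/neutral variant pairs in which the deleterious variant receives the higher score. This is not the authors' code. The function name `roc_auc`, the labels, and both score lists are hypothetical toy values and are not taken from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the ROC AUC via pairwise comparison.

    labels: 1 = deleterious variant, 0 = neutral variant (assumed coding).
    scores: higher score = predicted more likely deleterious (assumed).
    Ties between a deleterious and a neutral score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one variant of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: the same six variants scored by one algorithm
# under two different alignments (values invented for illustration).
labels = [1, 1, 1, 0, 0, 0]
scores_alignment_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
scores_alignment_b = [0.9, 0.2, 0.4, 0.5, 0.3, 0.1]

auc_a = roc_auc(labels, scores_alignment_a)
auc_b = roc_auc(labels, scores_alignment_b)
print(f"AUC with alignment A: {auc_a:.3f}")
print(f"AUC with alignment B: {auc_b:.3f}")
```

In this toy case only one score changes between the two alignments, yet the AUC shifts, which is the kind of alignment dependence the abstract describes. For real data, an established routine such as scikit-learn's `roc_auc_score` would likely be a better choice than this quadratic-time loop.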
Pages: 661-668
Page count: 8