ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data

Cited by: 10498
Authors
Wang, Kai [1]
Li, Mingyao [2]
Hakonarson, Hakon [1,3]
Affiliations
[1] Childrens Hosp Philadelphia, Ctr Appl Genom, Philadelphia, PA 19104 USA
[2] Univ Penn, Dept Biostat & Epidemiol, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Pediat, Philadelphia, PA 19104 USA
Keywords
SNPS; ASSOCIATION; GENOMES;
DOI
10.1093/nar/gkq603
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP. ANNOVAR can utilize annotation databases from the UCSC Genome Browser or any annotation data set conforming to Generic Feature Format version 3 (GFF3). We also illustrate a 'variants reduction' protocol on 4.7 million SNVs and indels from a human genome, including two causal mutations for Miller syndrome, a rare recessive disease. Through a stepwise procedure, we excluded variants that are unlikely to be causal, and identified 20 candidate genes including the causal gene. Using a desktop computer, ANNOVAR requires about 4 min to perform gene-based annotation and about 15 min to perform variants reduction on 4.7 million variants, making it practical to handle hundreds of human genomes in a day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.
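The stepwise 'variants reduction' idea in the abstract can be pictured as a chain of filters. The following is only a minimal sketch under stated assumptions, not ANNOVAR's implementation or command-line interface: the record fields (effect, conserved, in_1000g, in_dbsnp), the gene names, the order of the steps, and the recessive rule of requiring at least two remaining variants per gene are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical effect labels treated as protein-changing (an assumption).
PROTEIN_CHANGING = {"nonsynonymous", "frameshift", "splicing"}

# Hypothetical toy variant records; these are not ANNOVAR output columns.
variants = [
    {"id": "v1", "gene": "GENE_A", "effect": "nonsynonymous", "conserved": True, "in_1000g": False, "in_dbsnp": False},
    {"id": "v2", "gene": "GENE_A", "effect": "frameshift", "conserved": True, "in_1000g": False, "in_dbsnp": False},
    {"id": "v3", "gene": "GENE_B", "effect": "synonymous", "conserved": True, "in_1000g": False, "in_dbsnp": False},
    {"id": "v4", "gene": "GENE_C", "effect": "nonsynonymous", "conserved": True, "in_1000g": False, "in_dbsnp": True},
    {"id": "v5", "gene": "GENE_D", "effect": "nonsynonymous", "conserved": True, "in_1000g": False, "in_dbsnp": False},
]


def reduce_variants(records):
    """Apply illustrative filters in sequence and return candidate genes."""
    # Step 1: keep variants whose predicted effect changes the protein.
    kept = [v for v in records if v["effect"] in PROTEIN_CHANGING]
    # Step 2: keep variants that fall in conserved regions.
    kept = [v for v in kept if v["conserved"]]
    # Step 3: drop variants already reported in 1000 Genomes or dbSNP,
    # assuming a rare-disease mutation is unlikely to be catalogued there.
    kept = [v for v in kept if not (v["in_1000g"] or v["in_dbsnp"])]
    # Step 4: simplified recessive model, keeping genes with >= 2 remaining variants.
    by_gene = defaultdict(list)
    for v in kept:
        by_gene[v["gene"]].append(v["id"])
    return {gene: ids for gene, ids in by_gene.items() if len(ids) >= 2}


candidates = reduce_variants(variants)
print(candidates)
```

In this toy data only GENE_A survives every step; a real analysis would rely on ANNOVAR's own annotations and the filtering criteria described in the paper.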
Pages: 7