PolyMAPr: Programs for polymorphism database mining, annotation, and functional analysis

Cited by: 25
Authors
Freimuth, RR
Stormo, GD
McLeod, HL
Affiliations
[1] Washington Univ, Sch Med, Dept Med, St Louis, MO 63110 USA
[2] Washington Univ, Sch Med, Dept Genet, St Louis, MO 63110 USA
Keywords
database mining; polymorphism; SNP; annotation; pharmacogenomics; pharmacogenetics
DOI
10.1002/humu.20123
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Pharmacogenomic and disease-association studies rely on identifying a comprehensive set of polymorphisms within candidate genes. Public SNP databases are a rich source of polymorphism data, but mining them effectively requires overcoming at least four challenges: ensuring accurate annotations for genes and polymorphisms, eliminating both inter- and intra-database redundancy, integrating data from multiple public sources with data generated locally, and prioritizing the variants for further study. PolyMAPr (Polymorphism Mining and Annotation Programs) was developed to overcome these challenges and to improve the efficiency of database mining and polymorphism annotation. PolyMAPr takes as input a file containing a list of genes to be processed and files containing each annotated gene sequence. Polymorphic sequences obtained from public databases (dbSNP, CGAP, and JSNP) or through local SNP discovery efforts, as well as oligonucleotide sequences (e.g., PCR primers), are mapped to the annotated gene sequences and named according to suggested nomenclature guidelines. The functional effects of nonsynonymous coding region SNPs (cSNPs) and any variants that might alter exon splicing enhancer (ESE) sites, putative transcription factor binding sites, or intron-exon splice sites are predicted. The output files are accessible through a browser interface. The results are also provided in Extensible Markup Language (XML) format to facilitate uploading them into a local relational database. PolyMAPr increases the efficiency of mining public databases for genetic variants within candidate genes and provides a mechanism by which data from multiple sources (both public and private) can be uniformly integrated, thereby significantly reducing the effort required to obtain a comprehensive set of polymorphisms for pharmacogenomic and disease-association studies. PolyMAPr can be obtained from http://pharmacogenomics.wustl.edu. © 2005 Wiley-Liss, Inc.
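The mapping, naming, and cSNP classification steps described in the abstract can be illustrated with a minimal Python sketch. It is not PolyMAPr's implementation: the function names `map_snp` and `classify_csnp` are hypothetical, exact flank matching stands in for whatever alignment method the programs actually use, the input is assumed to be a coding sequence that starts at the ATG, and the `c.6C>T` naming style is assumed to be what "suggested nomenclature guidelines" refers to.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def map_snp(cds, left_flank, right_flank):
    """Locate a variant by exact match of its (non-empty) flanking sequences.

    Returns the 0-based index of the variant base in `cds`, or None if the
    flanks do not match anywhere. Exact matching is an illustrative assumption.
    """
    start = 0
    while True:
        hit = cds.find(left_flank, start)
        if hit == -1:
            return None
        pos = hit + len(left_flank)
        if cds.startswith(right_flank, pos + 1):
            return pos
        start = hit + 1


def classify_csnp(cds, pos, alt):
    """Name a coding SNP and report whether it changes the encoded amino acid."""
    offset = pos % 3                      # position of the variant within its codon
    codon_start = pos - offset
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    name = f"c.{pos + 1}{cds[pos]}>{alt}"  # 1-based coding position
    return name, kind, ref_aa, alt_aa


# Toy coding sequence (Met-Ala-Glu-stop); not taken from any real gene.
cds = "ATGGCCGAATGA"
silent = classify_csnp(cds, map_snp(cds, "ATGGC", "GAATG"), "T")
missense = classify_csnp(cds, map_snp(cds, "ATGG", "CGAA"), "T")
print(silent)    # ('c.6C>T', 'synonymous', 'A', 'A')
print(missense)  # ('c.5C>T', 'nonsynonymous', 'A', 'V')
```

A real pipeline would also need to handle reverse-strand flanks, mismatches within the flanks, and non-coding regions, none of which this sketch attempts.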
Pages: 110-117
Page count: 8