Short tandem repeats in human exons: A target for disease mutations

Cited by: 35
Authors
Madsen, Bo Eskerod [1]
Villesen, Palle [1]
Wiuf, Carsten [1]
Affiliations
[1] Univ Aarhus, Bioinformat Res Ctr, DK-8000 Aarhus C, Denmark
DOI
10.1186/1471-2164-9-410
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: In recent years it has been demonstrated that structural variations, such as indels (insertions and deletions), are common throughout the genome, but their implications are still not clearly understood. Long tandem repeats (e.g., microsatellites or simple repeats) are known to be hypermutable (indel-rich), but they are rare in exons and only occasionally associated with disease. Here we focus on short (imperfect) tandem repeats (STRs), which fall below the radar of conventional tandem repeat detection, and investigate whether STRs are targets for disease-related mutations in human exons. In particular, we test whether they share the hypermutability of the longer tandem repeats and whether disease-related genes have a higher STR content than non-disease-related genes.
Results: We show that validated human indels are extremely common in STR regions compared to non-STR regions. In contrast to longer tandem repeats, STRs as defined here are present in the exons of most known human genes (92%). Of all STR sequences in exons, 99% are shorter than 33 base pairs, and 62% of all STR sequences are imperfect repeats. We also demonstrate that STRs are significantly overrepresented in disease-related genes in both human and mouse. These results are preserved when we limit the analysis to STRs outside known longer tandem repeats.
Conclusion: Based on our findings, we conclude that STRs represent hypermutable regions in the human genome that are linked to human disease. In addition, STRs constitute an obvious target when screening for rare mutations, because of the relatively small amount of STR sequence in exons (1,973,844 bp) and the limited length of STR regions.
Pages: 9