Detecting microsatellites within genomes: significant variation among algorithms

Cited by: 65
Authors
Leclercq, Sebastien
Rivals, Eric
Jarne, Philippe
Affiliations
[1] Univ Montpellier 2, CNRS, UMR 5506, LIRMM, Montpellier, France
[2] Univ Montpellier 2, CNRS, UMR 5175, CEFE, Montpellier, France
DOI
10.1186/1471-2105-8-125
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Microsatellites are short, tandemly repeated DNA sequences that are widely distributed among genomes. Their structure, role and evolution can be analyzed based on exhaustive extraction from sequenced genomes. Several dedicated algorithms have been developed for this purpose. Here, we compared the detection efficiency of five of them (TRF, Mreps, Sputnik, STAR, and RepeatMasker). Results: Our analysis was first conducted on the human X chromosome, and microsatellite distributions were characterized by microsatellite number, length, and divergence from a pure motif. The algorithms work with user-defined parameters, and we demonstrate that the parameter values chosen can strongly influence microsatellite distributions. The five algorithms were then compared by fixing parameter settings, and the analysis was extended to three other genomes (Saccharomyces cerevisiae, Neurospora crassa and Drosophila melanogaster) spanning a wide range of size and structure. Significant differences for all characteristics of microsatellites were observed among algorithms, but not among genomes, for both perfect and imperfect microsatellites. Striking differences were detected for short microsatellites (below 20 bp), regardless of motif. Conclusion: Since the algorithm used strongly influences empirical distributions, studies analyzing microsatellite evolution based on a comparison between empirical and theoretical size distributions should be considered with caution. We also discuss why a typological definition of microsatellites limits our capacity to capture their genomic distributions.
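The abstract's point that user-defined parameters shape the detected distribution can be illustrated with a minimal sketch. It is not the method of TRF, Mreps, Sputnik, STAR, or RepeatMasker: it finds only perfect repeats, and the function names, the 1-6 bp motif range, and the 12 bp default threshold are assumptions chosen for illustration.

```python
def is_degenerate(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'ACAC' = 'AC' x 2)."""
    k = len(motif)
    return any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0)


def find_perfect_microsatellites(seq, max_motif=6, min_length=12):
    """Return sorted (start, end, motif, length) tuples for perfect tandem repeats.

    Hypothetical helper for illustration only. A repeat may end with a
    partial copy of its motif; imperfect repeats are not handled.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for k in range(1, max_motif + 1):
        i = 0
        while i + k <= n:
            # Extend to the right while the sequence stays periodic with period k.
            j = i + k
            while j < n and seq[j] == seq[j - k]:
                j += 1
            motif = seq[i:i + k]
            length = j - i
            if (length >= max(min_length, 2 * k)
                    and not is_degenerate(motif)
                    and set(motif) <= set("ACGT")):
                hits.append((i, j, motif, length))
            # Any start between i and j - k would end at the same j, so skip ahead.
            i = max(i + 1, j - k + 1)
    return sorted(hits)


if __name__ == "__main__":
    demo = "A" * 12 + "CACACACACAC"
    for threshold in (8, 12, 13):
        found = find_perfect_microsatellites(demo, min_length=threshold)
        print(threshold, len(found))
```

Raising `min_length` by a single base pair in the demo removes every hit, which is a small-scale version of the sensitivity to thresholds that the abstract reports for short microsatellites.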
Pages: 18