Tandem repeats finder: a program to analyze DNA sequences

Cited by: 6448
Authors
Benson, G [1]
Affiliations
[1] CUNY Mt Sinai Sch Med, Dept Biomath Sci, New York, NY 10029 USA
Funding
National Science Foundation (USA)
Keywords
DOI
10.1093/nar/27.2.573
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
A tandem repeat in DNA is two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats have been shown to cause human disease, may play a variety of regulatory and evolutionary roles and are important laboratory and analytic tools. Extensive knowledge about pattern size, copy number, mutational history, etc. for tandem repeats has been limited by the inability to easily detect them in genomic sequence data. In this paper, we present a new algorithm for finding tandem repeats which works without the need to specify either the pattern or pattern size. We model tandem repeats by percent identity and frequency of indels between adjacent pattern copies and use statistically based recognition criteria. We demonstrate the algorithm's speed and its ability to detect tandem repeats that have undergone extensive mutational change by analyzing four sequences: the human frataxin gene, the human beta T cell receptor locus sequence and two yeast chromosomes. These sequences range in size from 3 kb up to 700 kb. A World Wide Web server interface at c3.biomath.mssm.edu/trf.html has been established for automated use of the program.
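To make the "percent identity between adjacent pattern copies" idea concrete, here is a minimal Python sketch. It is not the algorithm of the paper: it assumes the start position, pattern size and copy number are already known (the paper's method needs none of these) and it ignores indels entirely. The function names and the example sequence are hypothetical, chosen only for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length copies agree (no indels)."""
    if len(a) != len(b) or not a:
        raise ValueError("copies must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def adjacent_copy_identities(seq: str, start: int, period: int, copies: int) -> list:
    """Percent identity for each pair of adjacent copies of a candidate repeat.

    Assumes (hypothetically) that the repeat begins at `start`, has a fixed
    pattern size `period`, and spans `copies` contiguous copies.
    """
    units = [seq[start + i * period: start + (i + 1) * period] for i in range(copies)]
    if any(len(u) != period for u in units):
        raise ValueError("candidate repeat runs past the end of the sequence")
    return [percent_identity(u, v) for u, v in zip(units, units[1:])]


# Illustrative sequence: four approximate copies of GAA starting at index 2,
# with one substitution (GAC) in the third copy.
example = "TTGAAGAAGACGAATT"
identities = adjacent_copy_identities(example, start=2, period=3, copies=4)
print(identities)
```

In this toy input the first pair of copies is identical and the two pairs involving the mutated copy each agree at two of three positions. A detector in the spirit of the abstract would additionally have to discover the pattern size itself and account for insertions and deletions.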
Pages: 573-580
Page count: 8
Related papers
37 in total
  • [1] ON DISCRETE-DISTRIBUTIONS OF ORDER-K
    AKI, S
    KUBOKI, H
    HIRANO, K
    [J]. ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 1984, 36 (03) : 431 - 440
  • [2] BASIC LOCAL ALIGNMENT SEARCH TOOL
    ALTSCHUL, SF
    GISH, W
    MILLER, W
    MYERS, EW
    LIPMAN, DJ
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) : 403 - 410
  • [3] Minisatellite diversity supports a recent African origin for modern humans
    Armour, JAL
    Anttinen, T
    May, CA
    Vega, EE
    Sajantila, A
    Kidd, JR
    Kidd, KK
    Bertranpetit, J
    Paabo, S
    Jeffreys, AJ
    [J]. NATURE GENETICS, 1996, 13 (02) : 154 - 160
  • [4] On the distribution of K-tuple matches for sequence homology: A constant time exact calculation of the variance
    Benson, G
    Su, XP
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 1998, 5 (01) : 87 - 100
  • [5] A SPACE EFFICIENT ALGORITHM FOR FINDING THE BEST NONOVERLAPPING ALIGNMENT SCORE
    BENSON, G
    [J]. THEORETICAL COMPUTER SCIENCE, 1995, 145 (1-2) : 357 - 369
  • [6] A METHOD FOR FAST DATABASE SEARCH FOR ALL K-NUCLEOTIDE REPEATS
    BENSON, G
    WATERMAN, MS
    [J]. NUCLEIC ACIDS RESEARCH, 1994, 22 (22) : 4828 - 4836
  • [7] Sequence alignment with tandem duplication
    Benson, G
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 1997, 4 (03) : 351 - 367
  • [8] Friedreich's ataxia: Autosomal recessive disease caused by an intronic GAA triplet repeat expansion
    Campuzano, V
    Montermini, L
    Molto, MD
    Pianese, L
    Cossee, M
    Cavalcanti, F
    Monros, E
    Rodius, F
    Duclos, F
    Monticelli, A
    Zara, F
    Canizares, J
    Koutnikova, H
    Bidichandani, SI
    Gellera, C
    Brice, A
    Trouillas, P
    DeMichele, G
    Filla, A
    DeFrutos, R
    Palau, F
    Patel, PI
    DiDonato, S
    Mandel, JL
    Cocozza, S
    Koenig, M
    Pandolfo, M
    [J]. SCIENCE, 1996, 271 (5254) : 1423 - 1427
  • [9] Analysis of immunoglobulin S gamma 3 recombination breakpoints by PCR: implications for the mechanism of isotype switching
    Du, J
    Zhu, Y
    Shanmugam, A
    Kenter, AL
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (15) : 3066 - 3073
  • [10] GENETIC-VARIATION AT 5 TRIMERIC AND TETRAMERIC TANDEM REPEAT LOCI IN 4 HUMAN-POPULATION GROUPS
    EDWARDS, A
    HAMMOND, HA
    JIN, L
    CASKEY, CT
    CHAKRABORTY, R
    [J]. GENOMICS, 1992, 12 (02) : 241 - 253