PlantSat: a specialized database for plant satellite repeats

Cited by: 102
Authors
Macas, J [1]
Mészáros, T [1]
Nouzová, M [1]
Affiliations
[1] Inst Plant Mol Biol, Lab Mol Cytogenet, CZ-37005 Ceske Budejovice, Czech Republic
DOI
10.1093/bioinformatics/18.1.28
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Tandemly organized repetitive sequences (satellite DNA) are widespread in complex eukaryotic genomes. In plants, satellite repeats often represent a substantial part of nuclear DNA, but little is known about the molecular mechanisms of their amplification or their possible role(s) in genome evolution and function. Addressing these questions by characterizing general sequence properties of known satellite repeats has been hindered by the difficulty of obtaining a complete and unbiased set of sequence data for the analysis. This is mainly because the public databases contain multiple entries of homologous sequences, as well as single entries that contain more than one repeated unit (monomer).
Results: We have established a computer database specialized for plant satellite repeats (PlantSat) that integrates sequence data available from various resources with supplementary information, including repeat consensus sequences, abundances, and chromosomal localizations. The sequences are stored as individual repeat monomers grouped into families, which simplifies their computer analysis and makes it more accurate. Using this feature, we have performed a basic sequence analysis of the whole set of plant satellite repeats with respect to their monomer length and nucleotide composition. The analysis revealed several preferred length ranges of the monomers (approximately 165 bp and its multiples) and an over-representation of the AA/TT dinucleotide in the repeats. We have also detected an enrichment of satellite DNA sequences for the motif CAAAA, which is thought to be involved in breakage-reunion of repeated sequences.
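The kind of per-monomer analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method or the PlantSat software: the function names `dinucleotide_odds` and `count_motif` are hypothetical, the over-representation measure is assumed to be a simple observed-to-expected frequency ratio on one strand, and the input is assumed to be a plain monomer sequence string.

```python
from collections import Counter


def dinucleotide_odds(seq: str, dinuc: str = "AA") -> float:
    """Observed/expected frequency of a dinucleotide on one strand.

    Assumption for illustration: the expected frequency is the product
    of the two single-base frequencies. Values above 1 suggest
    over-representation.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return 0.0
    base_counts = Counter(seq)
    # Count overlapping occurrences among the n - 1 adjacent pairs.
    hits = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    observed = hits / (n - 1)
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n)
    return observed / expected if expected else 0.0


def count_motif(seq: str, motif: str = "CAAAA") -> int:
    """Count occurrences of a motif, allowing overlaps."""
    seq = seq.upper()
    m = len(motif)
    return sum(1 for i in range(len(seq) - m + 1) if seq[i:i + m] == motif)


if __name__ == "__main__":
    # Toy monomer, not taken from the database.
    monomer = "CAAAATTGCAAAACGTTAA"
    print(len(monomer))
    print(dinucleotide_odds(monomer, "AA"))
    print(count_motif(monomer))
```

Because AA on one strand corresponds to TT on the other, a combined AA/TT figure could be obtained by evaluating both dinucleotides on the same strand and summing the counts before normalizing.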
Pages: 28-35
Page count: 8