Short fuzzy tandem repeats in genomic sequences, identification, and possible role in regulation of gene expression

Cited by: 63
Authors
Boeva, V [1]
Regnier, M
Papatsenko, D
Makeev, V
Affiliations
[1] Moscow MV Lomonosov State Univ, Dept Bioengn & Bioinformat, Moscow, Russia
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
[3] State Res Ctr GosNIIGenet, Moscow, Russia
[4] Russian Acad Sci, VA Engelhardt Mol Biol Inst, Moscow, Russia
DOI
10.1093/bioinformatics/btk032
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Genomic sequences are highly redundant and contain many types of repetitive DNA. Fuzzy tandem repeats (FTRs) are of particular interest. They are found in regulatory regions of eukaryotic genes and are reported to interact with transcription factors. However, accurate assessment of FTR occurrences in different genome segments requires a specific algorithm for efficient FTR identification and classification. Results: We have obtained formulas for P-values of FTR occurrence and developed an FTR identification algorithm implemented in the TandemSWAN software. Using TandemSWAN, we compared the structure and occurrence of FTRs with short period length (up to 24 bp) in coding and non-coding regions, including UTRs, heterochromatic, intergenic and enhancer sequences, of Drosophila melanogaster and Drosophila pseudoobscura. Tandems with period three and its multiples were found in coding segments, whereas FTRs with periods that are multiples of six are overrepresented in all non-coding segments. Periods of 5-7 and 11-14 were characteristic of the enhancer regions and other non-coding regions close to genes.
Pages: 676-684
Page count: 9