The distribution of GC nucleotides and regulatory sequence motifs in genes and their adjacent sequences

Cited by: 16
Authors
Jaksik, Roman [1]
Rzeszowska-Wolny, Joanna [1]
Affiliations
[1] Silesian Tech Univ, Inst Automat Control, Biosyst Grp, PL-44100 Gliwice, Poland
Keywords
Sequence analysis program; Transcription factor binding elements; ARE motifs; Motifs in coding sequences; Motifs in 3'-UTRs; GC distribution; RNA-PROTEIN INTERACTIONS; AU-RICH ELEMENT; MESSENGER-RNA; HUMAN GENOME; TRANSCRIPTION; INITIATION; ISOCHORES; DATABASE; BINDING; LOCALIZATION
DOI
10.1016/j.gene.2011.10.050
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007 [Genetics]
Abstract
The genomes of warm-blooded vertebrates are a mosaic of alternating fragments, isochores, with low and high GC contents and embedded genes. The evolutionary mechanisms leading to such structures are not fully understood. We have compared the distributions of GC base pairs in coding sequences and sequences spanning 5 kb upstream and downstream of genes in human and other species annotated in the RefSeq database and in different isochores of the human genome. Using our computer application NucleoSeq (available at www.bioinformatics.aei.polsl.pl), we also compared the average distributions of AT-rich regulatory motifs and transcription factor binding sites (TFBS) for single transcription factors with those in randomized sequences of the human genome, and revealed that some TFBS have a lower average frequency in a gene's promoter than in the randomized sequence, whereas for other transcription factors the opposite is observed. TFBS for some transcription factors show a higher frequency in the coding sequence than in the regulatory and in randomized sequences, suggesting their accumulation during evolution and possible functional roles. On the basis of the GC content in genes and their adjacent sequences which was similar in all species studied here, and the distribution of regulatory motifs, we hypothesize that the first step in evolution of many genes existing today was the joining of a GC-rich coding sequence to a region with a lower GC content and the potential to create regulatory motifs. (C) 2011 Elsevier B.V. All rights reserved.
Pages: 375-381
Page count: 7
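
The abstract describes two measurements: the GC content of a sequence, and the frequency of a motif compared with its frequency in randomized sequence. The sketch below illustrates both under stated assumptions and is not the NucleoSeq implementation. The function names are hypothetical, randomization is done by a simple base shuffle (which preserves only mononucleotide composition; the record does not say how the authors randomized), and the example motif ATTTA is the DNA form of the AUUUA core commonly associated with AU-rich elements.

```python
import random


def gc_fraction(seq):
    """Fraction of G and C among the A/C/G/T bases of seq (0.0 if none)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def count_motif(seq, motif):
    """Count occurrences of motif in seq, including overlapping ones."""
    seq, motif = seq.upper(), motif.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        # Advance by one base so overlapping matches are also found.
        start = seq.find(motif, start + 1)
    return count


def motif_vs_shuffled(seq, motif, n_shuffles=100, seed=0):
    """Return (observed count, mean count over shuffled copies of seq).

    Assumption: shuffling single bases is an adequate null model here;
    a real analysis may need to preserve higher-order composition.
    """
    rng = random.Random(seed)
    bases = list(seq.upper())
    total = 0
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        total += count_motif("".join(bases), motif)
    return count_motif(seq, motif), total / n_shuffles
```

An observed count well above the shuffled mean would suggest enrichment of the motif in that region, and a count well below it would suggest depletion, which is the kind of contrast the abstract reports for promoters and coding sequences.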