Nonstationary Analysis of Coding and Noncoding Regions in Nucleotide Sequences

Cited by: 17
Authors
Bouaynaya, Nidhal [1]
Schonfeld, Dan [2]
Affiliations
[1] Univ Arkansas, Dept Syst Engn, Little Rock, AR 72204 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Chicago, IL 60607 USA
Keywords
AM-FM signals; empirical mode decomposition; evolutionary periodogram; Hilbert transform; long-range correlations; nonstationary time-series analysis
DOI
10.1109/JSTSP.2008.923852
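Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808 [Electrical engineering]; 0809 [Electronic science and technology]
Abstract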
Previous statistical analysis efforts of DNA sequences revealed that noncoding regions exhibit long-range power law correlations, whereas coding regions behave like random sequences or sustain short-range correlations. A great deal of debate on the presence or absence of long-range correlations in nucleotide sequences, and more specifically in coding regions, has ensued. These results were obtained using signal processing techniques for stationary signals and statistical tools for signals with slowly varying trends superimposed on stationary signals. However, it can be verified using statistical tests that genomic sequences are nonstationary and the nature of their nonstationarity varies and is often much more complex than a simple trend. In this paper, we will bring to bear new tools to analyze nonstationary signals that have emerged in the statistical and signal processing community over the past few years. The emergence of these new methods will be used to shed new light and help resolve the issues of i) the existence of long-range correlations in DNA sequences and ii) whether they are present in both coding and noncoding segments or only in the latter. It turns out that the statistical differences between coding and noncoding segments are much more subtle than previously thought using stationary analysis. In particular, both coding and noncoding sequences exhibit long-range correlations, as asserted by a $1/f^{\beta(n)}$ evolutionary (i.e., time-dependent) spectrum. However, we will use an index of randomness, which we derive from the Hilbert transform, to demonstrate that coding segments, although not random as previously suspected, are often "closer" to random sequences than noncoding segments. Moreover, we analytically justify the use of the Hilbert spectrum by proving that narrowband nonstationary signals result in a small demodulation error using the Hilbert transform.
Pages: 357-364
Page count: 8