Some statistical properties of regulatory DNA sequences, and their use in predicting regulatory regions in the Drosophila genome: the fluffy-tail test (art. no. 109)

Cited by: 27
Authors
Abnizova, I
te Boekhorst, R
Walter, K
Gilks, WR
Affiliations
[1] Inst Publ Hlth, MRC, Biostat Unit, Cambridge CB2 2SR, England
[2] Univ Hertfordshire, Dept Comp Sci, Hatfield AL10 92BA, Herts, England
DOI
10.1186/1471-2105-6-109
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: This paper addresses the problem of recognising DNA cis-regulatory modules which are located far from genes. Experimental procedures for this are slow and costly, and computational methods are hard, because they lack positional information. Results: We present a novel statistical method, the "fluffy-tail test", to recognise regulatory DNA. We exploit one of the basic informational properties of regulatory DNA: abundance of over-represented transcription factor binding site (TFBS) motifs, although we do not look for specific TFBS motifs, per se. Though over-representation of TFBS motifs in regulatory DNA has been intensively exploited by many algorithms, it is still a difficult problem to distinguish regulatory from other genomic DNA. Conclusion: We show that, in the data used, our method is able to distinguish cis-regulatory modules by exploiting statistical differences between the probability distributions of similar words in regulatory and other DNA. The potential application of our method includes annotation of new genomic sequences and motif discovery.
Pages: 12
Related papers
20 in total
[1]   Long-range correlations between DNA bending sites: Relation to the structure and dynamics of nucleosomes [J].
Audit, B ;
Vaillant, C ;
Arneodo, A ;
d'Aubenton-Carafa, Y ;
Thermes, C .
JOURNAL OF MOLECULAR BIOLOGY, 2002, 316 (04) :903-918
[2]   Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome [J].
Berman, BP ;
Nibu, Y ;
Pfeiffer, BD ;
Tomancak, P ;
Celniker, SE ;
Levine, M ;
Rubin, GM ;
Eisen, MB .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (02) :757-762
[3]   Algorithms for phylogenetic footprinting [J].
Blanchette, M ;
Schwikowski, B ;
Tompa, M .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2002, 9 (02) :211-223
[4]   Phylogenetic shadowing of primate sequences to find functional regions of the human genome [J].
Boffelli, D ;
McAuliffe, J ;
Ovcharenko, D ;
Lewis, KD ;
Ovcharenko, I ;
Pachter, L ;
Rubin, EM .
SCIENCE, 2003, 299 (5611) :1391-1394
[5]   Strategies and tools for whole-genome alignments [J].
Couronne, O ;
Poliakov, A ;
Bray, N ;
Ishkhanov, T ;
Ryaboy, D ;
Rubin, E ;
Pachter, L ;
Dubchak, I .
GENOME RESEARCH, 2003, 13 (01) :73-80
[6]   Davidson E. H., 2001, Genomic regulatory systems: development and evolution
[7]   Searching for regulatory elements in human noncoding sequences [J].
Duret, L ;
Bucher, P .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 1997, 7 (03) :399-406
[8]   Distinguishing regulatory DNA from neutral sites [J].
Elnitski, L ;
Hardison, RC ;
Li, J ;
Yang, S ;
Kolbe, D ;
Eswara, P ;
O'Connor, MJ ;
Schwartz, S ;
Miller, W ;
Chiaromonte, F .
GENOME RESEARCH, 2003, 13 (01) :64-72
[9]   Identification of functional clusters of transcription factor binding motifs in genome sequences: the MSCAN algorithm [J].
Johansson, Oe. ;
Alkema, W. ;
Wasserman, W. W. ;
Lagergren, J. .
BIOINFORMATICS, 2003, 19 :i169-i176
[10]   Homotypic regulatory clusters in Drosophila [J].
Lifanov, AP ;
Makeev, VJ ;
Nazina, AG ;
Papatsenko, DA .
GENOME RESEARCH, 2003, 13 (04) :579-588