WORDUP - AN EFFICIENT ALGORITHM FOR DISCOVERING STATISTICALLY SIGNIFICANT PATTERNS IN DNA-SEQUENCES

Cited by: 48
Authors
PESOLE, G [1 ]
PRUNELLA, N [1 ]
LIUNI, S [1 ]
ATTIMONELLI, M [1 ]
SACCONE, C [1 ]
Affiliations
[1] CNR, CTR STUDIO MITOCONDRI & METAB ENERGENT, I-70126 BARI, ITALY
DOI
10.1093/nar/20.11.2871
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We present here a fast and sensitive method designed to isolate short nucleotide sequences which have non-random statistical properties and may thus be biologically active. It is based on a first-order Markov analysis and detects sequence motifs, six to ten nucleotides long, that are significantly shared (or avoided) in the sequences under investigation. The method has been tested on a set of 521 sequences extracted from the Eukaryotic Promoter Database (2). Our results demonstrate the accuracy and efficiency of the method: the sequence motifs known to act as eukaryotic promoter elements, such as the TATA-box and the CAAT-box, were clearly identified. In addition, we found other statistically significant motifs whose biological roles are yet to be clarified.
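The idea in the abstract, comparing how often a short word is observed with how often a first-order Markov model predicts it, can be illustrated with a minimal sketch. This is not the authors' implementation and does not reproduce the WORDUP significance test. It assumes the standard first-order estimate (product of dinucleotide counts divided by the product of inner mononucleotide counts) and pools counts over all input sequences. The function names `count_overlapping`, `markov1_expected`, and `rank_words` are hypothetical.

```python
from collections import Counter


def count_overlapping(word, seq):
    """Count occurrences of `word` in `seq`, including overlapping ones."""
    k = len(word)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == word)


def markov1_expected(word, seqs):
    """Expected count of `word` under a first-order Markov model (assumed estimator).

    E(w1..wk) = N(w1w2) * N(w2w3) * ... * N(w[k-1]wk) / (N(w2) * ... * N(w[k-1]))
    where N(.) are mono- and dinucleotide counts pooled over `seqs`.
    """
    mono, di = Counter(), Counter()
    for s in seqs:
        mono.update(s)
        di.update(s[i:i + 2] for i in range(len(s) - 1))

    numerator = 1.0
    for i in range(len(word) - 1):
        numerator *= di[word[i:i + 2]]

    denominator = 1.0
    for base in word[1:-1]:
        denominator *= mono[base]

    # A zero denominator means an inner base never occurs, so the word cannot occur.
    return numerator / denominator if denominator else 0.0


def rank_words(seqs, k):
    """Return (word, observed, expected) for every k-mer seen, most over-represented first."""
    seen = {s[i:i + k] for s in seqs for i in range(len(s) - k + 1)}
    rows = []
    for word in seen:
        observed = sum(count_overlapping(word, s) for s in seqs)
        rows.append((word, observed, markov1_expected(word, seqs)))
    # Sort by observed/expected ratio; words with zero expectation sort last.
    return sorted(rows, key=lambda r: r[1] / r[2] if r[2] else 0.0, reverse=True)
```

A real analysis would add a significance test on top of the observed and expected values and would restrict `k` to the six-to-ten nucleotide range the abstract mentions; the sketch only shows the expectation step.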
Pages: 2871-2875
Page count: 5
Related papers
35 in total
[11]   DELETIONAL ANALYSIS OF THE PROMOTER REGION OF THE HUMAN TRANSFERRIN RECEPTOR GENE [J].
CASEY, JL ;
DIJESO, B ;
RAO, KK ;
ROUAULT, TA ;
KLAUSNER, RD ;
HARFORD, JB .
NUCLEIC ACIDS RESEARCH, 1988, 16 (02) :629-646
[12]   AUTOREGULATION OF PIT-1 GENE-EXPRESSION MEDIATED BY 2 CIS-ACTIVE PROMOTER ELEMENTS [J].
CHEN, RP ;
INGRAHAM, HA ;
TREACY, MN ;
ALBERT, VR ;
WILSON, L ;
ROSENFELD, MG .
NATURE, 1990, 346 (6284) :583-586
[13]   CELL-TYPE SPECIFIC PROTEIN-BINDING TO THE ENHANCER OF SIMIAN VIRUS-40 IN NUCLEAR EXTRACTS [J].
DAVIDSON, I ;
FROMENTAL, C ;
AUGEREAU, P ;
WILDEMAN, A ;
ZENKE, M ;
CHAMBON, P .
NATURE, 1986, 323 (6088) :544-548
[14]   A COMPREHENSIVE SET OF SEQUENCE-ANALYSIS PROGRAMS FOR THE VAX [J].
DEVEREUX, J ;
HAEBERLI, P ;
SMITHIES, O .
NUCLEIC ACIDS RESEARCH, 1984, 12 (01) :387-395
[15]   THE STRUCTURE AND EVOLUTION OF THE HUMAN BETA-GLOBIN GENE FAMILY [J].
EFSTRATIADIS, A ;
POSAKONY, JW ;
MANIATIS, T ;
LAWN, RM ;
OCONNELL, C ;
SPRITZ, RA ;
DERIEL, JK ;
FORGET, BG ;
WEISSMAN, SM ;
SLIGHTOM, JL ;
BLECHL, AE ;
SMITHIES, O ;
BARALLE, FE ;
SHOULDERS, CC ;
PROUDFOOT, NJ .
CELL, 1980, 21 (03) :653-668
[16]   THE REPEATED GC-RICH MOTIFS UPSTREAM FROM THE TATA BOX ARE IMPORTANT ELEMENTS OF THE SV40 EARLY PROMOTER [J].
EVERETT, RD ;
BATY, D ;
CHAMBON, P .
NUCLEIC ACIDS RESEARCH, 1983, 11 (08) :2447-2464
[17]   RIGOROUS PATTERN-RECOGNITION METHODS FOR DNA-SEQUENCES - ANALYSIS OF PROMOTER SEQUENCES FROM ESCHERICHIA-COLI [J].
GALAS, DJ ;
EGGERT, M ;
WATERMAN, MS .
JOURNAL OF MOLECULAR BIOLOGY, 1985, 186 (01) :117-128
[18]   SQUIRREL - SEQUENCE QUERY, INFORMATION-RETRIEVAL AND REPORTING LIBRARY - A PROGRAM PACKAGE FOR ANALYZING SIGNALS IN NUCLEIC-ACID SEQUENCES FOR THE VAX [J].
GARTMANN, CJ ;
GROB, U .
NUCLEIC ACIDS RESEARCH, 1991, 19 (21) :6033-6040
[19]   A RELATIONAL DATABASE OF TRANSCRIPTION FACTORS [J].
GHOSH, D .
NUCLEIC ACIDS RESEARCH, 1990, 18 (07) :1749-1756
[20]  
GOUY M, 1985, COMPUT APPL BIOSCI, V1, P167