STATISTICAL ANALYSIS OF NUCLEOTIDE SEQUENCES

Cited by: 32
Authors
STUCKLE, EE
EMMRICH, C
GROB, U
NIELSEN, PJ
Affiliations
[1] MAX PLANCK INST IMMUNBIOL, STUBEWEG 51, W-7800 FREIBURG, GERMANY
[2] UNIV FREIBURG, FAK PHYS, W-7800 FREIBURG, GERMANY
DOI
10.1093/nar/18.22.6641
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
In order to scan nucleic acid databases for potentially relevant but as yet unknown signals, we have developed an improved statistical model for pattern analysis of nucleic acid sequences by modifying previous methods based on Markov chains. We demonstrate the importance of selecting the appropriate parameters in order for the method to function at all. The model allows the simultaneous analysis of several short sequences with unequal base frequencies and Markov order k≠0, as is usually the case in databases. As a test of these modifications, we show that in E. coli sequences there is a bias against palindromic hexamers which correspond to known restriction enzyme recognition sites. © 1990 Oxford University Press.
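To make the idea of a Markov-chain expectation concrete, the following is a minimal sketch and not the authors' model. It assumes the common textbook estimator in which the expected count of a word under an order-k chain is the product of its overlapping (k+1)-mer counts divided by the product of its interior k-mer counts, with counts pooled over several sequences and never spanning sequence boundaries. The function names (`count_words`, `expected_count`, `is_palindromic`) are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic(word):
    """True if the word equals its own reverse complement (e.g. GAATTC)."""
    word = word.upper()
    return word == word.translate(_COMPLEMENT)[::-1]


def count_words(seqs, length):
    """Count overlapping words of the given length, pooled over all sequences.

    Words are counted within each sequence separately, so no word spans
    the boundary between two sequences.
    """
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - length + 1):
            counts[seq[i:i + length]] += 1
    return counts


def expected_count(word, seqs, k):
    """Expected count of `word` under an order-k Markov estimate.

    Assumes 1 <= k <= len(word) - 2. Edge effects at sequence ends are
    ignored, which matters more when the sequences are very short.
    """
    word = word.upper()
    n = len(word)
    if not 1 <= k <= n - 2:
        raise ValueError("k must satisfy 1 <= k <= len(word) - 2")
    longer = count_words(seqs, k + 1)   # (k+1)-mer counts (numerator)
    shorter = count_words(seqs, k)      # k-mer counts (denominator)
    expected = 1.0
    for i in range(n - k):
        expected *= longer[word[i:i + k + 1]]
    for i in range(1, n - k):
        denom = shorter[word[i:i + k]]
        if denom == 0:
            # An unseen interior k-mer implies an unseen (k+1)-mer as well.
            return 0.0
        expected /= denom
    return expected
```

Under these assumptions, the observed count `count_words(seqs, len(word))[word]` can be compared with `expected_count(word, seqs, k)`; a ratio well below 1 for a palindromic hexamer would be the kind of under-representation the abstract describes, although a proper test would also need a variance estimate.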
Pages: 6641-6647
Page count: 7