Sequence alignment kernel for recognition of promoter regions

Cited by: 97
Authors
Gordon, L [1]
Chervonenkis, AY
Gammerman, AJ
Shahmuradov, IA
Solovyev, VV
Affiliations
[1] Univ London, Royal Holloway, Dept Comp Sci, Egham TW20 0EX, Surrey, England
[2] Inst Control Sci, Moscow, Russia
[3] Softberry Inc, Mt Kisco, NY 10549 USA
Funding
Engineering and Physical Sciences Research Council (UK); Biotechnology and Biological Sciences Research Council (UK)
Keywords
DOI
10.1093/bioinformatics/btg265
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
In this paper we propose a new method for recognition of prokaryotic promoter regions with startpoints of transcription. The method is based on the Sequence Alignment Kernel, a function that gives a quantitative measure of the match between two sequences. This kernel function is then used in a Dual SVM, which performs the recognition. Several recognition methods were trained and tested on a positive data set of 669 sigma(70)-promoter regions of Escherichia coli with known transcription startpoints, and on two negative data sets of 709 examples each, taken from coding and non-coding regions of the same genome. The results show that our method achieves a 16.5% average error rate on the positive and coding negative data and an 18.6% average error rate on the positive and non-coding negative data.
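
The abstract does not define the Sequence Alignment Kernel itself, so the following is only a minimal sketch of the general idea: score the match between two sequences by alignment, then collect pairwise scores into a Gram matrix of the kind a dual (kernelized) SVM consumes. It substitutes a plain Needleman-Wunsch global alignment score for the paper's kernel. The function names `alignment_score` and `gram_matrix` and the scoring parameters `match`, `mismatch`, and `gap` are assumptions made for illustration, not values from the paper, and a raw alignment score is not guaranteed to be a valid positive semidefinite kernel.

```python
def alignment_score(a, b, match=1.0, mismatch=-1.0, gap=-2.0):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Illustrative stand-in only; the scoring values are hypothetical.
    """
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against the empty prefix of b
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


def gram_matrix(seqs, score=alignment_score):
    """Pairwise similarity matrix over a list of sequences."""
    return [[score(s, t) for t in seqs] for s in seqs]


if __name__ == "__main__":
    toy = ["TATAAT", "TATGAT", "GGCGCC"]  # toy sequences, not data from the paper
    for row in gram_matrix(toy):
        print(row)
```

A matrix like this could be passed to an SVM implementation that accepts precomputed kernels, provided the similarity is first checked or adjusted so that it behaves as a valid kernel.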
Pages: 1964-1971
Page count: 8