Gradient-based optimization of kernel-target alignment for sequence kernels applied to bacterial gene start detection

Cited by: 14
Authors
Igel, Christian
Glasmachers, Tobias
Mersch, Britta
Pfeifer, Nico
Meinicke, Peter
Affiliations
[1] Ruhr Univ Bochum, Inst Neuroinformat, D-44780 Bochum, Germany
[2] German Canc Res Ctr, D-69120 Heidelberg, Germany
[3] Univ Tubingen, Abt Simulat Biol Syst, Inst Informat, D-72076 Tubingen, Germany
[4] Univ Gottingen, Abt Bioinformat, Inst Mikrobiol & Genet, D-37077 Gottingen, Germany
Keywords
sequence analysis; oligo kernel; translation initiation sites; model selection; kernel target alignment; support vector machines
DOI
10.1109/tcbb.2007.070208
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Biological data mining using kernel methods can be improved by a task-specific choice of the kernel function. Oligo kernels for genomic sequence analysis have proven to have a high discriminative power and to provide interpretable results. Oligo kernels that consider subsequences of different lengths can be combined and parameterized to increase their flexibility. For adapting these parameters efficiently, gradient-based optimization of the kernel-target alignment is proposed. The power of this new, general model selection procedure and the benefits of fitting kernels to problem classes are demonstrated by adapting oligo kernels for bacterial gene start detection.
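As an illustration of the idea in the abstract, the kernel-target alignment of a kernel matrix K with labels y in {-1, +1} is commonly defined as A = <K, yy^T>_F / (||K||_F ||yy^T||_F), and it can be differentiated with respect to a kernel parameter. The sketch below is not the authors' implementation: it assumes a Gaussian kernel on numeric feature vectors as a stand-in for the oligo kernel, and the function names (`gaussian_kernel`, `alignment_and_grad`, `fit_sigma`), the step size, and the iteration count are all hypothetical choices.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Return the Gaussian kernel matrix K and its derivative dK/dsigma.
    (Stand-in kernel; the paper itself adapts oligo kernels.)"""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2.0 * sigma ** 2))
    dK = K * sq / sigma ** 3  # chain rule applied to the exponent
    return K, dK

def alignment_and_grad(K, dK, y):
    """Kernel-target alignment A(K, y y^T) and its derivative with
    respect to the parameter that dK was taken with respect to."""
    Y = np.outer(y, y)
    ky = np.sum(K * Y)            # Frobenius inner product <K, Y>
    kk = np.sqrt(np.sum(K * K))   # ||K||_F
    yy = np.sqrt(np.sum(Y * Y))   # ||Y||_F (equals n for +/-1 labels)
    A = ky / (kk * yy)
    dA = np.sum(dK * Y) / (kk * yy) - ky * np.sum(K * dK) / (kk ** 3 * yy)
    return A, dA

def fit_sigma(X, y, sigma=1.0, lr=0.5, steps=100):
    """Plain gradient ascent on the alignment (illustrative settings only)."""
    for _ in range(steps):
        K, dK = gaussian_kernel(X, sigma)
        _, dA = alignment_and_grad(K, dK, y)
        sigma = max(sigma + lr * dA, 1e-3)  # keep the width positive
    return sigma
```

Because the alignment only needs the kernel matrix and its parameter derivative, the same `alignment_and_grad` routine would apply to any differentiable kernel parameterization, which is what makes the criterion cheap to optimize compared with repeatedly training a classifier.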
Pages: 216-226
Number of pages: 11