Biological applications of support vector machines

Cited by: 149
Authors
Yang, ZR [1]
Affiliations
[1] Univ Exeter, Dept Comp Sci, Exeter EX4 4PS, Devon, England
Keywords
support vector machines; sequence analysis; protein function annotation; protein functional site recognition
DOI
10.1093/bib/5.4.328
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
One of the major tasks in bioinformatics is the classification and prediction of biological data. With the rapid increase in the size of biological databanks, it is essential to use computer programs to automate the classification process. At present, the computer programs that give the best prediction performance are support vector machines (SVMs). This is because SVMs are designed to maximise the margin separating two classes, so that the trained model generalises well to unseen data. Most other computer programs implement a classifier by minimising the error incurred during training, which leads to poorer generalisation. Because of this, SVMs have been widely applied to many areas of bioinformatics, including protein function prediction, protease functional site recognition, transcription initiation site prediction and gene expression data classification. This paper will discuss the principles of SVMs and the applications of SVMs to the analysis of biological data, mainly protein and DNA sequences.
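The margin-maximisation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes a linear soft-margin SVM trained by plain subgradient descent on the regularised hinge loss. The toy DNA strings, the base-composition encoding and all names (`composition`, `train_linear_svm`, `predict`, `lam`, `lr`) are hypothetical choices made only for this example.

```python
# Minimal linear soft-margin SVM trained by subgradient descent on the
# regularised hinge loss. Toy data and hyperparameters are illustrative only.

def composition(seq):
    """Encode a DNA string as its A, C, G, T frequencies (a hypothetical feature choice)."""
    return [seq.count(base) / len(seq) for base in "ACGT"]

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Minimise lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        # The regulariser pulls w towards zero, which widens the margin.
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # point lies inside the margin or is misclassified
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return +1 or -1 according to the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical toy set: GC-rich strings labelled +1, AT-rich strings labelled -1.
sequences = ["GCGCGGCC", "CCGGGCGC", "GGCCGCGA", "ATATTAAT", "TTAATATA", "AATTATAC"]
labels = [1, 1, 1, -1, -1, -1]
X = [composition(s) for s in sequences]
w, b = train_linear_svm(X, labels)
```

Only points with a functional margin below 1 contribute to the data term of the gradient, so the final hyperplane is determined by the examples closest to the class boundary rather than by the total training error. Practical applications of the kind the abstract lists would normally use an established SVM library and a kernel suited to sequence data instead of this hand-rolled loop.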
Pages: 328-338
Page count: 11