Prediction of methylated CpGs in DNA sequences using a support vector machine

Cited by: 98
Authors
Bhasin, M
Zhang, H
Reinherz, EL
Reche, PA
Affiliations
[1] Harvard Univ, Sch Med, Dana Farber Canc Inst, Immunobiol Lab, Boston, MA 02115 USA
[2] Harvard Univ, Sch Med, Dana Farber Canc Inst, Dept Med Oncol, Boston, MA 02115 USA
[3] Harvard Univ, Sch Med, Dept Med, Boston, MA 02115 USA
Funding
National Institutes of Health (US);
Keywords
DNA; CpG; methylation; support vector machine; prediction;
DOI
10.1016/j.febslet.2005.07.002
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
DNA methylation plays a key role in the regulation of gene expression. The most common type of DNA modification is the methylation of cytosine in the CpG dinucleotide. At present, there is no method available for the prediction of DNA methylation sites. Therefore, in this study we developed a support vector machine (SVM)-based method for the prediction of cytosine methylation in CpG dinucleotides. Initially, an SVM module was developed from human data for the prediction of human-specific methylation sites. This module achieved an MCC of 0.501 and an AUC of 0.814 when evaluated using 5-fold cross-validation. The SVM-based module performed better than classifiers built using alternative machine learning and statistical algorithms, including artificial neural networks, Bayesian statistics, and decision trees. Additional SVM modules were developed based on mammalian- and vertebrate-specific methylation patterns. The SVM module based on human methylation patterns was used for genome-wide analysis of methylation sites. This analysis showed that the percentage of methylated CpGs is higher in UTRs than in exonic and intronic regions of human genes. The method is available online for public use under the name Methylator at http://bio.dfci.harvard.edu/Methylator/. (c) 2005 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Pages: 4302-4308
Page count: 7
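The abstract reports an MCC (Matthews correlation coefficient) and an AUC (area under the ROC curve) obtained with 5-fold cross-validation, but this record does not reproduce the formulas or the evaluation code. The following is a minimal sketch of the standard definitions of those quantities, using only the Python standard library. The function names (`mcc`, `auc`, `k_fold_indices`) and the interleaved fold assignment are illustrative assumptions, not the authors' implementation.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k-fold cross-validation.

    Assumption: samples are assigned to folds in an interleaved fashion;
    the source does not say how the folds were built.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

In a cross-validated evaluation, a classifier would be trained on each `train` split and scored on the matching `test` split, with `mcc` and `auc` then computed from the held-out predictions.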