CpG island mapping by epigenome prediction

Cited by: 144
Authors
Bock, Christoph [1]
Walter, Joern
Paulsen, Martina
Lengauer, Thomas
Affiliations
[1] Max Planck Inst Informat, Saarbrucken, Germany
[2] Univ Saarland, D-6600 Saarbrucken, Germany
DOI
10.1371/journal.pcbi.0030110
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
CpG islands were originally identified by epigenetic and functional properties, namely, absence of DNA methylation and frequent promoter association. However, this concept was quickly replaced by simple DNA sequence criteria, which allowed for genome-wide annotation of CpG islands in the absence of large-scale epigenetic datasets. Although widely used, the current CpG island criteria incur significant disadvantages: (1) reliance on arbitrary threshold parameters that bear little biological justification, (2) failure to account for widespread heterogeneity among CpG islands, and (3) apparent lack of specificity when applied to the human genome. This study is driven by the idea that a quantitative score of "CpG island strength" that incorporates epigenetic and functional aspects can help resolve these issues. We construct an epigenome prediction pipeline that links the DNA sequence of CpG islands to their epigenetic states, including DNA methylation, histone modifications, and chromatin accessibility. By training support vector machines on epigenetic data for CpG islands on human Chromosomes 21 and 22, we identify informative DNA attributes that correlate with open versus compact chromatin structures. These DNA attributes are used to predict the epigenetic states of all CpG islands genome-wide. Combining predictions for multiple epigenetic features, we estimate the inherent CpG island strength for each CpG island in the human genome, i.e., its inherent tendency to exhibit an open and transcriptionally competent chromatin structure. We extensively validate our results on independent datasets, showing that the CpG island strength predictions are applicable and informative across different tissues and cell types, and we derive improved maps of predicted "bona fide" CpG islands.
The mapping of CpG islands by epigenome prediction is conceptually superior to identifying CpG islands by widely used sequence criteria, since it links CpG island detection to their characteristic epigenetic and functional states. It is also superior to purely experimental epigenome mapping for CpG island detection, since it abstracts from specific properties that are limited to a single cell type or tissue. In addition, using computational epigenetics methods, we identified a high correlation between the epigenome and characteristics of the DNA sequence, a finding that emphasizes the need for a better understanding of the mechanistic links between genome and epigenome.
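For orientation, the "simple DNA sequence criteria" that the abstract contrasts with epigenome prediction can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and the default thresholds (length of at least 200 bp, GC content of at least 50%, observed/expected CpG ratio of at least 0.6) are the commonly cited Gardiner-Garden and Frommer values, assumed here because the record itself does not state them.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides; "CG" cannot overlap itself
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were distributed independently.
    expected = c * g / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return gc_content, obs_exp


def meets_sequence_criteria(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Threshold test; the default cut-offs are assumed, not from this record."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content >= min_gc and obs_exp >= min_obs_exp
```

Every cut-off in this sketch is a fixed parameter and the result is a yes/no call, which is the kind of threshold dependence the abstract criticizes and that a quantitative CpG island strength score is meant to replace.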
Pages: 1055-1070
Page count: 16