Evaluation of regulatory potential and conservation scores for detecting cis-regulatory modules in aligned mammalian genome sequences

Cited by: 164
Authors
King, DC
Taylor, J
Elnitski, L
Chiaromonte, F
Miller, W
Hardison, RC [1]
Affiliations
[1] Penn State Univ, Ctr Comparat Genom & Bioinformat, Huck Inst Life Sci, University Pk, PA 16802 USA
[2] Penn State Univ, Dept Biochem, University Pk, PA 16802 USA
[3] Penn State Univ, Dept Biol Mol, University Pk, PA 16802 USA
[4] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[5] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
[6] Penn State Univ, Dept Biol, University Pk, PA 16802 USA
[7] NHGRI, Rockville, MD 20852 USA
DOI
10.1101/gr.3642605
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Techniques of comparative genomics are being used to identify candidate functional DNA sequences, and objective evaluations are needed to assess their effectiveness. Different analytical methods score distinctive features of whole-genome alignments among human, mouse, and rat to predict functional regions. We evaluated three of these methods for their ability to identify the positions of known regulatory regions in the well-studied HBB gene complex. Two methods, multispecies conserved sequences and phastCons, quantify levels of conservation to estimate a likelihood that aligned DNA sequences are under purifying selection. A third function, regulatory potential (RP), measures the similarity of patterns in the alignments to those in known regulatory regions. The methods can correctly identify 50%-60% of noncoding positions in the HBB gene complex as regulatory or nonregulatory, with RP performing better than do other methods. When evaluated by the ability to discriminate genomic intervals, RP reaches a sensitivity of 0.78 and a true discovery rate of ~0.6. The performance is better on other reference sets; both phastCons and RP scores can capture almost all regulatory elements in those sets along with ~7% of the human genome.
Pages: 1051-1060
Page count: 10
Related papers
103 in total
[1]   ALLAN M, 1983, CELL, V35, P187
[2]   THE HUMAN BETA-GLOBIN GENE CONTAINS MULTIPLE REGULATORY REGIONS - IDENTIFICATION OF ONE PROMOTER AND 2 DOWNSTREAM ENHANCERS [J].
ANTONIOU, M ;
DEBOER, E ;
HABETS, G ;
GROSVELD, F .
EMBO JOURNAL, 1988, 7 (02) :377-384
[3]   2 3' SEQUENCES DIRECT ADULT ERYTHROID-SPECIFIC EXPRESSION OF HUMAN BETA-GLOBIN GENES IN TRANSGENIC MICE [J].
BEHRINGER, RR ;
HAMMER, RE ;
BRINSTER, RL ;
PALMITER, RD ;
TOWNES, TM .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1987, 84 (20) :7056-7060
[4]   Description and targeted deletion of 5′ hypersensitive site 5 and 6 of the mouse β-globin locus control region [J].
Bender, MA ;
Reik, A ;
Close, J ;
Telling, A ;
Epner, E ;
Fiering, S ;
Hardison, R ;
Groudine, M .
BLOOD, 1998, 92 (11) :4394-4403
[5]   Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura, art. no. R61 [J].
Berman, BP ;
Pfeiffer, BD ;
Laverty, TR ;
Salzberg, SL ;
Rubin, GM ;
Eisen, MB ;
Celniker, SE .
GENOME BIOLOGY, 2004, 5 (09)
[6]   Aligning multiple genomic sequences with the threaded blockset aligner [J].
Blanchette, M ;
Kent, WJ ;
Riemer, C ;
Elnitski, L ;
Smit, AFA ;
Roskin, KM ;
Baertsch, R ;
Rosenbloom, K ;
Clawson, H ;
Green, ED ;
Haussler, D ;
Miller, W .
GENOME RESEARCH, 2004, 14 (04) :708-715
[7]   Algorithms for phylogenetic footprinting [J].
Blanchette, M ;
Schwikowski, B ;
Tompa, M .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2002, 9 (02) :211-223
[8]   AN ENHANCER ELEMENT LIES 3' TO THE HUMAN A-GAMMA-GLOBIN GENE [J].
BODINE, DM ;
LEY, TJ .
EMBO JOURNAL, 1987, 6 (10) :2997-3004
[9]   Phylogenetic shadowing of primate sequences to find functional regions of the human genome [J].
Boffelli, D ;
McAuliffe, J ;
Ovcharenko, D ;
Lewis, KD ;
Ovcharenko, I ;
Pachter, L ;
Rubin, EM .
SCIENCE, 2003, 299 (5611) :1391-1394
[10]   Recent advances in gene structure prediction [J].
Brent, MR ;
Guigó, R .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 2004, 14 (03) :264-272