Close sequence comparisons are sufficient to identify human cis-regulatory elements

Cited by: 149
Authors
Prabhakar, Shyam [1]
Poulin, Francis
Shoukry, Malak
Afzal, Veena
Rubin, Edward M.
Couronne, Olivier
Pennacchio, Len A.
Affiliations
[1] Lawrence Berkeley Natl Lab, Genom Div, Berkeley, CA 94720 USA
[2] US Dept Energy Joint Genome Inst, Walnut Creek, CA 94598 USA
DOI
10.1101/gr.4717506
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification codes
071010; 081704
Abstract
Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons. To address this problem, we identified evolutionarily conserved noncoding regions in primate, mammalian, and more distant comparisons using a uniform approach (Gumby) that facilitates unbiased assessment of the impact of evolutionary distance on predictive power. We benchmarked computational predictions against previously identified cis-regulatory elements at diverse genomic loci and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using an in vivo enhancer assay in transgenic mice. Human regulatory elements were identified with acceptable sensitivity (53%-80%) and true-positive rate (27%-67%) by comparison with one to five other eutherian mammals or six other simian primates. More distant comparisons (marsupial, avian, amphibian, and fish) failed to identify many of the empirically defined functional noncoding elements. Our results highlight the practical utility of close sequence comparisons, and the loss of sensitivity entailed by more distant comparisons. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole-genome comparative analysis that explains most of the observations from empirical benchmarking. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for in vivo testing at embryonic time points.
Pages: 855-863
Page count: 9