DHPC: A new tool to express genome structural features

Cited by: 8
Authors
Deng, Xuegong [1,2]
Deng, Xuemei [1]
Rayner, Simon [1,3]
Liu, Xiangdong [4]
Zhang, Qingling [2]
Yang, Yupu [5]
Li, Ning [1]
Affiliations
[1] China Agr Univ, Minist Agr, State Key Lab Agrobiotechnol, Beijing 100094, Peoples R China
[2] NE Univ, Sch Sci, Shenyang 110004, Peoples R China
[3] Wuhan Inst Virol, Hubei 430071, Peoples R China
[4] Dalian Natl Univ, Res Inst Nonlinear Informat Technol, Dalian 116600, Peoples R China
[5] Kaiyue Biosoftware Ltd, Shenyang 110004, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
genome visualization; interspersed repeat sequence; GC/AT skew; isochore; zebrafish; chicken; tetraodon; Hilbert-Peano curve; Gauss smooth; DHPC tool;
DOI
10.1016/j.ygeno.2008.01.003
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
The DHPC (DNA Hilbert-Peano curve) is a new tool for visualizing large-scale genome sequences by mapping them into a two-dimensional square. It exploits the space-filling property of the Hilbert-Peano mapping. By applying Gaussian smoothing and a user-defined color function, a large-scale genome sequence can be rendered as a two-dimensional color image. The resulting DHPCs reveal many genome characteristics. In this article we introduce the method and show how DHPCs may be used to identify regions of different base composition. The power of the method is demonstrated with multiple examples, including repeated sequences, degree of base bias, regions of homogeneity and their boundaries, and the marking of annotated segments. We also present several genome curves generated by DHPC to demonstrate how it can be used to find previously unidentified sequence features in these genomes. (c) 2008 Elsevier Inc. All rights reserved.
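The pipeline described in the abstract (space-filling mapping, smoothing, color function) can be illustrated with a minimal Python sketch. It is not the DHPC implementation: the function names, the choice of GC content as the per-base value, the smoothing applied along the sequence before mapping, and the `sigma` parameter are all assumptions made for illustration. Only the index-to-coordinate conversion follows the standard Hilbert curve construction.

```python
import math

def hilbert_d2xy(order, d):
    """Map position d along a Hilbert curve to (x, y) on a 2**order square."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:              # rotate the quadrant when needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def gaussian_smooth(values, sigma):
    """Gaussian-weighted average along the sequence (assumed smoothing step)."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    out = []
    for i in range(len(values)):
        num = den = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(values):   # renormalize at the sequence ends
                num += w * values[j]
                den += w
        out.append(num / den)
    return out

def gc_image(seq, order, sigma=2.0):
    """Return a 2**order square grid of smoothed GC values (hypothetical color input)."""
    n = 1 << order
    if len(seq) > n * n:
        raise ValueError("sequence does not fit in the square")
    signal = [1.0 if base in "GCgc" else 0.0 for base in seq]
    smooth = gaussian_smooth(signal, sigma)
    grid = [[None] * n for _ in range(n)]
    for d, value in enumerate(smooth):
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = value
    return grid

if __name__ == "__main__":
    for row in gc_image("GGGGCCCCAAAATTTT", order=2):
        print(" ".join(f"{v:.2f}" for v in row))
```

Because consecutive positions on the curve are always neighboring cells, a stretch of similar base composition lands in a compact patch of the image rather than a thin line, which is what makes regional features visible. A real color function would map each grid value to an RGB color; here the raw values are left as numbers.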
Pages: 476-483
Page count: 8