Finding pathogenicity islands and gene transfer events in genome data

Times cited: 53
Authors
Liò, P
Vannucci, M
Affiliations
[1] Univ Cambridge, Dept Zool, Cambridge CB2 3EH, England
[2] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
Funding
Engineering and Physical Sciences Research Council (UK);
Keywords
DOI
10.1093/bioinformatics/16.10.932
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: There is a growing literature on wavelet theory and wavelet methods showing improvements over more classical techniques, especially for smoothing signals and extracting their fundamental components. G+C patterns occur at different lengths (scales), and for this reason G+C plots are usually difficult to interpret. Current methods for genome analysis choose a window size and compute a chi-square statistic comparing the average value in each window with that of the whole genome.
Results: First, wavelets are used to smooth G+C profiles in order to locate characteristic patterns in genome sequences. The method is based on computing a chi-square statistic on the wavelet coefficients of a profile; because the smoothing occurs at a set of different scales, no fixed window size has to be chosen. Second, a wavelet scalogram is used as a measure for sequence profile comparison; this tool is very general and can be applied to other sequence profiles commonly used in genome analysis. We show applications to the analysis of Deinococcus radiodurans chromosome I, of two strains of Helicobacter pylori (26695, J99) and of two strains of Neisseria meningitidis (serogroup B strain MC58 and serogroup A strain Z2491). We report a list of loci whose G+C content differs from that of the nearby regions; the analysis of N. meningitidis serogroup B shows two new large regions with low G+C content that are putative pathogenicity islands.
Pages: 932-940
Number of pages: 9
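The abstract's two ideas, multiscale smoothing of a G+C profile and a scalogram summarising energy per scale, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which wavelet family the authors use, so a Haar wavelet is swapped in for simplicity; the function names (`gc_profile`, `haar_dwt`, `haar_idwt`, `scalogram`, `smooth`) are hypothetical; and smoothing is done by simply zeroing the finest detail levels rather than by the chi-square test on coefficients described in the paper. The toy sequence is not genome data.

```python
import math

def gc_profile(seq):
    """Return a 0/1 indicator per base: 1.0 for G or C, 0.0 otherwise."""
    return [1.0 if base in "GCgc" else 0.0 for base in seq]

def haar_dwt(signal):
    """Full Haar decomposition (assumed wavelet; length must be a power of two).

    Returns (approx, details), where details[0] holds the finest scale.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    s = math.sqrt(2.0)
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / s for a, b in pairs])
        approx = [(a + b) / s for a, b in pairs]
    return approx, details

def haar_idwt(approx, details):
    """Invert haar_dwt, rebuilding the signal from coarse to fine."""
    s = math.sqrt(2.0)
    current = list(approx)
    for level in reversed(details):
        rebuilt = []
        for a, d in zip(current, level):
            rebuilt.append((a + d) / s)
            rebuilt.append((a - d) / s)
        current = rebuilt
    return current

def scalogram(details):
    """Energy (sum of squared detail coefficients) at each scale, finest first."""
    return [sum(c * c for c in level) for level in details]

def smooth(signal, drop_levels):
    """Crude smoothing: zero the `drop_levels` finest detail levels, then invert.

    This stands in for the coefficient-level chi-square test in the abstract.
    """
    approx, details = haar_dwt(signal)
    kept = [[0.0] * len(level) if i < drop_levels else level
            for i, level in enumerate(details)]
    return haar_idwt(approx, kept)

# Toy example: an AT-only stretch followed by a GC-only stretch.
seq = "ATATATAT" + "GCGCGCGC"
profile = gc_profile(seq)
approx, details = haar_dwt(profile)
energies = scalogram(details)
smoothed = smooth(profile, drop_levels=2)
```

Because no window size appears anywhere, the same decomposition exposes the G+C shift at whichever scale it occurs; in the toy profile all detail energy sits at the coarsest level, where the AT-rich and GC-rich halves are compared. Two profiles could be compared by contrasting their `scalogram` outputs level by level.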