Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors?

Cited by: 27
Authors
Deshayes, Caroline
Perrodou, Emmanuel
Gallien, Sebastien
Euphrasie, Daniel
Schaeffer, Christine
Van-Dorsselaer, Alain
Poch, Olivier
Lecompte, Odile
Reyrat, Jean-Marc [1]
Affiliations
[1] Univ Paris 05, Fac Med Rene Descartes, F-75730 Paris 15, France
[2] INSERM, U570, Unite Pathogenie Infect Syst, Grp AVERNIR, F-75730 Paris, France
[3] ULP, INSERM, CNRS, IGBMC,Lab Biol & Genom Struct, F-67404 Illkirch Graffenstaden, France
[4] ECPM, UMR7178, Lab Spectrometrie Masse Bioorgan, F-67087 Strasbourg 2, France
Keywords
DOI
10.1186/gb-2007-8-2-r20
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: In silico analysis has shown that all bacterial genomes contain a low percentage of ORFs with undetected frameshifts and in-frame stop codons. These interrupted coding sequences (ICDSs) may really be present in the organism or may result from misannotation based on sequencing errors. The reality or otherwise of these sequences has major implications for all subsequent functional characterization steps, including module prediction, comparative genomics and high-throughput proteomic projects. Results: We show here, using Mycobacterium smegmatis as a model species, that a significant proportion of these ICDSs result from sequencing errors. We used a resequencing procedure and mass spectrometry analysis to determine the nature of a number of ICDSs in this organism. We found that 28 of the 73 ICDSs investigated correspond to sequencing errors. Conclusion: The correction of these errors results in modification of the predicted amino acid sequences of the corresponding proteins and changes in annotation. We suggest that each bacterial ICDS should be investigated individually, to determine its true status and to ensure that the genome sequence is appropriate for comparative genomics analyses.
Pages: 9
Related papers
26 entries in total
[1]   Genomes OnLine Database (GOLD): a monitor of genome projects world-wide [J].
Bernal, A ;
Ear, U ;
Kyrpides, N .
NUCLEIC ACIDS RESEARCH, 2001, 29 (01) :126-127
[2]  
Bradshaw RA, 2005, MOL CELL PROTEOMICS, V4, P1223
[3]   Frame: detection of genomic sequencing errors [J].
Brown, NP ;
Sander, C ;
Bork, P .
BIOINFORMATICS, 1998, 14 (04) :367-371
[4]   Spot overlapping in two-dimensional maps: A serious problem ignored for much too long [J].
Campostrini, N ;
Areces, LB ;
Rappsilber, J ;
Pietrogrande, MC ;
Dondi, F ;
Pastorino, F ;
Ponzoni, M ;
Righetti, PG .
PROTEOMICS, 2005, 5 (09) :2385-2395
[5]   Role of the pks15/1 gene in the biosynthesis of phenolglycolipids in the Mycobacterium tuberculosis complex: Evidence that all strains synthesize glycosylated p-hydroxybenzoic methyl esters and that strains devoid of phenolglycolipids harbor a frameshift mutation in the pks15/1 gene [J].
Constant, P ;
Perez, E ;
Malaga, W ;
Lanéelle, MA ;
Saurel, O ;
Daffé, M ;
Guilhot, C .
JOURNAL OF BIOLOGICAL CHEMISTRY, 2002, 277 (41) :38148-38158
[6]   The impact of the absence of glycopeptidolipids on the ultrastructure, cell surface and cell wall properties, and phagocytosis of Mycobacterium smegmatis [J].
Etienne, G ;
Villeneuve, C ;
Billman-Jacobe, H ;
Astarie-Dequeker, C ;
Dupont, MA ;
Daffé, M .
MICROBIOLOGY-SGM, 2002, 148 :3089-3100
[7]   Base-calling of automated sequencer traces using phred. II. Error probabilities [J].
Ewing, B ;
Green, P .
GENOME RESEARCH, 1998, 8 (03) :186-194
[8]   Base-calling of automated sequencer traces using phred. I. Accuracy assessment [J].
Ewing, B ;
Hillier, L ;
Wendl, MC ;
Green, P .
GENOME RESEARCH, 1998, 8 (03) :175-185
[9]   Single nucleotide polymorphisms in Mycobacterium tuberculosis structural genes: Response to Dr. Musser [J].
Fleischmann, R .
EMERGING INFECTIOUS DISEASES, 2001, 7 (03) :487-488
[10]   Translational bypassing: A new reading alternative of the genetic code [J].
Groisman, I ;
Engelberg-Kulka, H .
BIOCHEMISTRY AND CELL BIOLOGY-BIOCHIMIE ET BIOLOGIE CELLULAIRE, 1995, 73 (11-12) :1055-1059