Evolutionary profiles from the QR factorization of multiple sequence alignments

Times cited: 33
Authors
Sethi, A [1]
O'Donoghue, P [1]
Luthey-Schulten, Z [1]
Affiliations
[1] Univ Illinois, Sch Chem Sci, Dept Chem, Urbana, IL 61801 USA
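Keywords
archaeal cysteinyl-tRNA synthetase; gene annotation; lipocalin superfamily; triosephosphate isomerase superfamily
DOI
10.1073/pnas.0409715102
Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract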
We present an algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of the homologous group. The method, based on the multidimensional QR factorization of numerically encoded multiple sequence alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. We observe a general trend that these smaller, more evolutionarily balanced profiles have comparable and, in many cases, better performance in database searches than conventional profiles containing hundreds of sequences, constructed in an iterative and computationally intensive procedure. For more diverse families or superfamilies, with sequence identity <30%, structural alignments, based purely on the geometry of the protein structures, provide better alignments than pure sequence-based methods. Merging the structure and sequence information allows the construction of accurate profiles for distantly related groups. These structure-based profiles outperformed other sequence-based methods for finding distant homologs and were used to identify a putative class II cysteinyl-tRNA synthetase (CysRS) in several archaea that eluded previous annotation studies. Phylogenetic analysis showed the putative class II CysRSs to be a monophyletic group and homology modeling revealed a constellation of active site residues similar to that in the known class I CysRS.
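The ordering idea in the abstract (rank sequences so that each one chosen is as linearly independent as possible of those already chosen) can be illustrated with a minimal sketch. This is not the authors' multidimensional QR factorization. It is an ordinary greedy column-pivoting scheme (Gram-Schmidt with pivoting) applied to a flattened one-hot encoding, and the alphabet, the function names `encode` and `pivoted_order`, and the tolerance `tol` are all assumptions made for illustration.

```python
from math import sqrt

# Assumed encoding alphabet: 20 amino acids plus a gap symbol.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def encode(seq):
    """One-hot encode an aligned sequence into a flat list of floats."""
    vec = []
    for ch in seq:
        vec.extend(1.0 if ch == a else 0.0 for a in ALPHABET)
    return vec


def pivoted_order(seqs, tol=1e-9):
    """Order aligned sequences by greedy column pivoting.

    At each step, pick the sequence whose encoded vector has the largest
    residual norm after removing its projection onto the sequences already
    picked. Returns (order, norms): the pick order as indices into `seqs`
    and the residual norm of each pick. A norm near zero means the sequence
    is linearly dependent on the earlier picks, that is, redundant.
    """
    residuals = [encode(s) for s in seqs]
    remaining = list(range(len(seqs)))
    order, norms = [], []
    while remaining:
        # Pivot: the remaining sequence with the largest residual norm.
        best = max(remaining, key=lambda i: sum(x * x for x in residuals[i]))
        norm = sqrt(sum(x * x for x in residuals[best]))
        order.append(best)
        norms.append(norm)
        remaining.remove(best)
        if norm <= tol:
            continue  # nothing left to project out
        q = [x / norm for x in residuals[best]]
        # Remove the component along q from every remaining residual.
        for i in remaining:
            proj = sum(a * b for a, b in zip(q, residuals[i]))
            residuals[i] = [a - proj * b for a, b in zip(residuals[i], q)]
    return order, norms
```

Under these assumptions, a reduced profile could be taken as the leading sequences in `order` whose residual norm stays above some chosen cutoff. The cutoff is a free parameter of the sketch and not a value taken from the paper.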
Pages: 4045-4050
Page count: 6