Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions

Cited by: 1302
Authors
Mistry, Jaina [1, 2]
Finn, Robert D. [3]
Eddy, Sean R. [3]
Bateman, Alex [1, 2]
Punta, Marco [1, 2]
Affiliations
[1] EMBL European Bioinformat Inst, Cambridge CB10 1SD, England
[2] Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England
[3] HHMI Janelia Farm Res Campus, Ashburn, VA 20147 USA
Funding
Biotechnology and Biological Sciences Research Council (UK); Wellcome Trust (UK)
Keywords
CHLAMYDIA-TRACHOMATIS; PSI-BLAST; PROTEIN; SEQUENCE; COMPLEXITY; ACCURACY; DOMAIN; GENE;
DOI
10.1093/nar/gkt263
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Detection of protein homology via sequence similarity has important applications in biology, from protein structure and function prediction to reconstruction of phylogenies. Although current methods for aligning protein sequences are powerful, challenges remain, including problems with homologous overextension of alignments and with regions under convergent evolution. Here, we test the ability of the profile hidden Markov model method HMMER3 to correctly assign homologous sequences to > 13 000 manually curated families from the Pfam database. We identify problem families using protein regions that match two or more Pfam families not currently annotated as related in Pfam. We find that HMMER3 E-value estimates seem to be less accurate for families that feature periodic patterns of compositional bias, such as the ones typically observed in coiled-coils. These results support the continued use of manually curated inclusion thresholds in the Pfam database, especially on the subset of families that have been identified as problematic in experiments such as these. They also highlight the need for developing new methods that can correct for this particular type of compositional bias.
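The abstract's screening step (flagging protein regions that match two or more families not annotated as related) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `find_overlapping_families`, the hit tuple layout, the `clan_of` mapping, and the family names `FamA` to `FamD` are all hypothetical, and "annotated as related" is assumed here to mean "share a clan identifier".

```python
from collections import defaultdict
from itertools import combinations


def find_overlapping_families(hits, clan_of):
    """Flag pairs of families whose matches overlap on the same sequence.

    hits:    iterable of (seq_id, family, start, end), 1-based inclusive
             coordinates (hypothetical layout, not a HMMER output format).
    clan_of: dict mapping family -> clan id; families sharing a clan id
             are treated as already annotated as related (assumption).
    Returns a dict mapping a sorted (family, family) pair to the set of
    sequence ids on which the two families overlap.
    """
    # Group matches by the sequence they fall on.
    by_seq = defaultdict(list)
    for seq_id, family, start, end in hits:
        by_seq[seq_id].append((family, start, end))

    flagged = defaultdict(set)
    for seq_id, regions in by_seq.items():
        for (fam_a, s_a, e_a), (fam_b, s_b, e_b) in combinations(regions, 2):
            if fam_a == fam_b:
                continue  # same family: not an inter-family overlap
            clan_a, clan_b = clan_of.get(fam_a), clan_of.get(fam_b)
            if clan_a is not None and clan_a == clan_b:
                continue  # already annotated as related
            # Two closed intervals overlap when each starts before the other ends.
            if s_a <= e_b and s_b <= e_a:
                flagged[tuple(sorted((fam_a, fam_b)))].add(seq_id)
    return dict(flagged)


# Toy input with hypothetical family names and coordinates.
example_hits = [
    ("seq1", "FamA", 10, 80),
    ("seq1", "FamB", 60, 120),   # overlaps FamA on seq1, no shared clan
    ("seq1", "FamC", 200, 260),  # no overlap with anything
    ("seq2", "FamA", 5, 50),
    ("seq2", "FamD", 40, 90),    # overlaps FamA, but same clan
]
example_clans = {"FamA": "clan_x", "FamD": "clan_x"}

problem_pairs = find_overlapping_families(example_hits, example_clans)
print(problem_pairs)
```

On the toy input, only the `FamA`/`FamB` pair on `seq1` is flagged; the `FamA`/`FamD` overlap is skipped because both families share a clan. A real screen would also need a minimum overlap length and significance thresholds, which the abstract does not specify.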
Pages: 10