A novel fold recognition method using composite predicted secondary structures

Cited by: 14

Authors:

An, YL

Friesner, RA [1]

Affiliations:

[1] Columbia Univ, Dept Chem, New York, NY 10027 USA

[2] Columbia Univ, Ctr Biomol Simulat, New York, NY 10027 USA

Source:

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS | 2002 / Vol. 48 / No. 02

Keywords:

fold recognition; structural homologue; composite secondary structure; secondary structure segment; CASP; alignment;

DOI:

10.1002/prot.10145

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification code:

071010 ; 081704 ;

Abstract:

In this work, we introduce a new method for fold recognition using composite secondary structures assembled from different secondary structure prediction servers for a given target sequence. An automatic, complete, and robust way of finding all possible combinations of predicted secondary structure segments (SSS) for the target sequence and clustering them into a few flexible clusters, each containing patterns with the same number of SSS, is developed. This program then takes two steps in choosing plausible homologues: (i) an SSS-based alignment excludes impossible templates whose SSS patterns are very different from any of those of the target; (ii) a residue-based alignment selects good structural templates based on sequence similarity and secondary structure similarity between the target and only those templates left in the first stage. The secondary structure of each residue in the target is selected from one of the predictions to find the best match with the template. Truncation is applied to a target where different predictions vary. In most cases, a target is also divided into N-terminal and C-terminal fragments, each of which is used as a separate subsequence. Our program was tested on the fold recognition targets from CASP3 with known PDB codes and some available targets from CASP4. The results are compared with a structural homologue list for each target produced by the CE program (Shindyalov and Bourne, Protein Eng 1998;11:739-747). The program successfully locates homologues with high Z-score and low root-mean-square deviation within the top 30-50 predictions in the overwhelming majority of cases. (C) 2002 Wiley-Liss, Inc.
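The composite secondary structure idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program. It assumes, for illustration only, that predictions and the template are already aligned, equal-length strings over the three states H (helix), E (strand), and C (coil); the function names `segments`, `composite_match`, and `cluster_by_segment_count` are hypothetical and do not appear in the paper.

```python
from collections import defaultdict
from itertools import groupby


def segments(ss):
    """Collapse a per-residue string into its ordered SSS pattern.

    Assumption: coil ('C') is not counted as a segment, so
    'CHHHHCCEEEC' becomes the pattern 'HE'.
    """
    return "".join(state for state, _ in groupby(ss) if state != "C")


def composite_match(predictions, template_ss):
    """Build a composite string by picking, at each residue, the predicted
    state that agrees with the template when any prediction offers it.

    Returns the composite string and the number of matching residues.
    Falls back to the first prediction where no prediction agrees.
    """
    if any(len(p) != len(template_ss) for p in predictions):
        raise ValueError("all strings must have the same (aligned) length")
    composite = []
    matches = 0
    for i, t in enumerate(template_ss):
        if any(p[i] == t for p in predictions):
            composite.append(t)
            matches += 1
        else:
            composite.append(predictions[0][i])
    return "".join(composite), matches


def cluster_by_segment_count(patterns):
    """Group SSS patterns so each cluster holds patterns with the same
    number of segments."""
    clusters = defaultdict(set)
    for pattern in patterns:
        clusters[len(pattern)].add(pattern)
    return dict(clusters)
```

A real implementation would also need the alignment, truncation, and fragment-splitting steps the abstract mentions; this sketch covers only the per-residue selection and the grouping of patterns by segment count.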

Citation

Pages: 352-366

Page count: 15

36 references in total

[1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].