I-TASSER server for protein 3D structure prediction

Cited by: 3148
Authors
Zhang, Yang [1, 2]
Affiliations
[1] Univ Kansas, Ctr Bioinformat, Lawrence, KS 66047 USA
[2] Univ Kansas, Dept Mol Biosci, Lawrence, KS 66047 USA
DOI
10.1186/1471-2105-9-40
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Prediction of 3-dimensional protein structures from amino acid sequences represents one of the most important problems in computational structural biology. The community-wide Critical Assessment of Structure Prediction (CASP) experiments have been designed to obtain an objective assessment of the state of the art of the field, and I-TASSER was ranked as the best method in the server section of the recent 7th CASP experiment. Our laboratory has since received numerous requests about the public availability of the I-TASSER algorithm and the usage of the I-TASSER predictions.
Results: An on-line version of I-TASSER has been developed at the KU Center for Bioinformatics; it has generated protein structure predictions for thousands of modeling requests from more than 35 countries. A scoring function (C-score), based on the relative clustering structural density and the consensus significance score of multiple threading templates, is introduced to estimate the accuracy of the I-TASSER predictions. A large-scale benchmark test demonstrates a strong correlation between the C-score and the TM-score (a structural similarity measurement with values in [0, 1]) of the first models, with a correlation coefficient of 0.91. Using a C-score cutoff > -1.5 for the models of correct topology, both false positive and false negative rates are below 0.1. Combining C-score and protein length, the accuracy of the I-TASSER models can be predicted with an average error of 0.08 for TM-score and 2 Å for RMSD.
Conclusion: The I-TASSER server has been developed to generate automated full-length 3D protein structural predictions, and the benchmarked scoring system helps users to obtain quantitative assessments of the I-TASSER models. The output of the I-TASSER server for each query includes up to five full-length models, the confidence score, the estimated TM-score and RMSD, and the standard deviation of the estimations. The I-TASSER server is freely available to the academic community at http://zhang.bioinformatics.ku.edu/I-TASSER.
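The cutoff test described in the abstract can be illustrated with a minimal sketch. It assumes that a model counts as "correct topology" when its TM-score exceeds 0.5 (a common convention, not stated in the abstract) and that the error rates follow the usual definitions FP/(FP+TN) and FN/(FN+TP); the paper may define them differently. The function name `error_rates` and the toy (C-score, TM-score) pairs are hypothetical and are not benchmark data.

```python
from typing import List, Tuple

C_SCORE_CUTOFF = -1.5  # cutoff quoted in the abstract
TM_CORRECT_FOLD = 0.5  # assumption: TM-score > 0.5 means correct topology


def error_rates(models: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    Each model is a (c_score, tm_score) pair. A model is predicted
    correct when c_score > C_SCORE_CUTOFF and is actually correct
    when tm_score > TM_CORRECT_FOLD.
    """
    tp = fp = tn = fn = 0
    for c_score, tm_score in models:
        predicted = c_score > C_SCORE_CUTOFF
        actual = tm_score > TM_CORRECT_FOLD
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical toy data for illustration only (not from the benchmark).
TOY_MODELS = [
    (1.2, 0.85), (0.3, 0.72), (-1.0, 0.55), (-1.4, 0.45),
    (-2.0, 0.52), (-2.5, 0.30), (-3.1, 0.25), (-1.8, 0.40),
]

if __name__ == "__main__":
    fpr, fnr = error_rates(TOY_MODELS)
    print(f"false positive rate: {fpr:.2f}, false negative rate: {fnr:.2f}")
```

On real benchmark data the same tally would be run over the first model of each target, which is where the abstract reports both rates falling below 0.1.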
Pages: 8