Distill: a suite of web servers for the prediction of one-, two- and three-dimensional structural features of proteins

Cited by: 71
Authors
Bau, Davide [1]
Martin, Alberto J. M. [1]
Mooney, Catherine [1]
Vullo, Alessandro [1]
Walsh, Ian [1]
Pollastri, Gianluca [1]
Affiliations
[1] Univ Coll Dublin, Sch Comp & Informat Sci, Dublin 4, Ireland
DOI
10.1186/1471-2105-7-402
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: We describe Distill, a suite of servers for the prediction of protein structural features: secondary structure, relative solvent accessibility, contact density, backbone structural motifs, residue contact maps at 6, 8 and 12 Å, and coarse protein topology. The servers are based on large-scale ensembles of recursive neural networks trained on large, up-to-date, nonredundant subsets of the Protein Data Bank (PDB). Alongside the structural feature predictors, Distill includes a server for the prediction of Cα traces for short proteins (up to 200 amino acids).
Results: The servers are state-of-the-art: secondary structure is predicted correctly for nearly 80% of residues (currently the top performance on EVA), 2-class solvent accessibility is nearly 80% correct, and contact maps exceed 50% precision on the top non-diagonal contacts. A preliminary implementation of the Cα trace predictor ranked among the top 20 Novel Fold predictors at the CASP6 experiment as group Distill (ID 0348). Most of the servers, including the Cα trace predictor, now take into account homology information from the PDB when available, which greatly improves reliability.
Conclusion: All predictions are freely available through a simple joint web interface, and results are returned by email. In a single submission, the user can send protein sequences totalling up to 32k residues to all or a selection of the servers. Distill is accessible at http://distill.ucd.ie/distill/.
Pages: 8