PANDORA: keyword-based analysis of protein sets by integration of annotation sources

Cited by: 25
Authors
Kaplan, N
Vaaknin, A
Linial, M [1]
Affiliations
[1] Hebrew Univ Jerusalem, Inst Life Sci, Dept Biol Chem, IL-91904 Jerusalem, Israel
[2] Hebrew Univ Jerusalem, Sch Engn & Comp Sci, IL-91904 Jerusalem, Israel
DOI
10.1093/nar/gkg769
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Recent advances in high-throughput methods and the application of computational tools for automatic classification of proteins have made it possible to carry out large-scale proteomic analyses. Biological analysis and interpretation of sets of proteins is a time-consuming undertaking carried out manually by experts. We have developed PANDORA (Protein ANnotation Diagram ORiented Analysis), a web-based tool that provides an automatic representation of the biological knowledge associated with any set of proteins. PANDORA uses a unique approach of keyword-based graphical analysis that focuses on detecting subsets of proteins that share unique biological properties and the intersections of such sets. PANDORA currently supports SwissProt keywords, NCBI Taxonomy, InterPro entries and the hierarchical classification terms from ENZYME, SCOP and GO databases. The integrated study of several annotation sources simultaneously allows a representation of biological relations of structure, function, cellular location, taxonomy, domains and motifs. PANDORA is also integrated into the ProtoNet system, thus allowing testing thousands of automatically generated clusters. We illustrate how PANDORA enhances the biological understanding of large, non-uniform sets of proteins originating from experimental and computational sources, without the need for prior biological knowledge on individual proteins.
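The keyword-based idea described in the abstract (finding subsets of proteins that share an annotation and the intersections of such subsets) can be illustrated with a minimal sketch. This is not PANDORA's implementation; the function names `keyword_sets` and `intersections`, the protein identifiers and the keyword assignments are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def keyword_sets(annotations):
    """Invert a protein -> keywords mapping into keyword -> set of proteins."""
    sets = {}
    for protein, keywords in annotations.items():
        for kw in keywords:
            sets.setdefault(kw, set()).add(protein)
    return sets


def intersections(sets):
    """Return the non-empty pairwise intersections of the keyword sets."""
    result = {}
    for a, b in combinations(sorted(sets), 2):
        shared = sets[a] & sets[b]
        if shared:
            result[(a, b)] = shared
    return result


# Hypothetical toy input: four made-up proteins with made-up keywords.
annotations = {
    "P1": {"Kinase", "Membrane"},
    "P2": {"Kinase"},
    "P3": {"Membrane", "Transport"},
    "P4": {"Kinase", "Membrane"},
}

sets = keyword_sets(annotations)
overlaps = intersections(sets)

for (a, b), shared in overlaps.items():
    print(f"{a} & {b}: {sorted(shared)}")
```

In this toy input, each keyword defines one subset of the protein set, and the overlaps show which proteins carry two annotations at once. A real analysis would draw the keywords from sources such as those named in the abstract rather than from a hand-written dictionary.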
Pages: 5617-5626
Page count: 10