ProtoNet: hierarchical classification of the protein space

被引:61
作者
Sasson, O
Vaaknin, A
Fleischer, H
Portugaly, E
Bilu, Y
Linial, N
Linial, M [1 ]
机构
[1] Hebrew Univ Jerusalem, Sch Comp Sci & Engn, IL-91904 Jerusalem, Israel
[2] Hebrew Univ Jerusalem, Inst Life Sci, Dept Biol Chem, IL-91904 Jerusalem, Israel
关键词
D O I
10.1093/nar/gkg096
中图分类号
Q5 [生物化学]; Q7 [分子生物学];
学科分类号
071010 ; 081704 ;
摘要
The ProtoNet site provides an automatic hierarchical clustering of the SWISS-PROT protein database. The clustering is based on an all-against-all BLAST similarity search. The similarities E - score is used to perform a continuous bottom-up clustering process by applying alternative rules for merging clusters. The outcome of this clustering process is a classification of the input proteins into a hierarchy of cluster of varying degree of granularity. ProtoNet ( version 1.3) is accessible in the form of an interactive web site at http: / / www. protonet. cs. huji. ac. il. ProtoNet provides navigation tools for monitoring the clustering process with a vertical and horizontal view. Each cluster at any level of the hierarchy is assigned with a statistical index, indicating the level of purity based on biological keyword such as those provided by SWISS-PROT and InterPro. ProtoNet can be used for function prediction, for de ning superfamilies and subfamilies and for large-scale protein annotation purposes.
引用
收藏
页码:348 / 352
页数:5
相关论文
共 19 条
[1]   The InterPro database, an integrated documentation resource for protein families, domains and functional sites [J].
Apweiler, R ;
Attwood, TK ;
Bairoch, A ;
Bateman, A ;
Birney, E ;
Biswas, M ;
Bucher, P ;
Cerutti, T ;
Corpet, F ;
Croning, MDR ;
Durbin, R ;
Falquet, L ;
Fleischmann, W ;
Gouzy, J ;
Hermjakob, H ;
Hulo, N ;
Jonassen, I ;
Kahn, D ;
Kanapin, A ;
Karavidopoulou, Y ;
Lopez, R ;
Marx, B ;
Mulder, NJ ;
Oinn, TM ;
Pagni, M ;
Servant, F ;
Sigrist, CJA ;
Zdobnov, EM .
NUCLEIC ACIDS RESEARCH, 2001, 29 (01) :37-40
[2]   PRINTS-S: the database formerly known as PRINTS [J].
Attwood, TK ;
Croning, MDR ;
Flower, DR ;
Lewis, AP ;
Mabey, JE ;
Scordis, P ;
Selley, JN ;
Wright, W .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :225-227
[3]   The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000 [J].
Bairoch, A ;
Apweiler, R .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :45-48
[4]  
Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI [10.1093/nar/gkp985, 10.1093/nar/gkh121, 10.1093/nar/gkr1065]
[5]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[6]   ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons [J].
Corpet, F ;
Servant, F ;
Gouzy, J ;
Kahn, D .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :267-269
[7]   Picasso: generating a covering set of protein family profiles [J].
Heger, A ;
Holm, L .
BIOINFORMATICS, 2001, 17 (03) :272-279
[8]   The PROSITE database, its status in 1999 [J].
Hofmann, K ;
Bucher, P ;
Falquet, L ;
Bairoch, A .
NUCLEIC ACIDS RESEARCH, 1999, 27 (01) :215-219
[9]   The SYSTERS protein sequence cluster set [J].
Krause, A ;
Stoye, J ;
Vingron, M .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :270-272
[10]   Clustering and analysis of protein families [J].
Kriventseva, EV ;
Biswas, M ;
Apweiler, R .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 2001, 11 (03) :334-339