Domains, motifs and clusters in the protein universe

Cited by: 69
Authors
Liu, JF
Rost, B
Affiliations
[1] Columbia Univ, CUBIC, Dept Biochem & Mol Biophys, New York, NY 10032 USA
[2] Columbia Univ, N E Struct Genomics Consortium NESG, Dept Biochem & Mol Biophys, New York, NY 10032 USA
[3] Columbia Univ, Dept Pharmacol, New York, NY 10032 USA
[4] Columbia Univ, Ctr Computat Biol & Bioinformat C2B2, New York, NY 10032 USA
Keywords
DOI
10.1016/S1367-5931(02)00003-0
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
The rapid growth of bio-sequence information has resulted in an increasing demand for reliable methods that group proteins. A few databases with curated alignments of protein families have demonstrated that expert-driven repositories can keep up with the data deluge in the genome era. These original resources implicitly identify domain-like modules in proteins. An increasing number of automatic methods have sprouted over the past few years that cluster the protein universe. Many of these implicitly dissect proteins into structural domain-like fragments. In a very coarse-grained evaluation, some of the automatic methods appear to be on par with expert-driven approaches. However, neither automatic nor manual methods are currently entirely up to the challenges of tasks such as target selection in structural genomics. Thus, we urgently need refined and sustained automatic clustering tools.
Pages: 5-11
Page count: 7
Related papers
64 in total
[1]   Clustering of proximal sequence space for the identification of protein families [J].
Abascal, F ;
Valencia, A .
BIOINFORMATICS, 2002, 18 (07) :908-921
[2]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[3]   The InterPro database, an integrated documentation resource for protein families, domains and functional sites [J].
Apweiler, R ;
Attwood, TK ;
Bairoch, A ;
Bateman, A ;
Birney, E ;
Biswas, M ;
Bucher, P ;
Cerutti, T ;
Corpet, F ;
Croning, MDR ;
Durbin, R ;
Falquet, L ;
Fleischmann, W ;
Gouzy, J ;
Hermjakob, H ;
Hulo, N ;
Jonassen, I ;
Kahn, D ;
Kanapin, A ;
Karavidopoulou, Y ;
Lopez, R ;
Marx, B ;
Mulder, NJ ;
Oinn, TM ;
Pagni, M ;
Servant, F ;
Sigrist, CJA ;
Zdobnov, EM .
NUCLEIC ACIDS RESEARCH, 2001, 29 (01) :37-40
[4]   PRINTS and PRINTS-S shed light on protein ancestry [J].
Attwood, TK ;
Blythe, MJ ;
Flower, DR ;
Gaulton, A ;
Mabey, JE ;
Maudling, N ;
McGregor, L ;
Mitchell, AL ;
Moulton, G ;
Paine, K ;
Scordis, P .
NUCLEIC ACIDS RESEARCH, 2002, 30 (01) :239-241
[5]  
Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI [10.1093/nar/gkp985, 10.1093/nar/gkh121, 10.1093/nar/gkr1065]
[6]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[7]   Clustering protein sequences-structure prediction by transitive homology [J].
Bolten, E ;
Schliep, A ;
Schneckener, S ;
Schomburg, D ;
Schrader, R .
BIOINFORMATICS, 2001, 17 (10) :935-941
[8]  
CARTER P, 2002, IN PRESS NUCL ACIDS
[9]   Intrinsic errors in genome annotation [J].
Devos, D ;
Valencia, A .
TRENDS IN GENETICS, 2001, 17 (08) :429-431
[10]   Identification of homology in protein structure classification [J].
Dietmann, S ;
Holm, L .
NATURE STRUCTURAL BIOLOGY, 2001, 8 (11) :953-957