Domains, motifs and clusters in the protein universe

Cited by: 69
Authors
Liu, JF
Rost, B
Affiliations
[1] Columbia Univ, CUBIC, Dept Biochem & Mol Biophys, New York, NY 10032 USA
[2] Columbia Univ, N E Struct Genomics Consortium NESG, Dept Biochem & Mol Biophys, New York, NY 10032 USA
[3] Columbia Univ, Dept Pharmacol, New York, NY 10032 USA
[4] Columbia Univ, Ctr Computat Biol & Bioinformat C2B2, New York, NY 10032 USA
DOI
10.1016/S1367-5931(02)00003-0
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The rapid growth of bio-sequence information has resulted in an increasing demand for reliable methods that group proteins. A few databases with curated alignments of protein families have demonstrated that expert-driven repositories can keep up with the data deluge in the genome era. These original resources implicitly identify domain-like modules in proteins. An increasing number of automatic methods have sprouted over the past few years that cluster the protein universe. Many of these implicitly dissect proteins into structural domain-like fragments. In a very coarse-grained evaluation, some of the automatic methods appear to be on par with expert-driven approaches. However, neither automatic nor manual methods are currently entirely up to the challenges of tasks such as target selection in structural genomics. Thus, we urgently need refined and sustained automatic clustering tools.
Pages: 5-11
Number of pages: 7