Delineation of modular proteins: Domain boundary prediction from sequence information

Cited by: 20
Authors
Kong, LS [1]
Ranganathan, S
Affiliations
[1] Macquarie Univ, Biotechnol Res Inst, Chair Bioinformat, N Ryde, NSW 2109, Australia
[2] Natl Univ Singapore, Dept Biochem, Singapore 117548, Singapore
Keywords
protein domain; boundary prediction; protein modules; domain databases; domain architecture
DOI
10.1093/bib/5.2.179
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
The delineation of domain boundaries of a given sequence in the absence of known 3D structures or detectable sequence homology to known domains benefits many areas in protein science, such as protein engineering, protein 3D structure determination and protein structure prediction. With the exponential growth of newly determined sequences, our ability to predict domain boundaries rapidly and accurately from sequence information alone is both essential and critical from the viewpoint of gene function annotation. Anyone attempting to predict domain boundaries for a single protein sequence is invariably confronted with a plethora of databases that contain boundary information available from the internet and a variety of methods for domain boundary prediction. How are these derived and how well do they work? What definition of 'domain' do they use? We will first clarify the different definitions of protein domains, and then describe the available public databases with domain boundary information. Finally, we will review existing domain boundary prediction methods and discuss their strengths and weaknesses.
Pages: 179-192
Page count: 14