Analysis of protein domain families in Caenorhabditis elegans

Cited by: 107
Authors
Sonnhammer, ELL [1]
Durbin, R [1]
Affiliations
[1] Sanger Centre, Cambridge CB10 1SA, England
Funding
Wellcome Trust (UK)
DOI
10.1006/geno.1997.4989
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
The Caenorhabditis elegans genome sequencing project has completed over half of this nematode's 100-Mb genome. Proteins predicted in the finished sequence have been compiled and released in the database Wormpep. Presented here is a comprehensive analysis of protein domain families in Wormpep 11, which comprises 7299 proteins. The relative abundance of common protein domain families was counted by comparing all Wormpep proteins to the Pfam collection of protein families, which is based on recognition by hidden Markov models. This analysis also identified a number of previously unannotated domains. To investigate new, apparently nematode-specific protein families, Wormpep was clustered into domain families on the basis of sequence similarity using the Domainer program. The largest clusters that lacked clear homology to proteins outside Nematoda were analyzed in further detail, after which some could be assigned a putative function. We compared all proteins in Wormpep 11 to proteins in the human, Saccharomyces cerevisiae, and Haemophilus influenzae genomes. Among the results are the estimate that over two-thirds of the currently known human proteins are likely to have a homologue in the whole C. elegans genome, and the finding that a significant number of proteins that are well conserved between C. elegans and H. influenzae are not found in S. cerevisiae. © 1997 Academic Press.
Pages: 200-216
Page count: 17