Inferring domain-domain interactions from protein-protein interactions

Cited by: 310
Authors
Deng, MH [1]
Mehta, S [1]
Sun, FZ [1]
Chen, T [1]
Affiliations
[1] Univ So Calif, Dept Biol Sci, Program Mol & Computat Biol, Los Angeles, CA 90089 USA
DOI
10.1101/gr.153002
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Interactions between proteins are among the most important features of protein function. Underlying protein-protein interactions are protein domains that interact physically with one another to perform the necessary functions. Understanding protein interactions at the domain level therefore gives a global view of the protein interaction network, and possibly of protein functions. Two research groups used yeast two-hybrid assays to generate 5719 interactions between proteins of the yeast Saccharomyces cerevisiae, which allows us to study large-scale conserved patterns of interactions between protein domains. Using evolutionarily conserved domains defined in the protein-domain database PFAM (http://PFAM.wustl.edu), we apply a Maximum Likelihood Estimation method to infer interacting domains that are consistent with the observed protein-protein interactions. We estimate the probability of interaction for every pair of domains and measure the accuracy of our predictions at the protein level. Using the inferred domain-domain interactions, we then predict interactions between proteins. Our predicted protein-protein interactions overlap significantly with the protein-protein interactions in MIPS (http://mips.gfs.de) that were obtained by methods other than two-hybrid assays. The mean correlation coefficient of the gene expression profiles for our predicted interaction pairs is significantly higher than that for random pairs. Our method is robust when analyzing incomplete data sets and when dealing with various experimental errors. We found several novel protein-protein interactions, such as RPS0A interacting with APG17 and TAF40 interacting with SPT3, which are consistent with the functions of the proteins.
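The abstract names the approach (maximum likelihood estimation of domain-domain interaction probabilities from noisy protein-level observations) without giving the model. The following is a minimal sketch of one way such an estimate could be computed, not the authors' implementation. It assumes that two proteins interact when at least one of their domain pairs interacts, that domain pairs interact independently, and that observations carry false-positive and false-negative rates. The function names, the toy data, and the values of `fp`, `fn`, `init`, and `n_iter` are all hypothetical and chosen only for illustration.

```python
from itertools import product


def domain_pairs(doms_a, doms_b):
    """Unordered domain pairs spanned by two proteins."""
    return {tuple(sorted(p)) for p in product(doms_a, doms_b)}


def protein_prob(pairs, lam):
    """Assumed model: proteins interact if at least one domain pair does."""
    q = 1.0
    for p in pairs:
        q *= 1.0 - lam[p]
    return 1.0 - q


def em_domain_interactions(protein_domains, observed,
                           fp=0.01, fn=0.5, n_iter=50, init=0.1):
    """EM estimate of domain-pair interaction probabilities.

    fp and fn are assumed false-positive and false-negative rates of the
    assay; the defaults are illustrative, not values from the paper.
    """
    proteins = sorted(protein_domains)
    # Domain pairs for every unordered pair of distinct proteins.
    pp = {}
    for i, a in enumerate(proteins):
        for b in proteins[i + 1:]:
            pp[(a, b)] = domain_pairs(protein_domains[a], protein_domains[b])
    obs = {tuple(sorted(o)) for o in observed}
    lam = {d: init for pairs in pp.values() for d in pairs}

    for _ in range(n_iter):
        total = dict.fromkeys(lam, 0.0)
        count = dict.fromkeys(lam, 0)
        for key, pairs in pp.items():
            p_int = protein_prob(pairs, lam)
            # Probability that the assay reports an interaction.
            p_obs1 = p_int * (1.0 - fn) + (1.0 - p_int) * fp
            # E-step factor: P(observation | domain pair interacts)
            # divided by P(observation).
            if key in obs:
                ratio = (1.0 - fn) / p_obs1
            else:
                ratio = fn / (1.0 - p_obs1)
            for d in pairs:
                total[d] += lam[d] * ratio
                count[d] += 1
        # M-step: average the expected interaction indicators.
        lam = {d: total[d] / count[d] for d in lam}
    return lam


# Toy example with hypothetical proteins and domains.
toy_domains = {"P1": ["A"], "P2": ["B"], "P3": ["C"],
               "P4": ["A"], "P5": ["B"]}
toy_observed = [("P1", "P2"), ("P4", "P5"), ("P1", "P5"), ("P4", "P2")]
lam = em_domain_interactions(toy_domains, toy_observed)
```

In the toy data every protein pair that spans domains A and B is observed to interact, so the estimate for the pair ("A", "B") rises toward 1, while pairs that are never supported by an observation fall toward 0. Protein-level predictions can then be obtained by applying `protein_prob` to the estimated values.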
Pages: 1540-1548
Number of pages: 9