Annotation transfer between genomes: Protein-protein interologs and protein-DNA regulogs

Cited by: 396
Authors
Yu, HY
Luscombe, NM
Lu, HX
Zhu, XW
Xia, Y
Han, JDJ
Bertin, N
Chung, S
Vidal, M
Gerstein, M [1]
Affiliations
[1] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[2] Harvard Univ, Sch Med, Dept Genet, Boston, MA 02115 USA
[3] Harvard Univ, Sch Med, Dana Farber Canc Inst, Boston, MA 02115 USA
Keywords
DOI
10.1101/gr.1774904
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Proteins function mainly through interactions, especially with DNA and other proteins. While some large-scale interaction networks are now available for a number of model organisms, their experimental generation remains difficult. Consequently, interolog mapping (the transfer of interaction annotation from one organism to another using comparative genomics) is of significant value. Here we quantitatively assess the degree to which interologs can be reliably transferred between species as a function of the sequence similarity of the corresponding interacting proteins. Using interaction information from Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, and Helicobacter pylori, we find that protein-protein interactions can be transferred when a pair of proteins has a joint sequence identity >80% or a joint E-value <10^-70. (These "joint" quantities are the geometric means of the identities or E-values for the two pairs of interacting proteins.) We generalize our interolog analysis to protein-DNA binding, finding such interactions are conserved at specific thresholds between 30% and 60% sequence identity depending on the protein family. Furthermore, we introduce the concept of a "regulog": a conserved regulatory relationship between proteins across different species. We map interologs and regulogs from yeast to a number of genomes with limited experimental annotation (e.g., Arabidopsis thaliana) and make these available through an online database at http://interolog.gersteinlab.org. Specifically, we are able to transfer ~90,000 potential protein-protein interactions to the worm. We test a number of these in two-hybrid experiments and are able to verify 45 overlaps, which we show to be statistically significant.
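The "joint" quantities defined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the log-space handling of E-values, and the treatment of a reported E-value of 0 are assumptions made for illustration. Only the geometric-mean definition and the two thresholds (joint identity >80%, joint E-value <10^-70) come from the abstract.

```python
import math


def joint_identity(identity_a: float, identity_b: float) -> float:
    """Geometric mean of the percent identities of the two ortholog pairs."""
    return math.sqrt(identity_a * identity_b)


def joint_evalue(evalue_a: float, evalue_b: float) -> float:
    """Geometric mean of two E-values, computed in log space.

    Assumption: an E-value reported as 0 makes the joint value 0.
    """
    if evalue_a < 0 or evalue_b < 0:
        raise ValueError("E-values must be non-negative")
    if evalue_a == 0 or evalue_b == 0:
        return 0.0
    mean_log = (math.log10(evalue_a) + math.log10(evalue_b)) / 2
    return 10 ** mean_log


def is_transferable(identity_a, identity_b, evalue_a, evalue_b,
                    min_identity=80.0, max_evalue=1e-70):
    """Apply the abstract's criterion: joint identity > 80% OR joint E-value < 10^-70."""
    return (joint_identity(identity_a, identity_b) > min_identity
            or joint_evalue(evalue_a, evalue_b) < max_evalue)
```

For example, two ortholog pairs with 90% and 80% identity have a joint identity of about 84.9%, which passes the identity threshold regardless of their E-values.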
Pages: 1107-1118
Number of pages: 12
Related papers
54 in total
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]   Bioinformatics: From genome data to biological knowledge [J].
Andrade, MA ;
Sander, C .
CURRENT OPINION IN BIOTECHNOLOGY, 1997, 8 (06) :675-683
[3]  
[Anonymous], [No title captured]
[4]  
[Anonymous], 1992, ENZYME NOMENCLATURE
[5]   Gene Ontology: tool for the unification of biology [J].
Ashburner, M ;
Ball, CA ;
Blake, JA ;
Botstein, D ;
Butler, H ;
Cherry, JM ;
Davis, AP ;
Dolinski, K ;
Dwight, SS ;
Eppig, JT ;
Harris, MA ;
Hill, DP ;
Issel-Tarver, L ;
Kasarskis, A ;
Lewis, S ;
Matese, JC ;
Richardson, JE ;
Ringwald, M ;
Rubin, GM ;
Sherlock, G .
NATURE GENETICS, 2000, 25 (01) :25-29
[6]   Novel developments with the PRINTS protein fingerprint database [J].
Attwood, TK ;
Beck, ME ;
Bleasby, AJ ;
Degtyarenko, K ;
Michie, AD ;
Parry-Smith, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (01) :212-216
[7]   The PROSITE database, its status in 1995 [J].
Bairoch, A ;
Bucher, P ;
Hofmann, K .
NUCLEIC ACIDS RESEARCH, 1996, 24 (01) :189-196
[8]   Predicting function: From genes to genomes and back [J].
Bork, P ;
Dandekar, T ;
Diaz-Lazcoz, Y ;
Eisenhaber, F ;
Huynen, M ;
Yuan, YP .
JOURNAL OF MOLECULAR BIOLOGY, 1998, 283 (04) :707-725
[9]   FROM GENOME SEQUENCES TO PROTEIN FUNCTION [J].
BORK, P ;
OUZOUNIS, C ;
SANDER, C .
CURRENT OPINION IN STRUCTURAL BIOLOGY, 1994, 4 (03) :393-403
[10]   Combined functional genomic maps of the C. elegans DNA damage response [J].
Boulton, SJ ;
Gartner, A ;
Reboul, J ;
Vaglio, P ;
Dyson, N ;
Hill, DE ;
Vidal, M .
SCIENCE, 2002, 295 (5552) :127-131