The effects of incomplete protein interaction data on structural and evolutionary inferences

Cited by: 48
Authors
de Silva, Eric
Thorne, Thomas
Ingram, Piers
Agrafioti, Ino
Swire, Jonathan
Wiuf, Carsten
Stumpf, Michael P. H. [1]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Div Mol Biosci, Ctr Bioinformat, Theoret Genom Grp, London SW7 2AZ, England
[2] Univ London Imperial Coll Sci Technol & Med, Dept Math, London SW7 2BZ, England
[3] Univ Aarhus, Bioinformat Res Ctr, Aarhus, Denmark
[4] Aarhus Univ Hosp, Mol Diagnost Lab, DK-8000 Aarhus, Denmark
[5] Univ London Imperial Coll Sci Technol & Med, Inst Math Sci, London SW7 2AZ, England
Keywords
DOI
10.1186/1741-7007-4-39
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: Present protein interaction network data sets include only interactions among subsets of the proteins in an organism. Previously this has been ignored, but in principle any global network analysis that only looks at partial data may be biased. Here we demonstrate the need to consider network sampling properties explicitly and from the outset in any analysis.
Results: Here we study how properties of the yeast protein interaction network are affected by random and non-random sampling schemes using a range of different network statistics. Effects are shown to be independent of the inherent noise in protein interaction data. The effects of the incomplete nature of network data become very noticeable, especially for so-called network motifs. We also consider the effect of incomplete network data on functional and evolutionary inferences.
Conclusion: Crucially, when only small, partial network data sets are considered, bias is virtually inevitable. Given the scope of effects considered here, previous analyses may have to be carefully reassessed: ignoring the fact that present network data are incomplete will severely affect our ability to understand biological systems.
Pages: 13
Related papers
45 in total
[1]   Comparative analysis of the Saccharomyces cerevisiae and Caenorhabditis elegans protein interaction networks: art. no. 23 [J].
Agrafioti, I ;
Swire, J ;
Abbott, J ;
Huntley, D ;
Butcher, S ;
Stumpf, MPH .
BMC EVOLUTIONARY BIOLOGY, 2005, 5 (1)
[2]  
[Anonymous], DATABASE INTERACTING
[3]   Gaining confidence in high-throughput protein interaction networks [J].
Bader, JS ;
Chaudhuri, A ;
Rothberg, JM ;
Chant, J .
NATURE BIOTECHNOLOGY, 2004, 22 (01) :78-85
[4]   Correlated random networks: art. no. 228701 [J].
Berg, J ;
Lässig, M .
PHYSICAL REVIEW LETTERS, 2002, 89 (22) :228701-228701
[5]  
Bollobas Bela, 1998, RANDOM GRAPHS
[6]   Perturbing general uncorrelated networks [J].
Burda, Z. ;
Jurkiewicz, J. ;
Krzywicki, A. .
Physical Review E - Statistical, Nonlinear, and Soft Matter Physics, 2004, 70 (2 2) :026106-1
[7]   A genome-wide transcriptional analysis of the mitotic cell cycle [J].
Cho, RJ ;
Campbell, MJ ;
Winzeler, EA ;
Steinmetz, L ;
Conway, A ;
Wodicka, L ;
Wolfsberg, TG ;
Gabrielian, AE ;
Landsman, D ;
Lockhart, DJ ;
Davis, RW .
MOLECULAR CELL, 1998, 2 (01) :65-73
[8]  
COX D. R., 2000, Theoretical Statistics
[9]   Complex networks and simple models in biology [J].
de Silva, E ;
Stumpf, MPH .
JOURNAL OF THE ROYAL SOCIETY INTERFACE, 2005, 2 (05) :419-430
[10]  
Dorogovtsev S. N., 2003, EVOLUTION NETWORKS