An automated method for finding molecular complexes in large protein interaction networks

Cited by: 4376
Authors
Bader, GD
Hogue, CW
Affiliations
[1] Mt Sinai Hosp, Samuel Lunenfeld Res Inst, Toronto, ON M5G 1X5, Canada
[2] Univ Toronto, Dept Biochem, Toronto, ON M5S 1A8, Canada
Funding
US National Science Foundation
DOI
10.1186/1471-2105-4-2
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Recent advances in proteomics technologies such as two-hybrid, phage display and mass spectrometry have enabled us to create a detailed map of biomolecular interaction networks. Initial mapping efforts have already produced a wealth of data. As the size of the interaction set increases, databases and computational methods will be required to store, visualize and analyze the information in order to effectively aid in knowledge discovery. Results: This paper describes a novel graph theoretic clustering algorithm, "Molecular Complex Detection" (MCODE), that detects densely connected regions in large protein-protein interaction networks that may represent molecular complexes. The method is based on vertex weighting by local neighborhood density and outward traversal from a locally dense seed protein to isolate the dense regions according to given parameters. The algorithm has the advantage over other graph clustering methods of having a directed mode that allows fine-tuning of clusters of interest without considering the rest of the network and allows examination of cluster interconnectivity, which is relevant for protein networks. Protein interaction and complex information from the yeast Saccharomyces cerevisiae was used for evaluation. Conclusion: Dense regions of protein interaction networks can be found, based solely on connectivity data, many of which correspond to known protein complexes. The algorithm is not affected by a known high rate of false positives in data from high-throughput interaction techniques. The program is available from ftp://ftp.mshri.on.ca/pub/BIND/Tools/MCODE.
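The two steps named in the abstract, weighting each vertex by its local neighborhood density and then traversing outward from a dense seed, can be illustrated with a minimal Python sketch. This is not the MCODE implementation: the abstract does not give the exact weighting formula or any post-processing, so the sketch substitutes a plain edge density over each protein's immediate neighborhood. The function names, the `cutoff` and `min_size` parameters, and the toy network are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def neighborhood_density(graph, v):
    """Edge density of the subgraph induced by v and its direct neighbors."""
    nodes = graph[v] | {v}
    n = len(nodes)
    if n < 2:
        return 0.0
    # Each edge inside the neighborhood is counted once from each endpoint.
    edges = sum(len(graph[u] & nodes) for u in nodes) // 2
    return 2.0 * edges / (n * (n - 1))


def find_dense_regions(graph, cutoff=0.3, min_size=3):
    """Grow clusters outward from the densest unvisited seeds.

    `cutoff` (hypothetical parameter) is the fraction by which a vertex
    weight may fall below the seed weight and still join the cluster.
    """
    weights = {v: neighborhood_density(graph, v) for v in graph}
    seen = set()
    clusters = []
    # Visit seeds from highest to lowest weight; ties are broken by name.
    for seed in sorted(graph, key=lambda v: (-weights[v], v)):
        if seed in seen:
            continue
        threshold = weights[seed] * (1.0 - cutoff)
        cluster = {seed}
        seen.add(seed)
        queue = deque([seed])
        while queue:
            u = queue.popleft()
            for w in sorted(graph[u]):
                if w not in seen and weights[w] >= threshold:
                    seen.add(w)
                    cluster.add(w)
                    queue.append(w)
        if len(cluster) >= min_size:
            clusters.append(cluster)
    return clusters


def build_example():
    """Toy network (invented): two 5-protein cliques joined through 'M'."""
    graph = {}

    def add_edge(a, b):
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    clique_a = ["A1", "A2", "A3", "A4", "A5"]
    clique_b = ["B1", "B2", "B3", "B4", "B5"]
    for members in (clique_a, clique_b):
        for a, b in combinations(members, 2):
            add_edge(a, b)
    add_edge("A5", "M")
    add_edge("M", "B1")
    return graph


if __name__ == "__main__":
    for cluster in find_dense_regions(build_example()):
        print(sorted(cluster))
```

On this toy network the sketch separates the two cliques and leaves the bridging protein `M` unassigned, because its neighborhood density falls below the threshold derived from either seed. A plain neighborhood density also gives a degree-one protein a weight of 1.0, which is one reason a real implementation would need a more careful weighting than the one assumed here.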
Pages: 27