Evaluation of Clustering Algorithms for Protein Complex and Protein Interaction Network Assembly

Cited by: 27
Authors
Sardiu, Mihaela E. [1]
Florens, Laurence [1]
Washburn, Michael P. [1]
Affiliations
[1] Stowers Inst Med Res, Kansas City, MO 64110 USA
Keywords
quantitative proteomics; hierarchical clustering; partition clustering; normalized spectral abundance factor; Z-score normalization; protein interaction networks; EXPRESSION; MASS; SIMILARITIES; CHROMATIN; INO80; SCORE
DOI
10.1021/pr900073d
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Assembling protein complexes and protein interaction networks from affinity purification-based proteomics data sets remains a challenge. When little a priori knowledge of the complexes exists, it is difficult to place proteins in the proper locations and evaluate the results of clustering approaches. Here we have systematically compared multiple hierarchical and partitioning clustering approaches using a well-characterized but highly complex human protein interaction network data set centered around the conserved AAA+ ATPases Tip49a and Tip49b. This network provides a challenge to clustering algorithms because Tip49a and Tip49b are present in four distinct complexes, the network contains modules, and the network has multiple attachments. We compared the use of binary data, quantitative proteomics data in the form of normalized spectral abundance factors, and the Z-score normalization. In our analysis, a partitioning approach indicated the major modules in a network. Next, while Euclidean distance was sensitive to scaling, with data transformation, all the attachments in a data set were recovered in one branch of a dendrogram. Finally, when Pearson correlation and hierarchical clustering were used, complexes were well separated and their attachments were placed in the proper locations. Each of these three approaches provided distinct information useful for assembly of a network of multiple protein complexes.
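The contrast the abstract draws between Euclidean distance and Pearson correlation can be illustrated with a minimal sketch. This is not the authors' pipeline: the NSAF formula below is the commonly used definition (spectral count divided by protein length, normalized by the sum over all proteins in a purification), the Z-score is taken per protein profile, and the two abundance profiles are hypothetical values invented for illustration.

```python
import math

def nsaf(spectral_counts, lengths):
    """NSAF within one purification: (SpC / L) divided by the sum of SpC / L
    over all proteins detected in that purification (assumed definition)."""
    saf = [count / length for count, length in zip(spectral_counts, lengths)]
    total = sum(saf)
    return [value / total for value in saf]

def zscore(values):
    """Z-score transform of one profile: subtract the mean, divide by the
    population standard deviation. Assumes the profile is not constant."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]

def euclidean(a, b):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_distance(a, b):
    """Distance defined as 1 - Pearson correlation coefficient."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(ss_a * ss_b)

# Hypothetical abundance profiles of two proteins across four purifications.
# profile_b has the same shape as profile_a but is ten times less abundant.
profile_a = [0.10, 0.20, 0.05, 0.40]
profile_b = [0.01, 0.02, 0.005, 0.04]

raw_euclidean = euclidean(profile_a, profile_b)                      # large: dominated by scale
scaled_euclidean = euclidean(zscore(profile_a), zscore(profile_b))   # near zero after Z-score
correlation_distance = pearson_distance(profile_a, profile_b)        # near zero: same shape
```

On the raw profiles the Euclidean distance is driven by the tenfold abundance difference, while the Pearson-based distance and the Euclidean distance on Z-scored profiles both treat the two proteins as nearly identical. Either distance could then be passed to a hierarchical clustering routine of the reader's choice.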
Pages: 2944-2952
Number of pages: 9