Improving gene set analysis of microarray data by SAM-GS

Cited by: 191
Authors
Dinu, Irina
Potter, John D.
Mueller, Thomas
Liu, Qi
Adewale, Adeniyi J.
Jhangri, Gian S.
Einecke, Gunilla
Famulski, Konrad S.
Halloran, Philip
Yasui, Yutaka [1]
Affiliations
[1] Univ Alberta, Sch Publ Hlth, Dept Publ Hlth Sci, Edmonton, AB T6G 2G3, Canada
[2] Fred Hutchinson Canc Res Ctr, Div Publ Hlth Sci, Seattle, WA 98109 USA
[3] Univ Alberta, Fac Med & Dent, Div Nephrol & Transplantat Immunol, Edmonton, AB T6G 2S2, Canada
DOI
10.1186/1471-2105-8-242
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Gene-set analysis evaluates the expression of biological pathways, or a priori defined gene sets, rather than that of individual genes, in association with a binary phenotype, and is of great biologic interest in many DNA microarray studies. Gene Set Enrichment Analysis (GSEA) has been applied widely as a tool for gene-set analyses. We describe here some critical problems with GSEA and propose an alternative method by extending the individual-gene analysis method, Significance Analysis of Microarray (SAM), to gene-set analyses (SAM-GS).
Results: Using a mouse microarray dataset with simulated gene sets, we illustrate that GSEA gives statistical significance to gene sets that have no gene associated with the phenotype (null gene sets), and has very low power to detect gene sets in which half the genes are moderately or strongly associated with the phenotype (truly-associated gene sets). SAM-GS, on the other hand, performs very well. The two methods are also compared in the analyses of three real microarray datasets and relevant pathways, the diverging results of which clearly show advantages of SAM-GS over GSEA, both statistically and biologically. In a microarray study for identifying biological pathways whose gene expressions are associated with p53 mutation in cancer cell lines, we found biologically relevant performance differences between the two methods. Specifically, SAM-GS identified as significant 31 additional pathways, not identified by GSEA, that are associated with the presence vs. absence of p53. Of the 31 gene sets, 11 actually involve p53 directly as a member. A further 6 gene sets directly involve the extrinsic and intrinsic apoptosis pathways, 3 involve the cell-cycle machinery, and 3 involve cytokines and/or JAK/STAT signaling. Each of these 12 gene sets, then, is in a direct, well-established relationship with aspects of p53 signaling. Of the remaining 8 gene sets, 6 have plausible, if less well established, links with p53.
Conclusion: We conclude that GSEA has important limitations as a gene-set analysis approach for microarray experiments for identifying biological pathways associated with a binary phenotype. As an alternative, statistically sound method, we propose SAM-GS. A free Excel Add-In for performing SAM-GS is available for public use.
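The abstract does not state the SAM-GS test statistic. As a rough illustration only, the sketch below assumes the commonly described form: the sum, over the genes in a set, of squared SAM-style d statistics, with significance assessed by permuting the phenotype labels. The function names, the default fudge constant `s0 = 0.1`, the toy data, and the `(hits + 1) / (n_perm + 1)` p-value convention are all assumptions made for this example. They are not taken from the record above or from the authors' Excel Add-In.

```python
import random
from statistics import mean, variance


def sam_d(values, labels, s0=0.1):
    """SAM-style d for one gene: group mean difference / (pooled SE + s0).

    s0 is a small positive constant; 0.1 is an arbitrary illustrative value.
    """
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    g0 = [v for v, lab in zip(values, labels) if lab == 0]
    n1, n0 = len(g1), len(g0)
    pooled_var = ((n1 - 1) * variance(g1) + (n0 - 1) * variance(g0)) / (n1 + n0 - 2)
    se = (pooled_var * (1 / n1 + 1 / n0)) ** 0.5
    return (mean(g1) - mean(g0)) / (se + s0)


def sam_gs_stat(gene_set, labels, s0=0.1):
    """Gene-set statistic: sum of squared d values over the genes in the set."""
    return sum(sam_d(row, labels, s0) ** 2 for row in gene_set)


def sam_gs_pvalue(gene_set, labels, n_perm=500, s0=0.1, seed=0):
    """Permutation p-value obtained by shuffling the phenotype labels."""
    rng = random.Random(seed)
    observed = sam_gs_stat(gene_set, labels, s0)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if sam_gs_stat(gene_set, shuffled, s0) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): rows are genes, columns are samples.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
associated_set = [
    [1.0, 1.2, 0.9, 1.1, 1.0, 3.0, 3.2, 2.9, 3.1, 3.0],
    [2.0, 2.1, 1.9, 2.2, 2.0, 4.0, 4.1, 3.9, 4.2, 4.0],
    [0.5, 0.6, 0.4, 0.5, 0.7, 2.5, 2.6, 2.4, 2.5, 2.7],
]
stat, p_value = sam_gs_pvalue(associated_set, labels)
print(f"statistic = {stat:.2f}, permutation p = {p_value:.4f}")
```

Because each gene's d is squared before summing, up- and down-regulated genes both add to the statistic rather than cancelling each other out.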
Pages: 13