Integrative analysis of multiple gene expression profiles with quality-adjusted effect size models

Cited by: 38
Authors
Hu, PZ
Greenwood, CMT
Beyene, J
Affiliations
[1] Hosp Sick Children, Res Inst, Toronto, ON M5G 1X8, Canada
[2] Univ Toronto, Dept Publ Hlth Sci, Toronto, ON M5S 1A8, Canada
DOI
10.1186/1471-2105-6-128
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: With the explosion of microarray studies, an enormous amount of data is being produced. Systematic integration of gene expression data from different sources increases the statistical power to detect differentially expressed genes and allows heterogeneity to be assessed. The challenge, however, lies in designing and implementing efficient analytic methodologies for combining data generated by different research groups.

Results: We extended traditional effect size models to combine information from different microarray datasets by incorporating a quality measure for each gene in each study into the effect size estimation. We illustrated our method by integrating two datasets generated using different Affymetrix oligonucleotide types. Our results indicate that the proposed quality-adjusted weighting strategy for modelling inter-study variation of gene expression profiles not only increases consistency and reduces heterogeneity between these two datasets, but also identifies many more differentially expressed genes than previously proposed methods.

Conclusion: Data integration and synthesis are becoming increasingly important. We live in a high-throughput era in which technologies constantly change, leaving behind a trail of data of different forms, shapes and sizes. Statistical and computational methodologies are therefore critical for extracting the most from these related but not identical sources of data.
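The abstract does not state the model's formulas, so the following is only a minimal sketch of what a quality-adjusted effect size combination could look like for a single gene. It assumes a fixed-effect setting in which each study's inverse-variance weight is multiplied by a quality score between 0 and 1. The function name `quality_adjusted_effect` and its parameters are hypothetical, and the weighting form is an assumption rather than the authors' confirmed method.

```python
import math


def quality_adjusted_effect(effects, variances, qualities):
    """Combine per-study effect sizes for one gene (fixed-effect sketch).

    effects   -- standardized effect size of the gene in each study
    variances -- sampling variance of each effect size (must be > 0)
    qualities -- per-study quality score in [0, 1] (assumed form)
    """
    if not (len(effects) == len(variances) == len(qualities)):
        raise ValueError("inputs must have the same length")

    # Assumed weighting: inverse-variance weight scaled by the quality score.
    weights = [q / v for q, v in zip(qualities, variances)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one study needs a positive quality score")

    # Weighted mean of the study-level effect sizes.
    mu = sum(w * y for w, y in zip(weights, effects)) / total

    # Var(mu) = sum(w_i^2 * v_i) / (sum w_i)^2, because Var(y_i) = v_i.
    var_mu = sum(w * w * v for w, v in zip(weights, variances)) / total ** 2

    # z statistic for testing whether the combined effect is zero.
    return mu, var_mu, mu / math.sqrt(var_mu)
```

With all quality scores equal to 1, this reduces to the usual inverse-variance weighted mean; a study with quality 0 drops out of the combination for that gene.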
Pages: 11