Improving the statistical detection of regulated genes from microarray data using intensity-based variance estimation (art. no. 17)

Cited by: 16
Authors
Comander, J
Natarajan, S
Gimbrone, MA
García-Cardeña, G
Affiliations
[1] Brigham & Womens Hosp, Dept Pathol, Ctr Excellence Vasc Biol, Boston, MA 02115 USA
[2] Harvard Univ, Sch Med, Dept Pathol, Boston, MA 02115 USA
[3] Harvard MIT Div Hlth Sci & Technol, Cambridge, MA 02139 USA
Keywords
DOI
10.1186/1471-2164-5-17
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Gene microarray technology makes it possible to study the regulation of thousands of genes simultaneously, but its potential is limited without an estimate of the statistical significance of the observed changes in gene expression. Because the number of genes tested is large and the number of array replicates is comparatively small (e.g., N = 3), standard statistical methods such as Student's t-test fail to produce reliable results. Two other statistical approaches commonly used to improve significance estimates are a penalized t-test and a Z-test using intensity-dependent variance estimates.
Results: The performance of these approaches is compared using a dataset of 23 replicates, and a new implementation of the Z-test is introduced that pools together the variance estimates of genes with similar minimum intensity. Significance estimates based on 3 replicate arrays are calculated using each statistical technique, and their accuracy is evaluated by comparing them to a reliable estimate based on the remaining 20 replicates. The reproducibility of each test statistic is evaluated by applying it to multiple, independent sets of 3 replicate arrays. Two implementations of a Z-test using intensity-dependent variance produce more reproducible results than two implementations of a penalized t-test. Furthermore, the minimum intensity-based Z-statistic demonstrates higher accuracy than, and precision higher than or equal to, all other statistical techniques tested.
Conclusion: An intensity-based variance estimation technique provides one simple, effective approach that can improve p-value estimates for differentially regulated genes derived from replicated microarray datasets. Implementations of the Z-test algorithms are available at http://vessels.bwh.harvard.edu/software/papers/bmcg2004.
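The idea of pooling variance estimates across genes with similar minimum intensity can be illustrated with a minimal sketch. This is not the authors' implementation (theirs is at the URL given in the abstract). The function name `pooled_z_scores`, the sliding-window pooling scheme, the `window` parameter, and the use of a plain average of per-gene sample variances are all assumptions made for illustration.

```python
import math
from statistics import mean, variance


def pooled_z_scores(log_ratios, min_intensity, window=50):
    """Illustrative intensity-pooled Z-test (hypothetical helper).

    log_ratios:    one list of replicate log ratios per gene (>= 2 replicates each)
    min_intensity: one minimum-intensity value per gene, used only to order genes
    window:        assumed number of intensity-ranked neighbours to pool over
    Returns a list of (z, p) tuples in the original gene order.
    """
    n_genes = len(log_ratios)
    # Rank genes by minimum intensity so that neighbours have similar intensity.
    order = sorted(range(n_genes), key=lambda g: min_intensity[g])
    # Noisy per-gene sample variances; these are what gets pooled.
    gene_var = [variance(r) for r in log_ratios]

    results = [None] * n_genes
    half = window // 2
    for rank, g in enumerate(order):
        lo = max(0, rank - half)
        hi = min(n_genes, rank + half + 1)
        # Pooled variance: average over genes with similar minimum intensity.
        pooled_var = mean(gene_var[order[k]] for k in range(lo, hi))
        n = len(log_ratios[g])
        se = math.sqrt(pooled_var / n)
        z = mean(log_ratios[g]) / se if se > 0 else 0.0
        # Two-sided p-value under a standard normal null distribution.
        p = math.erfc(abs(z) / math.sqrt(2))
        results[g] = (z, p)
    return results


# Toy data: four genes, three replicate arrays each (values are made up).
ratios = [[0.1, -0.1, 0.0], [0.2, 0.0, -0.2], [2.0, 2.1, 1.9], [0.05, -0.05, 0.0]]
intensities = [100, 120, 5000, 110]
for gene, (z, p) in enumerate(pooled_z_scores(ratios, intensities, window=3)):
    print(gene, round(z, 2), p)
```

With only 3 replicates, a single gene's sample variance is very unstable. Averaging it with the variances of genes at similar intensity gives a smoother estimate, which is why the sketch uses a Z-statistic rather than a per-gene t-statistic.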
Pages: 21