Statistical significance analysis of longitudinal gene expression data

Cited by: 24
Authors
Guo, X
Qi, HL
Verfaillie, CM
Pan, W
Affiliations
[1] Univ Minnesota, Sch Publ Hlth, Div Biostat, Minneapolis, MN 55455 USA
[2] Univ Minnesota, Sch Med, Dept Med, Minneapolis, MN 55445 USA
DOI
10.1093/bioinformatics/btg206
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Time-course microarray experiments are designed to study biological processes in a temporal fashion. Longitudinal gene expression data arise when biological samples taken from the same subject at different time points are used to measure the gene expression levels. It has been observed that the gene expression patterns of samples of a given tumor measured at different time points are likely to be much more similar to each other than are the expression patterns of tumor samples of the same type taken from different subjects. In statistics, this phenomenon is called the within-subject correlation of repeated measurements on the same subject, and the resulting data are called longitudinal data. It is well known in other applications that valid statistical analyses have to appropriately take account of the possible within-subject correlation in longitudinal data.

Results: We apply estimating equation techniques to construct a robust statistic, which is a variant of the robust Wald statistic and accounts for the potential within-subject correlation of longitudinal gene expression data, to detect genes with temporal changes in expression. We associate significance levels to the proposed statistic by either incorporating the idea of the significance analysis of microarrays method or using the mixture model method to identify significant genes. The utility of the statistic is demonstrated by applying it to an important study of osteoblast lineage-specific differentiation. Using simulated data, we also show pitfalls in drawing statistical inference when the within-subject correlation in longitudinal gene expression data is ignored.
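
Illustrative note: the abstract does not spell out the authors' variant of the statistic, so the following is only a minimal sketch of the standard robust (sandwich) Wald statistic that such a variant starts from. It assumes one mean per time point, an independence working correlation, and complete data for every subject. The function name `robust_wald` and the simulated data are hypothetical and do not come from the paper.

```python
import numpy as np


def robust_wald(y):
    """Standard robust Wald statistic for H0: equal mean expression at all
    time points, for a single gene.

    y : array of shape (n_subjects, n_times), one row per subject.
    Assumption (not from the paper): independence working correlation, so
    the estimating-equation solution is simply the per-time-point mean.
    Needs n_subjects >= n_times for the contrast covariance to be invertible.
    """
    y = np.asarray(y, dtype=float)
    n, t = y.shape
    beta = y.mean(axis=0)            # estimated mean at each time point
    resid = y - beta                 # per-subject residual vectors
    # Sandwich covariance of beta: sum_i r_i r_i^T / n^2. The off-diagonal
    # terms carry the within-subject correlation.
    cov = resid.T @ resid / n**2
    # Contrast matrix comparing every later time point with the first.
    L = np.hstack([-np.ones((t - 1, 1)), np.eye(t - 1)])
    d = L @ beta
    return float(d @ np.linalg.solve(L @ cov @ L.T, d))


# Hypothetical simulated gene: 8 subjects, 4 time points, a shared
# subject-level offset that induces within-subject correlation.
rng = np.random.default_rng(0)
subject_effect = rng.normal(size=(8, 1))
expr = subject_effect + rng.normal(scale=0.5, size=(8, 4))
stat = robust_wald(expr)
```

Because the rows of the contrast matrix sum to zero, adding a constant to all measurements of one subject leaves the statistic unchanged, which is one way to see that it accounts for subject-level correlation rather than treating the repeated measurements as independent samples.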
Pages: 1628-1635
Page count: 8