Semilinear high-dimensional model for normalization of microarray data: A theoretical analysis and partial consistency

Cited by: 54
Authors
Fan, JQ [1]
Peng, H
Huang, T
Affiliations
[1] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544 USA
[2] Yale Univ, Sch Med, Dept Epidemiol & Publ Hlth, New Haven, CT 06511 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
aggregation; cDNA microarray; in-slide replications; normalization; partial consistency; semiparametric models; SLIM;
DOI
10.1198/016214504000001781
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Normalization of microarray data is essential for removing experimental biases and revealing meaningful biological results. Motivated by a problem of normalizing microarray data, a semilinear in-slide model (SLIM) has been proposed. To aggregate information from other arrays, SLIM is generalized to account for across-array information, resulting in an even more dynamic semiparametric regression model. This model can be used to normalize microarray data even when there is no replication within an array. We demonstrate that this semiparametric model has a number of interesting features. The parametric component and the nonparametric component that are of primary interest can be consistently estimated, the former at a parametric rate and the latter at a nonparametric rate, whereas the nuisance parameters cannot be consistently estimated. This is an interesting extension of the partial consistency phenomenon, which is itself of theoretical interest. The asymptotic normality of the parametric component and the rate of convergence of the nonparametric component are established. The results are augmented by simulation studies and illustrated by an application to the cDNA microarray analysis of neuroblastoma cells in response to the macrophage migration inhibitory factor.
Pages: 781-796
Number of pages: 16