Covariance matrix selection and estimation via penalised normal likelihood

Cited by: 276
Authors
Huang, JZ [1]
Liu, NP
Pourahmadi, M
Liu, LX
Affiliations
[1] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
[2] Univ Penn, Dept Stat, Philadelphia, PA 19104 USA
[3] No Illinois Univ, Div Stat, De Kalb, IL 60115 USA
[4] Columbia Univ, Mailman Sch Publ Hlth, Dept Biostat, New York, NY 10032 USA
Keywords
Cholesky decomposition; crossvalidation; LASSO; L-p penalty; model selection; penalised likelihood; shrinkage
DOI
10.1093/biomet/93.1.85
Chinese Library Classification number
Q [Biological sciences]
Subject classification codes
07; 0710; 09
Abstract
We propose a nonparametric method for identifying parsimony and for producing a statistically efficient estimator of a large covariance matrix. We reparameterise a covariance matrix through the modified Cholesky decomposition of its inverse or the one-step-ahead predictive representation of the vector of responses and reduce the nonintuitive task of modelling covariance matrices to the familiar task of model selection and estimation for a sequence of regression models. The Cholesky factor containing these regression coefficients is likely to have many off-diagonal elements that are zero or close to zero. Penalised normal likelihoods in this situation with L1 and L2 penalties are shown to be closely related to Tibshirani's (1996) LASSO approach and to ridge regression. Adding either penalty to the likelihood helps to produce more stable estimators by introducing shrinkage to the elements in the Cholesky factor, while, because of its singularity, the L1 penalty will set some elements to zero and produce interpretable models. An algorithm is developed for computing the estimator and selecting the tuning parameter. The proposed maximum penalised likelihood estimator is illustrated using simulation and a real dataset involving estimation of a 102 × 102 covariance matrix.
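
The reparameterisation described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes centred data, regresses each response on its predecessors with a fixed ridge (L2) penalty applied row by row, and reads off a unit lower-triangular factor T and innovation variances d such that T Σ Tᵀ = diag(d). The function names `modified_cholesky_ridge` and `covariance_from_cholesky` and the parameter `lam` are hypothetical, and the L1 case and the tuning-parameter selection are left out.

```python
import numpy as np


def modified_cholesky_ridge(Y, lam=0.0):
    """Sequential ridge regressions on centred data Y of shape (n, p).

    Returns (T, d): T is unit lower triangular with the negated
    regression coefficients below the diagonal, and d holds the
    innovation (residual) variances. Simplified illustration only:
    the same penalty lam is used for every row.
    """
    n, p = Y.shape
    T = np.eye(p)
    d = np.empty(p)
    d[0] = Y[:, 0] @ Y[:, 0] / n  # first response has no predecessors
    for t in range(1, p):
        X = Y[:, :t]  # predecessors y_1, ..., y_{t-1}
        y = Y[:, t]
        # Ridge solution (X'X + lam I)^{-1} X'y; lam = 0 gives least squares.
        phi = np.linalg.solve(X.T @ X + lam * np.eye(t), X.T @ y)
        resid = y - X @ phi
        T[t, :t] = -phi
        d[t] = resid @ resid / n
    return T, d


def covariance_from_cholesky(T, d):
    """Rebuild Sigma = T^{-1} diag(d) T^{-T} from the factor and variances."""
    T_inv = np.linalg.inv(T)
    return T_inv @ np.diag(d) @ T_inv.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y = rng.standard_normal((200, 5)).cumsum(axis=1)  # serially dependent columns
    Y = Y - Y.mean(axis=0)
    T, d = modified_cholesky_ridge(Y, lam=5.0)
    print(np.round(T, 2))
    print(np.round(covariance_from_cholesky(T, d), 2))
```

With `lam=0` and more observations than variables, the rebuilt matrix coincides with the sample covariance; a positive `lam` shrinks the off-diagonal elements of T towards zero, which is the stabilising effect the abstract attributes to the penalty.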
Pages: 85-98
Number of pages: 14
Related papers
25 in total
[1]  
Anderson TW., 2003, INTRO MULTIVARIATE S
[2]  
[Anonymous], 2002, ANAL LONGITUDINAL DA
[3]   Spectral models for covariance matrices [J].
Boik, RJ .
BIOMETRIKA, 2002, 89 (01) :159-182
[4]   Statistical analysis of a telephone call center: A queueing-science perspective [J].
Brown, L ;
Gans, N ;
Mandelbaum, A ;
Sakov, A ;
Shen, HP ;
Zeltyn, S ;
Zhao, L .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2005, 100 (469) :36-50
[5]   The matrix logarithmic covariance model [J].
Chiu, TYM ;
Leonard, T ;
Tsui, KW .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1996, 91 (433) :198-210
[6]   SMOOTHING NOISY DATA WITH SPLINE FUNCTIONS [J].
WAHBA, G .
NUMERISCHE MATHEMATIK, 1975, 24 (05) :383-393
[7]   COVARIANCE SELECTION [J].
DEMPSTER, AP .
BIOMETRICS, 1972, 28 (01) :157-&
[8]   Nonparametric estimation of covariance structure in longitudinal data [J].
Diggle, PJ ;
Verbyla, AP .
BIOMETRICS, 1998, 54 (02) :401-415
[9]   Variable selection via nonconcave penalized likelihood and its oracle properties [J].
Fan, JQ ;
Li, RZ .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (456) :1348-1360
[10]   A STATISTICAL VIEW OF SOME CHEMOMETRICS REGRESSION TOOLS [J].
FRANK, IE ;
FRIEDMAN, JH .
TECHNOMETRICS, 1993, 35 (02) :109-135