Prediction of multivariate responses with a selected number of principal components

Cited by: 10
Authors
Koch, Inge [1]
Naito, Kanta
Affiliations
[1] Univ Adelaide, Sch Math Sci, Adelaide, SA 5005, Australia
Keywords
Dimension selection; Principal component regression; Supervised learning; Variable ranking; REGRESSION; VARIABLES
DOI
10.1016/j.csda.2010.01.030
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This paper proposes a new method and algorithm for predicting multivariate responses in a regression setting. Research into the classification of high dimension low sample size (HDLSS) data, in particular microarray data, has made considerable advances, but regression prediction for high-dimensional data with continuous responses has received less attention. Recently, Bair et al. (2006) proposed an efficient prediction method based on supervised principal component regression (PCR). Motivated by the fact that using a larger number of principal components results in better regression performance, this paper extends the method of Bair et al. in several ways: a comprehensive variable ranking is combined with a selection of the best number of components for PCR, and the new method further extends to regression with multivariate responses. The new method is particularly suited to addressing HDLSS problems. Applications to simulated and real data demonstrate the performance of the new method. Comparisons with the findings of Bair et al. (2006) show that, for high-dimensional data in particular, the new ranking results in a smaller number of predictors and smaller errors. (C) 2010 Elsevier B.V. All rights reserved.
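The general idea in the abstract (rank the predictors, keep the top-ranked ones, then run PCR with a chosen number of components on a multivariate response) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the ranking score (summed squared correlation with the response columns), the hold-out selection of the number of predictors `p` and components `k`, and all function names are hypothetical choices made for this example.

```python
import numpy as np

def rank_variables(X, Y):
    """Rank predictors by summed squared correlation with the columns of Y.
    Assumption: this score is an illustrative stand-in for the paper's ranking."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Xs = Xc / (np.linalg.norm(Xc, axis=0) + 1e-12)
    Ys = Yc / (np.linalg.norm(Yc, axis=0) + 1e-12)
    score = ((Xs.T @ Ys) ** 2).sum(axis=1)
    return np.argsort(score)[::-1]  # best-ranked predictor first

def fit_pcr(X, Y, keep, k):
    """Principal component regression on the predictors in `keep`, using k components."""
    x_mean = X[:, keep].mean(axis=0)
    y_mean = Y.mean(axis=0)
    Xc = X[:, keep] - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                      # loadings of the first k components
    coef, *_ = np.linalg.lstsq(Xc @ V, Y - y_mean, rcond=None)
    return x_mean, y_mean, V, coef

def predict_pcr(model, X, keep):
    x_mean, y_mean, V, coef = model
    return (X[:, keep] - x_mean) @ V @ coef + y_mean

def select_p_and_k(X_tr, Y_tr, X_val, Y_val, p_grid, k_max):
    """Choose the number of predictors p and components k by hold-out error.
    Assumption: a simple validation split replaces the paper's selection rule."""
    order = rank_variables(X_tr, Y_tr)
    best = None
    for p in p_grid:
        keep = order[:p]
        for k in range(1, min(k_max, p, X_tr.shape[0] - 1) + 1):
            model = fit_pcr(X_tr, Y_tr, keep, k)
            err = np.mean((predict_pcr(model, X_val, keep) - Y_val) ** 2)
            if best is None or err < best[0]:
                best = (err, p, k, keep, model)
    return best

# Simulated HDLSS-style data: 200 predictors, 80 samples, 3 responses.
rng = np.random.default_rng(0)
n, d = 80, 200
Z = rng.normal(size=(n, 2))                            # two latent factors
X = rng.normal(size=(n, d))
X[:, :10] += 3.0 * (Z @ rng.normal(size=(2, 10)))      # only 10 informative predictors
Y = Z @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(n, 3))

p_grid = [5, 10, 20, 50]
err, p, k, keep, model = select_p_and_k(X[:60], Y[:60], X[60:], Y[60:], p_grid, k_max=5)
baseline = np.mean((Y[60:] - Y[:60].mean(axis=0)) ** 2)  # error of predicting the training mean
print(f"selected p={p}, k={k}, validation MSE={err:.3f}, baseline MSE={baseline:.3f}")
```

Ranking once on the training split and searching a small grid keeps the sketch short; in practice, cross-validation over both `p` and `k` would be a more careful choice than a single hold-out split.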
Pages: 1791-1807
Page count: 17
Related papers
20 in total
[1]  
[Anonymous], 1980, Multivariate Analysis
[2]   Prediction by supervised principal components [J].
Bair, E ;
Hastie, T ;
Paul, D ;
Tibshirani, R .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2006, 101 (473) :119-137
[3]   Predicting multivariate responses in multiple linear regression [J].
Breiman, L ;
Friedman, JH .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1997, 59 (01) :3-37
[4]   Elimination of uninformative variables for multivariate calibration [J].
Centner, V ;
Massart, DL ;
deNoord, OE ;
deJong, S ;
Vandeginste, BM ;
Sterna, C .
ANALYTICAL CHEMISTRY, 1996, 68 (21) :3851-3858
[5]   PROJECTION PURSUIT ALGORITHM FOR EXPLORATORY DATA-ANALYSIS [J].
FRIEDMAN, JH ;
TUKEY, JW .
IEEE TRANSACTIONS ON COMPUTERS, 1974, C 23 (09) :881-890
[6]   EXPLORATORY PROJECTION PURSUIT [J].
FRIEDMAN, JH .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1987, 82 (397) :249-266
[7]  
Gilmour S, 2004, PROCEEDINGS OF THE 2004 INTELLIGENT SENSORS, SENSOR NETWORKS & INFORMATION PROCESSING CONFERENCE, P271
[8]   Independent component analysis yields chemically interpretable latent variables in multivariate regression [J].
Gustafsson, MG .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2005, 45 (05) :1244-1255
[9]   Forward stagewise regression and the monotone lasso [J].
Hastie, Trevor ;
Taylor, Jonathan ;
Tibshirani, Robert ;
Walther, Guenther .
ELECTRONIC JOURNAL OF STATISTICS, 2007, 1 :1-29
[10]  
HELLAND IS, 1990, SCAND J STAT, V17, P97