Neural-network design for small training sets of high dimension

Cited by: 50
Authors
Yuan, JL [1]
Fine, TL
Affiliations
[1] Natl Chung Hsing Univ, Dept Stat, Taipei 10433, Taiwan
[2] Cornell Univ, Sch Elect Engn, Ithaca, NY 14853 USA
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1998 / Vol. 9 / No. 2
Funding
National Science Foundation (USA);
Keywords
architecture selection; difference-based variance estimation; feature/input selection; neural network design; short-term load forecasting; slicing inverse regression;
DOI
10.1109/72.661122
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce a statistically based methodology for the design of neural networks when the dimension d of the network input is comparable to the size n of the training set. If one proceeds straightforwardly, then one is committed to a network of complexity exceeding n. The result will be good performance on the training set but poor generalization performance when the network is presented with new data. To avoid this, we need to select the network architecture carefully, including control over the input variables. Our approach to selecting a network architecture first selects a subset of input variables (features) using the nonparametric statistical process of difference-based variance estimation and then selects a simple network architecture using projection pursuit regression (PPR) ideas combined with the statistical idea of slicing inverse regression (SIR). The resulting network, which is then retrained without regard to the PPR/SIR determined parameters, is one of moderate complexity (number of parameters significantly less than n) whose performance on the training set can be expected to generalize well. The application of this methodology is illustrated in detail in the context of short-term forecasting of the demand for electric power from an electric utility.
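The feature-selection step named in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a simple nearest-neighbor form of difference-based variance estimation (half the mean squared difference between each response and the response of its nearest neighbor in the candidate input subspace) and a greedy forward search. The function names `nn_difference_variance` and `forward_select` are hypothetical and chosen only for this illustration.

```python
import math


def nn_difference_variance(X, y, features):
    """Rough noise-variance estimate for y given the input columns in `features`.

    Assumption (not taken from the paper): each sample is paired with its
    nearest neighbor in the chosen subspace, and the estimate is half the
    mean squared difference of the paired responses.
    """
    n = len(X)
    total = 0.0
    for i in range(n):
        best_j, best_d = None, math.inf
        for j in range(n):
            if j == i:
                continue
            # Squared Euclidean distance restricted to the candidate features.
            d = sum((X[i][k] - X[j][k]) ** 2 for k in features)
            if d < best_d:
                best_j, best_d = j, d
        total += (y[i] - y[best_j]) ** 2
    return total / (2.0 * n)


def forward_select(X, y, max_features):
    """Greedy forward selection: add the input that most lowers the estimate.

    Stops when no remaining input reduces the estimated variance or when
    `max_features` inputs have been chosen.
    """
    chosen = []
    remaining = list(range(len(X[0])))
    best_score = math.inf
    while remaining and len(chosen) < max_features:
        score, k = min(
            (nn_difference_variance(X, y, chosen + [k]), k) for k in remaining
        )
        if score >= best_score:
            break
        chosen.append(k)
        remaining.remove(k)
        best_score = score
    return chosen, best_score
```

Calling `forward_select(X, y, max_features=3)` on a list of input rows `X` and responses `y` returns the chosen column indices and the final variance estimate. A low estimate suggests the chosen inputs explain most of the response, which is the sense in which a variance estimate can guide input selection before any network is trained.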
Pages: 266-280
Page count: 15