Combining multiple feature selection methods for stock prediction: Union, intersection, and multi-intersection approaches

Cited by: 243
Authors
Tsai, Chih-Fong [1]
Hsiao, Yu-Chieh [1]
Affiliations
[1] National Central University, Department of Information Management, Chungli, Taiwan
Keywords
Stock prediction; Feature selection; Data mining; Principal Component Analysis; Genetic algorithm; Decision trees; NEURAL-NETWORKS; GENETIC ALGORITHMS; COMPONENT ANALYSIS; IMPLEMENTATION; OPTIMIZATION; PERFORMANCE; INVESTMENT; PARAMETERS; RETURNS
DOI
10.1016/j.dss.2010.08.028
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Predicting stock prices effectively for investors is an important research problem. In the literature, data mining techniques have been applied to stock (market) prediction. Feature selection, a pre-processing step of data mining, aims to filter out unrepresentative variables from a given dataset for effective prediction. Because different feature selection methods select different features and thus affect prediction performance, the purpose of this paper is to combine multiple feature selection methods to identify more representative variables for better prediction. In particular, three well-known feature selection methods are used: Principal Component Analysis (PCA), Genetic Algorithms (GA), and decision trees (CART). The combination methods for filtering out unrepresentative variables are based on union, intersection, and multi-intersection strategies. A back-propagation neural network is developed as the prediction model. Experimental results show that the intersection of PCA and GA and the multi-intersection of PCA, GA, and CART perform best, with accuracies of 79% and 78.98%, respectively. In addition, these two combined feature selection methods filter out roughly 80% of the 85 original variables as unrepresentative, leaving 14 and 17 important features, respectively. These variables are important factors for stock prediction and can be used for future investment decisions. (C) 2010 Elsevier B.V. All rights reserved.
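The combination strategies named in the abstract can be illustrated with plain set operations. The following is a minimal sketch, not the authors' implementation. The feature names and the three selections are hypothetical. The abstract does not define "multi-intersection"; because it reports more features for the multi-intersection (17) than for the PCA and GA intersection (14), a strict three-way intersection seems unlikely, so the sketch assumes one reading consistent with those counts: keep any feature chosen by at least two of the three methods.

```python
from collections import Counter


def union(*selections):
    """Features chosen by at least one method."""
    return set().union(*selections)


def intersection(*selections):
    """Features chosen by every listed method."""
    sets = [set(s) for s in selections]
    return set.intersection(*sets) if sets else set()


def selected_by_at_least(k, *selections):
    """Features chosen by at least k methods.

    Assumption: with k=2 this stands in for the abstract's
    "multi-intersection", which the abstract itself does not define.
    """
    counts = Counter(f for s in selections for f in set(s))
    return {f for f, n in counts.items() if n >= k}


# Hypothetical outputs of the three feature selection methods.
pca = {"ma_5", "rsi", "volume", "eps"}
ga = {"rsi", "volume", "pe_ratio"}
cart = {"volume", "eps", "pe_ratio"}

print(sorted(union(pca, ga)))                       # PCA or GA
print(sorted(intersection(pca, ga)))                # PCA and GA
print(sorted(selected_by_at_least(2, pca, ga, cart)))
```

Whichever subset is retained would then be passed as input variables to the prediction model, which in the paper is a back-propagation neural network.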
Pages: 258-269
Number of pages: 12