Predicting business failure using classification and regression tree: An empirical comparison with popular classical statistical methods and top classification mining methods

Cited by: 82
Authors
Li, Hui [1]
Sun, Jie [1]
Wu, Jian [1]
Affiliations
[1] Zhejiang Normal Univ, Sch Business Adm, Jinhua City 321004, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Business failure prediction (BFP); Data mining; Classification and regression tree (CART); SUPPORT VECTOR MACHINES; BANKRUPTCY PREDICTION; FINANCIAL RATIOS; NEURAL-NETWORK; ENSEMBLE; MODEL; ALGORITHMS; SYSTEM;
DOI
10.1016/j.eswa.2010.02.016
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Predicting business failure is a critical task for government officials, stockholders, managers, employees, investors, and researchers, especially in today's competitive economic environment. Several of the top 10 data mining methods, e.g., support vector machine and k nearest neighbor, have become popular alternatives in business failure prediction (BFP). Compared with other classification mining methods, the advantages of classification and regression tree (CART) methods include simplicity of results, easy implementation, nonlinear estimation, being non-parametric, accuracy, and stability. However, few studies in the area of BFP have examined the applicability of CART, another method among the top 10 algorithms in data mining. The aim of this research is to explore the performance of BFP using the commonly discussed data mining technique of CART. To demonstrate the effectiveness of BFP using CART, business failure prediction tasks were performed on a data set collected from companies listed on the Shanghai Stock Exchange and the Shenzhen Stock Exchange. A hold-out method repeated thirty times was used for assessment. Two commonly used methods among the top 10 data mining algorithms, i.e., support vector machine and k nearest neighbor, and two baseline benchmark methods from statistics, i.e., multiple discriminant analysis (MDA) and logistic regression, were employed as comparative methods. For the comparative methods, the stepwise method of MDA was employed to select the optimal feature subset. Empirical results indicated that the optimal CART algorithm outperforms all the comparative methods in terms of predictive performance and significance tests in short-term BFP of Chinese listed companies. © 2010 Elsevier Ltd. All rights reserved.
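The evaluation protocol named in the abstract (a CART-style tree assessed by a hold-out split repeated thirty times) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gini split criterion, the depth limit, the 30% test fraction, the synthetic two-feature data, and all function names are assumptions made for illustration, and the comparative methods (SVM, k nearest neighbor, MDA, logistic regression) are omitted.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a binary tree recursively; a leaf is the majority class (0 or 1)."""
    majority = int(sum(labels) * 2 >= len(labels))
    if depth >= max_depth or gini(labels) == 0.0:
        return majority
    split = best_split(rows, labels)
    if split is None:
        return majority
    f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in lo], [y for _, y in lo], depth + 1, max_depth),
            build_tree([r for r, _ in hi], [y for _, y in hi], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return its class."""
    while isinstance(tree, tuple):
        f, t, lo, hi = tree
        tree = lo if row[f] <= t else hi
    return tree

def repeated_holdout(rows, labels, repeats=30, test_frac=0.3, seed=0):
    """Reshuffle, split, train, and score `repeats` times; return the accuracies."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    accs = []
    for _ in range(repeats):
        rng.shuffle(idx)
        cut = int(len(idx) * test_frac)
        test, train = idx[:cut], idx[cut:]
        tree = build_tree([rows[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(tree, rows[i]) == labels[i] for i in test)
        accs.append(hits / len(test))
    return accs

def make_data(n=120, seed=1):
    """Synthetic stand-in for financial ratios; label 1 marks a 'failed' firm."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        failed = i % 2
        centre = -1.0 if failed else 1.0
        rows.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        labels.append(failed)
    return rows, labels

rows, labels = make_data()
accs = repeated_holdout(rows, labels)
print(f"mean hold-out accuracy over {len(accs)} runs: {sum(accs) / len(accs):.3f}")
```

Repeating the split thirty times yields a sample of accuracies per method, which is what makes a significance test between methods possible; a single hold-out split would give only one number per method.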
Pages: 5895-5904
Page count: 10