Detecting Accounting Fraud in Publicly Traded US Firms Using a Machine Learning Approach

Cited by: 205
Authors
Bao, Yang [1]
Ke, Bin [2]
Li, Bin [3]
Yu, Y. Julia [4]
Zhang, Jie [5]
Affiliations
[1] Shanghai Jiao Tong Univ, Antai Coll Econ & Management, Shanghai, Peoples R China
[2] Natl Univ Singapore, NUS Business Sch, Dept Accounting, Singapore, Singapore
[3] Wuhan Univ, Econ & Management Sch, Dept Finance, Wuhan, Peoples R China
[4] Univ Virginia, McIntire Sch Commerce, Charlottesville, VA 22903 USA
[5] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore
Funding
National Natural Science Foundation of China;
Keywords
C53; M41; fraud prediction; machine learning; ensemble learning; FINANCIAL-STATEMENTS; EQUITY INCENTIVES; MANAGEMENT FRAUD; CORPORATE FRAUD;
DOI
10.1111/1475-679X.12292
Chinese Library Classification number
F8 [Public Finance, Finance];
Discipline classification code
0202 ;
Abstract
We develop a state-of-the-art fraud prediction model using a machine learning approach. We demonstrate the value of combining domain knowledge and machine learning methods in model building. We select our model input based on existing accounting theories, but we differ from prior accounting research by using raw accounting numbers rather than financial ratios. We employ one of the most powerful machine learning methods, ensemble learning, rather than the commonly used method of logistic regression. To assess the performance of fraud prediction models, we introduce a new performance evaluation metric commonly used in ranking problems that is more appropriate for the fraud prediction task. Starting with an identical set of theory-motivated raw accounting numbers, we show that our new fraud prediction model outperforms two benchmark models by a large margin: the Dechow et al. logistic regression model based on financial ratios, and the Cecchini et al. support-vector-machine model with a financial kernel that maps raw accounting numbers into a broader set of ratios.
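The abstract does not name the ranking metric it introduces. As an illustration only, the sketch below assumes normalized discounted cumulative gain at a cutoff k (NDCG@k), a standard metric in ranking problems; the function names, the cutoff, and the toy scores and labels are hypothetical and not taken from the paper. It shows why a ranking metric suits a rare-event task: only the firms ranked at the top of the predicted list, the ones an investigator would actually examine, contribute to the score.

```python
import math


def dcg_at_k(labels_in_ranked_order, k):
    """Discounted cumulative gain over the first k positions.

    Each label is a relevance value (here 1 = fraud, 0 = no fraud);
    hits lower in the ranking are discounted by log2(position + 1).
    """
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(labels_in_ranked_order[:k])
    )


def ndcg_at_k(scores, labels, k):
    """NDCG@k: DCG of the model's ranking divided by the ideal DCG.

    `scores` are predicted fraud probabilities (hypothetical values),
    `labels` are the true outcomes for the same observations.
    """
    # Order the true labels by descending predicted score.
    ranked = [
        lab
        for _, lab in sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    ]
    # The ideal ranking places every fraud case first.
    ideal_dcg = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    # Toy data: five firm-years, two of which are fraudulent.
    scores = [0.9, 0.1, 0.8, 0.3, 0.2]
    labels = [0, 0, 1, 0, 1]
    print(ndcg_at_k(scores, labels, k=2))
```

A model that ranks every fraudulent observation ahead of every non-fraudulent one scores 1.0 under this definition, regardless of how rare fraud is in the sample.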
Pages: 199-235
Number of pages: 37
Related papers
59 in total
[41]   Comparing Boosting and Bagging Techniques With Noisy and Imbalanced Data [J].
Khoshgoftaar, Taghi M. ;
Van Hulse, Jason ;
Napolitano, Amri .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2011, 41 (03) :552-568
[42]   Prediction Policy Problems [J].
Kleinberg, Jon ;
Ludwig, Jens ;
Mullainathan, Sendhil ;
Obermeyer, Ziad .
AMERICAN ECONOMIC REVIEW, 2015, 105 (05) :491-495
[43]   Detecting Deceptive Discussions in Conference Calls [J].
Larcker, David F. ;
Zakolyukina, Anastasia A. .
JOURNAL OF ACCOUNTING RESEARCH, 2012, 50 (02) :495-540
[44]   The Information Content of Forward-Looking Statements in Corporate Filings-A Naive Bayesian Machine Learning Approach [J].
Li, Feng .
JOURNAL OF ACCOUNTING RESEARCH, 2010, 48 (05) :1049-1102
[45]  
Liu XY, 2013, IMBALANCED LEARNING: FOUNDATIONS, ALGORITHMS, AND APPLICATIONS, P61
[46]   Exploratory Undersampling for Class-Imbalance Learning [J].
Liu, Xu-Ying ;
Wu, Jianxin ;
Zhou, Zhi-Hua .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2009, 39 (02) :539-550
[47]   Earnings management and annual report readability [J].
Lo, Kin ;
Ramos, Felipe ;
Rogo, Rafael .
JOURNAL OF ACCOUNTING & ECONOMICS, 2017, 63 (01) :1-25
[48]  
Murphy K.J., 1999, HDB LABOR EC, VIII., P2485
[49]   Financial Statement Fraud Detection: An Analysis of Statistical and Machine Learning Algorithms [J].
Perols, Johan .
AUDITING-A JOURNAL OF PRACTICE & THEORY, 2011, 30 (02) :19-50
[50]   Finding Needles in a Haystack: Using Data Analytics to Improve Fraud Prediction [J].
Perols, Johan L. ;
Bowen, Robert M. ;
Zimmermann, Carsten ;
Samba, Basamba .
ACCOUNTING REVIEW, 2017, 92 (02) :221-245