Simultaneously Discovering and Quantifying Risk Types from Textual Risk Disclosures

Cited by: 272
Authors
Bao, Yang [1]
Datta, Anindya [1]
Affiliations
[1] Natl Univ Singapore, Dept Informat Syst, Singapore 117417, Singapore
Keywords
topic modeling; latent Dirichlet allocation; text analysis; econometric analysis; risk disclosures; TOPIC MODEL; INFORMATION-CONTENT; VOLATILITY; VOLUME; PRESS; COST; TONE;
DOI
10.1287/mnsc.2014.1930
Chinese Library Classification number
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
Managers and researchers alike have long recognized the importance of corporate textual risk disclosures. Yet it is a nontrivial task to discover and quantify variables of interest from unstructured text. In this paper, we develop a variation of the latent Dirichlet allocation topic model and its learning algorithm for simultaneously discovering and quantifying risk types from textual risk disclosures. We conduct comprehensive evaluations in terms of both conventional statistical fit and substantive fit with respect to the quality of discovered information. Experimental results show that our proposed method outperforms all competing methods, and could find more meaningful topics (risk types). By taking advantage of our proposed method for measuring risk types from textual data, we study how risk disclosures in 10-K forms affect the risk perceptions of investors. Different from prior studies, our results provide support for all three competing arguments regarding whether and how risk disclosures affect the risk perceptions of investors, depending on the specific risk types disclosed. We find that around two-thirds of risk types lack informativeness and have no significant influence. Moreover, we find that the informative risk types do not necessarily increase the risk perceptions of investors: the disclosure of three types of systematic and liquidity risks will increase the risk perceptions of investors, whereas the other five types of unsystematic risks will decrease them. Data, as supplemental material, are available at http://dx.doi.org/10.1287/mnsc.2014.1930.
Pages: 1371-1391
Page count: 21
Related papers
51 in total
[1]  
[Anonymous], 2011, P 5 INT JOINT C NATU
[2]  
[Anonymous], 2003, P 26 ANN INT ACM SIG
[3]  
Aral S., 2011, Thirty Second International Conference on Information Systems, P1
[4]  
Asuncion A., 2009, C UNC ART INT UAI QU, P27, DOI 10.1080/10807030390248483
[5]  
Bin Lu, 2011, 2011 IEEE International Conference on Data Mining Workshops, P81, DOI 10.1109/ICDMW.2011.125
[6]   A CORRELATED TOPIC MODEL OF SCIENCE [J].
Blei, David M. ;
Lafferty, John D. .
ANNALS OF APPLIED STATISTICS, 2007, 1 (01) :17-35
[7]   Variational Inference for Dirichlet Process Mixtures [J].
Blei, David M. ;
Jordan, Michael I. .
BAYESIAN ANALYSIS, 2006, 1 (01) :121-143
[8]   Latent Dirichlet allocation [J].
Blei, DM ;
Ng, AY ;
Jordan, MI .
JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) :993-1022
[9]  
Brody S., 2010, P HUMAN LANGUAGE TEC, P804
[10]   The information content of mandatory risk factor disclosures in corporate filings [J].
Campbell, John L. ;
Chen, Hsinchun ;
Dhaliwal, Dan S. ;
Lu, Hsin-min ;
Steele, Logan B. .
REVIEW OF ACCOUNTING STUDIES, 2014, 19 (01) :396-455