Bi-LSTM Model to Increase Accuracy in Text Classification: Combining Word2vec CNN and Attention Mechanism

Citations: 320
Authors
Jang, Beakcheol [1 ]
Kim, Myeonghwi [1 ]
Harerimana, Gaspard [1 ]
Kang, Sang-ug [1 ]
Kim, Jong Wook [1 ]
Affiliations
[1] Sangmyung Univ, Dept Comp Sci, Seoul 03016, South Korea
Source
APPLIED SCIENCES-BASEL | 2020, Vol. 10, Issue 17
Funding
National Research Foundation of Singapore;
Keywords
text classification; CNN; Bi-LSTM; attention mechanism; BIDIRECTIONAL LSTM;
DOI
10.3390/app10175841
Chinese Library Classification
O6 [Chemistry];
Discipline Code
070301 [Inorganic Chemistry];
Abstract
There is a need to extract meaningful information from big data, classify it into different categories, and predict end-user behavior or emotions. Large amounts of data are generated from various sources such as social media and websites. Text classification is a representative research topic in the field of natural-language processing (NLP) that categorizes unstructured text data into meaningful categorical classes. The long short-term memory (LSTM) model and the convolutional neural network (CNN) for sentence classification produce accurate results and have recently been used in various NLP tasks. CNN models use convolutional layers and maximum pooling or max-over-time pooling layers to extract higher-level features, while LSTM models can capture long-term dependencies between word sequences and are therefore well suited for text classification. However, even with a hybrid approach that leverages the strengths of these two deep-learning models, the number of features to remember for classification remains huge, which hinders the training process. In this study, we propose an attention-based Bi-LSTM+CNN hybrid model that capitalizes on the advantages of LSTM and CNN with an additional attention mechanism. We trained the model on Internet Movie Database (IMDB) movie review data to evaluate its performance, and the test results showed that the proposed hybrid attention Bi-LSTM+CNN model produces more accurate classification results, as well as higher recall and F1 scores, than individual multi-layer perceptron (MLP), CNN, or LSTM models, as well as hybrid models without attention.
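The attention mechanism the abstract describes, which weights the Bi-LSTM hidden states before classification, can be sketched as follows. This is a minimal NumPy illustration of additive attention pooling, not the authors' implementation; the scoring form (tanh projection followed by a learned vector) and all shapes are assumptions for the sake of the example:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(hidden_states, W, v):
    """Additive attention over per-timestep hidden states.

    hidden_states: (T, d) array, one row per time step
                   (e.g. concatenated forward/backward Bi-LSTM outputs).
    W: (d, a) projection matrix; v: (a,) scoring vector (both learned).
    Returns the (d,) context vector and the (T,) attention weights.
    """
    scores = np.tanh(hidden_states @ W) @ v   # one scalar score per step
    weights = softmax(scores)                 # weights sum to 1
    context = weights @ hidden_states         # weighted sum of states
    return context, weights

# Toy example with random "hidden states" and parameters.
rng = np.random.default_rng(0)
T, d, a = 5, 8, 4
H = rng.standard_normal((T, d))
W = rng.standard_normal((d, a))
v = rng.standard_normal(a)
ctx, w = attention_pool(H, W, v)
print(ctx.shape, round(w.sum(), 6))
```

In the full model, the context vector produced this way would be passed on to the convolutional/classification layers instead of an unweighted pooling of all time steps, which is how attention reduces the burden of remembering every feature.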
Pages: 14