Improving BERT-Based Text Classification With Auxiliary Sentence and Domain Knowledge

Times cited: 73

Authors:

Yu, Shanshan [1,2]

Su, Jindian [3]

Luo, Da [3]

Affiliations:

[1] Guangdong Pharmaceut Univ, Coll Med Informat Engn, Guangzhou 510640, Peoples R China

[2] Guangdong Pharmaceut Univ, Guangdong Prov Precise Med & Big Data Engn Techno, Guangzhou 510640, Peoples R China

[3] South China Univ Technol, Coll Comp Sci & Engn, Guangzhou 510640, Peoples R China

Source:

IEEE ACCESS | 2019 / Vol. 7

Funding:

National Natural Science Foundation of China;

Keywords:

Natural language processing; text classification; bidirectional encoder representations from transformer; neural networks; language model;

DOI:

10.1109/ACCESS.2019.2953990

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

BERT, a general-purpose language model pre-trained on the cross-domain corpora BookCorpus and Wikipedia, achieves excellent performance on a range of natural language processing tasks when fine-tuned on downstream tasks. However, it still lacks the task-specific and domain-related knowledge needed to further improve its performance, and a more detailed analysis of fine-tuning strategies is necessary. To address these problems, a BERT-based text classification model, BERT4TC, is proposed. It constructs an auxiliary sentence to turn the classification task into a binary sentence-pair task, aiming to address the problems of limited training data and task-awareness. The architecture and implementation details of BERT4TC are presented, along with a post-training approach for addressing the domain challenge of BERT. Extensive experiments are conducted on seven widely studied public datasets to analyze fine-tuning strategies from the perspectives of learning rate, sequence length, and hidden state vector selection. BERT4TC models with different auxiliary sentences and post-training objectives are then compared and analyzed in depth. The experimental results show that BERT4TC with a suitable auxiliary sentence significantly outperforms both typical feature-based methods and fine-tuning methods, and achieves new state-of-the-art performance on multi-class classification datasets. On binary sentiment classification datasets, BERT4TC post-trained on a suitable domain-related corpus also achieves better results than the original BERT model.
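The abstract's central idea, turning a multi-class task into a binary sentence-pair task via an auxiliary sentence, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_sentence_pairs` and `predict_label`, the template "The category is {label}.", and the example labels are all assumptions made for illustration, and the abstract does not specify which auxiliary sentence forms the paper uses.

```python
from typing import List, Tuple


def build_sentence_pairs(
    text: str,
    gold_label: str,
    labels: List[str],
    template: str = "The category is {label}.",  # hypothetical auxiliary-sentence template
) -> List[Tuple[str, str, int]]:
    """Expand one multi-class example into one binary sentence pair per candidate label.

    Each pair is (original text, auxiliary sentence, target), where target is 1
    when the auxiliary sentence names the gold label and 0 otherwise.
    """
    pairs = []
    for label in labels:
        auxiliary = template.format(label=label)
        pairs.append((text, auxiliary, int(label == gold_label)))
    return pairs


def predict_label(match_scores: List[float], labels: List[str]) -> str:
    """Return the label whose sentence pair received the highest 'match' score.

    match_scores[i] is assumed to be a pair classifier's score for labels[i].
    """
    best_index = max(range(len(labels)), key=lambda i: match_scores[i])
    return labels[best_index]


if __name__ == "__main__":
    labels = ["sports", "business", "science"]
    pairs = build_sentence_pairs(
        "Stocks rallied after the earnings report.", "business", labels
    )
    for text, auxiliary, target in pairs:
        print(target, "|", text, "|", auxiliary)
```

Under this sketch, a training example with N candidate labels yields N sentence pairs, which is one way such a reformulation enlarges the effective training set. At inference time, each pair would be scored by a sentence-pair classifier (assumed here, not shown) and `predict_label` picks the best-scoring label.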

Citation

Pages: 176600-176612

Page count: 13

