Improving BERT-Based Text Classification With Auxiliary Sentence and Domain Knowledge

Times cited: 73
Authors
Yu, Shanshan [1, 2]
Su, Jindian [3 ]
Luo, Da [3 ]
Affiliations
[1] Guangdong Pharmaceut Univ, Coll Med Informat Engn, Guangzhou 510640, Peoples R China
[2] Guangdong Pharmaceut Univ, Guangdong Prov Precise Med & Big Data Engn Techno, Guangzhou 510640, Peoples R China
[3] South China Univ Technol, Coll Comp Sci & Engn, Guangzhou 510640, Peoples R China
Source
IEEE ACCESS | 2019 / Vol. 7
Funding
National Natural Science Foundation of China
Keywords
Natural language processing; text classification; bidirectional encoder representations from transformer; neural networks; language model;
DOI
10.1109/ACCESS.2019.2953990
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The general language model BERT, pre-trained on the cross-domain text corpora BookCorpus and Wikipedia, achieves excellent performance on several natural language processing tasks through fine-tuning on the downstream tasks. However, it still lacks the task-specific and domain-related knowledge that could further improve its performance, and more detailed analyses of fine-tuning strategies are needed. To address these problems, a BERT-based text classification model, BERT4TC, is proposed. It constructs an auxiliary sentence to turn the classification task into a binary sentence-pair task, aiming to address the limited-training-data problem and the task-awareness problem. The architecture and implementation details of BERT4TC are presented, together with a post-training approach for addressing the domain challenge of BERT. Extensive experiments are conducted on seven widely studied public datasets to analyze fine-tuning strategies from the perspectives of learning rate, sequence length, and hidden state vector selection. BERT4TC models with different auxiliary sentences and post-training objectives are then compared and analyzed in depth. The experimental results show that BERT4TC with a suitable auxiliary sentence significantly outperforms both typical feature-based methods and fine-tuning methods, and achieves new state-of-the-art performance on multi-class classification datasets. On binary sentiment classification datasets, BERT4TC post-trained with a suitable domain-related corpus also achieves better results than the original BERT model.
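The auxiliary-sentence idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the wording of the auxiliary sentence, so the template string, the function names `to_sentence_pairs` and `predict_label`, and the example labels are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical template; the abstract does not specify the auxiliary sentence wording.
DEFAULT_TEMPLATE = "The category is {label}."


def to_sentence_pairs(
    text: str,
    gold_label: str,
    labels: Sequence[str],
    template: str = DEFAULT_TEMPLATE,
) -> List[Tuple[str, str, int]]:
    """Expand one multi-class example into one binary sentence pair per label.

    Each pair is (text, auxiliary_sentence, target), where target is 1 only
    when the auxiliary sentence names the gold label.
    """
    pairs = []
    for label in labels:
        auxiliary = template.format(label=label)
        pairs.append((text, auxiliary, int(label == gold_label)))
    return pairs


def predict_label(match_scores: Sequence[float], labels: Sequence[str]) -> str:
    """Pick the label whose sentence pair received the highest match score.

    The scores are assumed to come from a binary sentence-pair classifier
    run once per candidate label, in the same order as `labels`.
    """
    best_index = max(range(len(labels)), key=lambda i: match_scores[i])
    return labels[best_index]
```

With labels `["sports", "business", "world"]`, a single business news item becomes three sentence pairs, of which exactly one has target 1. At inference time the same expansion is applied and the highest-scoring pair determines the predicted class.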
Pages: 176600-176612
Number of pages: 13