A Survey of Research on Naive Bayes-Based Text Classification

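Cited by: 252
Authors
贺鸣 (He Ming)
孙建军 (Sun Jianjun)
成颖 (Cheng Ying)
Affiliation
[1] School of Information Management, Nanjing University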
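Keywords
automatic classification; naive Bayes; feature selection; feature filtering
DOI
Not available
Chinese Library Classification (CLC) number
G254.1 [Classification];
Discipline classification code
120501 [Library Science];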
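Abstract
Automatic text classification is an important branch of natural language processing, and a large number of models and algorithms have been developed for it; among them, research based on naive Bayes (NB) remains a persistent focus of the field. This paper presents a systematic survey of research on naive Bayes-based automatic text classification. It discusses the classical naive Bayes classification methods, including the multinomial model and the multivariate Bernoulli model. It analyzes in detail the classical feature selection methods as well as a number of improved feature selection methods, including ALOFT. It also reviews improved NB algorithms from perspectives such as weighting and avoiding smoothing. Finally, it proposes the main directions for further improving NB.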
Pages: 147-154
Number of pages: 8