Building text classifiers using positive and unlabeled examples

Cited by: 414
Authors
Liu, B [1]
Dai, Y [1]
Li, XL [1]
Lee, WS [1]
Yu, PS [1]
Institution
[1] Univ Illinois, Dept Comp Sci, Chicago, IL 60612 USA
Source
THIRD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2003
Keywords
DOI
10.1109/icdm.2003.1250918
CLC classification
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies the problem of building text classifiers using positive and unlabeled examples. The key feature of this problem is that no negative examples are available for learning. Recently, a few techniques for solving this problem have been proposed in the literature. These techniques share the same underlying idea: build a classifier in two steps, with each existing technique using a different method for each step. In this paper, we first introduce some new methods for the two steps and perform a comprehensive evaluation of all possible combinations of methods for the two steps. We then propose a more principled approach to the problem based on a biased formulation of SVM, and show experimentally that it is more accurate than the existing techniques.
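The biased-SVM idea described in the abstract can be sketched as follows. This is an illustrative assumption, not the paper's exact formulation: all unlabeled documents are treated as negatives, but errors on labeled positives are penalized much more heavily than errors on unlabeled examples. Here scikit-learn's `LinearSVC` with `class_weight` stands in for the paper's asymmetric penalty parameters, and the synthetic data, weights, and cluster centers are all made up for the demonstration.

```python
# Hedged sketch of a biased SVM for learning from positive and
# unlabeled (PU) data: fit an SVM that labels all unlabeled examples
# as negative (0) but weights mistakes on labeled positives (1) much
# more heavily, so hidden positives in the unlabeled set can still
# end up on the positive side of the boundary.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Synthetic 2-D "documents": positives cluster near (+2, +2),
# negatives near (-2, -2).
pos = rng.normal(loc=2.0, scale=1.0, size=(50, 2))
neg = rng.normal(loc=-2.0, scale=1.0, size=(200, 2))
X = np.vstack([pos, neg])

# PU labels: only the first 25 positives are labeled (y = 1);
# the remaining 25 positives are hidden among the unlabeled (y = 0).
y = np.zeros(len(X), dtype=int)
y[:25] = 1

# Biased penalties: an error on a labeled positive costs 10x an error
# on an unlabeled example (illustrative choice of weights).
clf = LinearSVC(class_weight={1: 10.0, 0: 1.0}, C=1.0)
clf.fit(X, y)

# Check how many of the hidden positives (rows 25..49) are recovered.
hidden_pred = clf.predict(X[25:50])
print(hidden_pred.mean())
```

Because the positives are well separated from the negatives and the positive class carries the larger penalty, the learned boundary falls between the two clusters, and most of the hidden positives are classified as positive despite being labeled 0 during training.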
Pages: 179 / 186
Page count: 8
Related papers
35 items in total
[1] Agrawal R, 2000, EDBT 00
[2] [Anonymous], 1997, 1602 AI MIT
[3] [Anonymous], 1994, SIGIR 94
[4] Basu S, 2002, ICML 02
[5] Bennett K, 1998, ADV NEURAL INFORMATI, P11
[6] Blum A, 1998, COLT 98
[7] Bockhorst J, 2002, ICML 02
[8] Buckley C, 1994, SIGIR 94
[9] Dempster A, 1977, J ROYAL STAT SOC B, V39, P1
[10] Denis F, 2002, IPMU