Semi-supervised learning based on nearest neighbor rule and cut edges

被引:66
作者
Wang, Yu [1 ,2 ]
Xu, Xiaoyan [1 ]
Zhao, Haifeng [3 ]
Hua, Zhongsheng [1 ]
机构
[1] Univ Sci & Technol China, Sch Management, Hefei 230026, Anhui, Peoples R China
[2] Chongqing Univ, Sch Business Adm & Econ, Dept Informat Management & Informat Syst, Chongqing 400030, Peoples R China
[3] Tongji Univ, Sch Econ & Management, Shanghai 200092, Peoples R China
基金
中国国家自然科学基金;
关键词
Semi-supervised learning; Nearest neighbor rule; Cut edges; UNLABELED DATA; CLASSIFICATION;
D O I
10.1016/j.knosys.2010.03.012
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this paper, we propose a novel semi-supervised learning approach based on nearest neighbor rule and cut edges In the first step of our approach, a relative neighborhood graph based on all training samples is constructed for each unlabeled sample, and the unlabeled samples whose edges are all connected to training samples from the same class are labeled. These newly labeled samples are then added into the training samples In the second step, standard self-training algorithm using nearest neighbor rule is applied for classification until a predetermined stopping criterion is met. In the third step, a statistical test is applied for label modification, and in the last step, the remaining unlabeled samples are classified using standard nearest neighbor rule The main advantages of the proposed method are: (1) it reduces the error reinforcement by using relative neighborhood graph for classification in the initial stages of semi-supervised learning: (2) it introduces a label modification mechanism for better classification performance. Experimental results show the effectiveness of the proposed approach (C) 2010 Elsevier B.V. All rights reserved.
引用
收藏
页码:547 / 554
页数:8
相关论文
共 25 条
[1]  
[Anonymous], 2006, BOOK REV IEEE T NEUR
[2]  
[Anonymous], 1998, P 11 ANN C COMP LEAR
[3]  
[Anonymous], 2000, P 9 INT C INF KNOWL
[4]  
Bennett K. P., 1998, P NEUR INF PROC SYST
[5]  
Blum A., 2001, Learning from labeled and unlabeled data using graph mincuts
[6]   Identifying mislabeled training data [J].
Brodley, CE ;
Friedl, MA .
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1999, 11 :131-167
[7]   Learning from labeled and unlabeled data: An empirical study across techniques and domains [J].
Chawla, N ;
Karakoulas, G .
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2005, 23 :331-366
[8]   NEAREST NEIGHBOR PATTERN CLASSIFICATION [J].
COVER, TM ;
HART, PE .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1967, 13 (01) :21-+
[9]  
Cozman F. G., 2003, P 20 INT C MACH LEAR
[10]   Locally adaptive metric nearest-neighbor classification [J].
Domeniconi, C ;
Peng, J ;
Gunopulos, D .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2002, 24 (09) :1281-1285