Introduction to Semi-Supervised Learning

Cited by: 450
Authors
Zhu, Xiaojin [1]; Goldberg, Andrew B. [1]
Affiliations
[1] University of Wisconsin, Madison
Source
Synthesis Lectures on Artificial Intelligence and Machine Learning | 2009 / Vol. 6
Keywords
Cluster-then-label; Co-training; Expectation maximization (EM); Gaussian mixture model; Harmonic function; Label propagation; Manifold regularization; Mincut; Multiview learning; Self-training; Semi-supervised learning; Transductive learning
DOI
10.2200/S00196ED1V01Y200906AIM006
Abstract
Semi-supervised learning is a learning paradigm concerned with the study of how computers and natural systems such as humans learn in the presence of both labeled and unlabeled data. Traditionally, learning has been studied either in the unsupervised paradigm (e.g., clustering, outlier detection) where all the data are unlabeled, or in the supervised paradigm (e.g., classification, regression) where all the data are labeled. The goal of semi-supervised learning is to understand how combining labeled and unlabeled data may change the learning behavior, and design algorithms that take advantage of such a combination. Semi-supervised learning is of great interest in machine learning and data mining because it can use readily available unlabeled data to improve supervised learning tasks when the labeled data are scarce or expensive. Semi-supervised learning also shows potential as a quantitative tool to understand human category learning, where most of the input is self-evidently unlabeled. In this introductory book, we present some popular semi-supervised learning models, including self-training, mixture models, co-training and multiview learning, graph-based methods, and semi-supervised support vector machines. For each model, we discuss its basic mathematical formulation. The success of semi-supervised learning depends critically on some underlying assumptions. We emphasize the assumptions made by each model and give counterexamples when appropriate to demonstrate the limitations of the different models. In addition, we discuss semi-supervised learning for cognitive psychology. Finally, we give a computational learning theoretic perspective on semi-supervised learning, and we conclude the book with a brief discussion of open questions in the field. Copyright © 2009 by Morgan & Claypool.
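The first model named in the abstract, self-training, reduces to a short loop: fit a classifier on the labeled data, pseudo-label the unlabeled points it is most confident about, and refit. The sketch below is a minimal illustration of that loop, not code from the book; the two-moons toy data, the logistic-regression base learner, and the 0.95 confidence threshold are all illustrative assumptions.

```python
# Minimal self-training sketch (illustrative only; not from the book).
# Assumed choices: scikit-learn's two-moons toy data, a logistic-regression
# base learner, and a 0.95 confidence threshold for pseudo-labeling.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression

X, y = make_moons(n_samples=300, noise=0.1, random_state=0)

# Pretend only 5 points per class are labeled; the rest are unlabeled.
labeled = np.concatenate([np.flatnonzero(y == c)[:5] for c in (0, 1)])
mask = np.zeros(len(X), dtype=bool)
mask[labeled] = True
X_l, y_l = X[mask], y[mask]
X_u = X[~mask]

clf = LogisticRegression()
while len(X_u) > 0:
    clf.fit(X_l, y_l)
    proba = clf.predict_proba(X_u)
    confident = proba.max(axis=1) >= 0.95   # pseudo-label only confident points
    if not confident.any():
        break                               # nothing confident left; stop
    X_l = np.vstack([X_l, X_u[confident]])
    y_l = np.concatenate([y_l, proba[confident].argmax(axis=1)])
    X_u = X_u[~confident]

clf.fit(X_l, y_l)
print(f"labeled set grew from 10 to {len(X_l)} points")
```

As the abstract stresses, each model rests on its own assumptions: self-training assumes the classifier's high-confidence predictions are correct, while the other surveyed models (co-training, graph-based label propagation, semi-supervised SVMs) replace that step with different assumptions about how unlabeled data constrain the decision boundary.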
Pages: 1-116