ON THE EXPONENTIAL VALUE OF LABELED SAMPLES

Cited by: 113
Authors
CASTELLI, V [1]
COVER, TM [1]
Affiliations
[1] STANFORD UNIV, DEPT ELECT ENGN, INFORMAT SYST LAB, STANFORD, CA 94305
Funding
National Science Foundation (USA)
DOI
10.1016/0167-8655(94)00074-D
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Consider the problem of classifying a sample $X_0$ into one of two classes, using a training set $Q$. Let $Q$ be composed of $l$ labeled samples $\{(X_1,\theta_1),\ldots,(X_l,\theta_l)\}$ and $u$ unlabeled samples $\{X'_1,\ldots,X'_u\}$, where the labels $\theta_i$ are i.i.d. Bernoulli($\eta$) random variables over the set $\{1,2\}$, the observations $\{X_i\}_{i=1}^{l}$ are distributed according to $f_{\theta_i}(\cdot)$, and the unlabeled observations $\{X'_j\}_{j=1}^{u}$ are independently distributed according to the mixture density $f_{X'}(\cdot)=\eta f_1(\cdot)+(1-\eta)f_2(\cdot)$. We assume that $f_1(\cdot)$, $f_2(\cdot)$ and $\eta$ are all unknown. Let $f_1(\cdot)$ and $f_2(\cdot)$ belong to a known family $\mathcal{F}$, and assume that the mixtures of elements of $\mathcal{F}$ are identifiable. Even when the number of unlabeled samples is infinite and the decision regions can therefore be identified, one still needs labeled samples to label the decision regions with the correct classification. Letting $R(l,u)$ denote the optimal probability of error for $l$ labeled and $u$ unlabeled samples, and assuming that the pairwise mixtures of $\mathcal{F}$ are identifiable, we obtain the obvious statements $R(0,u)=R(0,\infty)=1/2$, $R(1,0)\le 2\eta(1-\eta)$, $R(\infty,u)=R^*$, and then prove $R(1,\infty)=2R^*(1-R^*)$, where $R^*$ is the Bayes probability of error, and $R(l,\infty)=R^*+\exp\{-\alpha l+o(l)\}$, where the exponent is given by $\alpha=-\log\bigl(2\sqrt{\eta(1-\eta)}\int\sqrt{f_1(x)f_2(x)}\,dx\bigr)$. Thus the first labeled sample reduces the risk from $1/2$ to $2R^*(1-R^*)$, and subsequent labeled samples in the training set reduce the probability of error exponentially fast to the Bayes risk.
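A note on the one-labeled-sample result: with infinitely many unlabeled samples the two decision regions can be identified, and the single labeled sample assigns them the correct labels with probability $1-R^*$ (incurring risk $R^*$) and the wrong labels with probability $R^*$ (incurring risk $1-R^*$), giving $R(1,\infty)=R^*(1-R^*)+(1-R^*)R^*=2R^*(1-R^*)$. As a hedged illustration of the exponent $\alpha$ (not part of the paper), the following minimal Python sketch evaluates $\alpha$ for two unit-variance Gaussian class densities; the means mu1, mu2 and the prior eta are illustrative assumptions, and the closed-form check uses the known Bhattacharyya coefficient $\exp\{-(\mu_1-\mu_2)^2/8\}$ for equal-variance Gaussians.

```python
# Minimal sketch (illustrative assumptions, not from the paper): evaluate the
# error exponent alpha = -log(2*sqrt(eta*(1-eta)) * integral sqrt(f1*f2) dx)
# for two unit-variance Gaussian class densities.
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

eta = 0.5            # prior P(theta = 1); assumed value for illustration
mu1, mu2 = 0.0, 2.0  # class means; assumed values for illustration

f1 = norm(mu1, 1.0).pdf
f2 = norm(mu2, 1.0).pdf

# Bhattacharyya-type integral of sqrt(f1 * f2) over the real line.
bhatt, _ = quad(lambda x: np.sqrt(f1(x) * f2(x)), -np.inf, np.inf)

alpha = -np.log(2.0 * np.sqrt(eta * (1.0 - eta)) * bhatt)

# Closed form for equal-variance Gaussians: integral = exp(-(mu1-mu2)^2 / 8).
alpha_closed = -np.log(2.0 * np.sqrt(eta * (1.0 - eta))) + (mu1 - mu2) ** 2 / 8.0

print(f"alpha (numerical):   {alpha:.6f}")      # ~0.5 for these parameters
print(f"alpha (closed form): {alpha_closed:.6f}")
```

With these illustrative parameters $2\sqrt{\eta(1-\eta)}=1$, so $\alpha=(\mu_1-\mu_2)^2/8=0.5$ and the excess risk $R(l,\infty)-R^*$ decays roughly like $e^{-l/2}$.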
Pages: 105-111
Page count: 7