Latent semantic kernels

Cited by: 114
Authors
Cristianini, N [1]
Shawe-Taylor, J [1]
Lodhi, H [1]
Affiliations
[1] Univ London Royal Holloway & Bedford New Coll, Dept Comp Sci, Egham TW20 0EX, Surrey, England
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
kernel methods; latent semantic indexing; latent semantic kernels; Gram-Schmidt kernels; text categorization
DOI
10.1023/A:1013625426931
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Kernel methods like support vector machines have successfully been used for text categorization. A standard choice of kernel function has been the inner product between the vector-space representation of two documents, in analogy with classical information retrieval (IR) approaches. Latent semantic indexing (LSI) has been successfully used for IR purposes as a technique for capturing semantic relations between terms and inserting them into the similarity measure between two documents. One of its main drawbacks, in IR, is its computational cost. In this paper we describe how the LSI approach can be implemented in a kernel-defined feature space. We provide experimental results demonstrating that the approach can significantly improve performance, and that it does not impair it.
Pages: 127-152
Page count: 26