A comparison of word- and sense-based text categorization using several classification algorithms

Cited by: 54
Authors
Kehagias, A [1]
Petridis, V
Kaburlasos, VG
Fragkou, P
Affiliations
[1] Aristotle Univ Thessaloniki, Dept Math Phys & Comp Sci, Div Math, GR-54124 Thessaloniki, Greece
[2] Aristotle Univ Thessaloniki, Dept Elect & Comp Engn, Div Elect & Comp Engn, GR-54124 Thessaloniki, Greece
[3] Inst Educ Technol Kavala, Dept Ind Informat, Div Software Syst, GR-65404 Kavala, Greece
Keywords
text categorization; word senses; information retrieval; FLNMAP with voting
DOI
10.1023/A:1025554732352
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most of the text categorization algorithms in the literature represent documents as collections of words. An alternative which has not been sufficiently explored is the use of word meanings, also known as senses. In this paper, using several algorithms, we compare the categorization accuracy of classifiers based on words to that of classifiers based on senses. The document collection on which this comparison takes place is a subset of the annotated Brown Corpus semantic concordance. A series of experiments indicates that the use of senses does not result in any significant categorization improvement.
Pages: 227-247
Page count: 21