Automatic User Goals Identification Based on Anchor Text and Click-Through Data

Cited by: 6
Authors
YUAN Xiaojie
Affiliation
Keywords
query classification; user goals; anchor text; click-through data; information retrieval
DOI
Not available
Chinese Library Classification (CLC) number
TP391.41
Discipline classification code
080203
Abstract
Understanding the underlying goal behind a user's Web query has proven helpful for improving search quality. This paper focuses on the automatic identification of query types according to user goals. Four novel entropy-based features extracted from anchor data and click-through data are proposed, and a support vector machine (SVM) classifier is used to identify the user's goal based on these features. Experimental results show that the proposed entropy-based features are more effective than those reported in previous work. By combining multiple features, the goals of more than 97% of the queries studied can be correctly identified. In addition, the paper reaches three conclusions: first, anchor-based features are more effective than click-through-based features; second, the number of sites is more reliable than the number of links; third, click-distribution-based features are more effective than session-based ones.
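
The abstract does not define the four entropy-based features, so the following is only a minimal sketch of the general idea, assuming a feature is the Shannon entropy of how a query's clicks are spread across sites. The function name `distribution_entropy` and the click logs are hypothetical and are not taken from the paper. Under this assumption, a query whose clicks concentrate on one site (a likely navigational goal) gets a low entropy, while a query whose clicks are spread across many sites (a likely informational goal) gets a higher one.

```python
import math
from collections import Counter


def distribution_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical click logs: the site that each click for a query landed on.
navigational_clicks = ["example.com"] * 9 + ["other.org"]
informational_clicks = [
    "a.com", "b.org", "c.net", "d.io",
    "a.com", "e.edu", "f.com", "b.org",
]

nav_entropy = distribution_entropy(navigational_clicks)
info_entropy = distribution_entropy(informational_clicks)

# Concentrated clicks yield the lower entropy of the two queries.
print(nav_entropy < info_entropy)
```

In a pipeline like the one the abstract describes, values such as these would be computed per query from both anchor data and click-through data and passed to the SVM classifier as input features.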
Pages: 495-500
Number of pages: 6