Clustering e-commerce search engines based on their search interface pages using WISE-Cluster

Cited by: 8
Authors
Lu, Yiyao
He, Hai
Peng, Qian
Meng, Weiyi [1]
Yu, Clement
Affiliations
[1] SUNY Binghamton, Dept Comp Sci, Binghamton, NY 13902 USA
[2] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
Funding
National Science Foundation (USA);
Keywords
e-commerce; search engine; clustering; categorization; Web-based information systems;
DOI
10.1016/j.datak.2006.01.010
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a new approach to clustering e-commerce search engines (ESEs) on the Web. Our approach utilizes the features available on the interface page of each ESE, including the label terms and value terms appearing in the search form, the number of images, normalized price terms, and other terms. Experimental results on more than 400 ESEs indicate that the proposed approach achieves good clustering accuracy. We also analyze the importance of the different feature types and find that the terms in the search form are the most important feature for obtaining quality clusters. © 2006 Elsevier B.V. All rights reserved.
Pages: 231-246
Number of pages: 16