Using information retrieval techniques for supporting data mining

Cited by: 43
Authors
Kouris, IN [1]
Makris, CH
Tsakalidis, AK
Affiliations
[1] Univ Patras, Sch Engn, Dept Comp Engn & Informat, Patras 26500, Greece
[2] Comp Technol Inst, GR-26110 Patras, Greece
[3] Technol Educ Inst Mesolonghi, Dept Applied Informat Management & Finance, Hellas, Greece
Keywords
knowledge discovery; E-commerce; itemsets recommendations; indexing; boolean-ranked queries;
DOI
10.1016/j.datak.2004.07.004
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The classic two-step approach of the Apriori algorithm and its descendants, which consists of finding all large itemsets and then using these itemsets to generate all association rules, has worked well for certain categories of data. Nevertheless, for many other data types this approach shows highly degraded performance and proves rather inefficient. We argue that we need not search the entire space of candidate itemsets, but can rather let the database unveil its secrets as the customers use it. We propose a system that does not merely scan all possible combinations of the itemsets, but rather acts like a search engine specifically implemented for making recommendations to the customers using techniques borrowed from Information Retrieval. © 2004 Elsevier B.V. All rights reserved.
Pages: 353-383
Page count: 31
Related papers
43 in total
  • [1] AAS, 1997, 922 NORW COMP CTR
  • [2] Achlioptas D., 2001, PROC 33 ACM S THEORY, P611
  • [3] Online generation of association rules
    Aggarwal, CG
    Yu, PS
    [J]. 14TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 1998: 402-411
  • [4] Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
  • [5] Agrawal R, 1994, P 20 INT C VER LARG, V1215, P487
  • [6] [Anonymous], MINING WEB DISCOVERI
  • [7] [Anonymous], 1949, Human behaviour and the principle of least-effort
  • [8] [Anonymous], P INT C VER LARG DAT
  • [9] Azar Y., 2001, P 33 ANN ACM S THEOR, P619, DOI 10.1145/380752.380859
  • [10] Bayardo R.J., 1999, P 5 ACM SIGKDD INT C, P145, DOI 10.1145/312129.312219