Mining and reasoning on workflows

Cited by: 36
Authors
Greco, G
Guzzo, A
Manco, G
Saccà, D
Affiliations
[1] Univ Calabria, Dept Math, I-87036 Arcavacata Di Rende, CS, Italy
[2] Univ Calabria, DEIS Dept, I-87036 Arcavacata Di Rende, CS, Italy
[3] CNR, ICAR, Inst High Performance Comp & Networks, I-87036 Arcavacata Di Rende, CS, Italy
Keywords
data mining; workflow management;
DOI
10.1109/TKDE.2005.63
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Today's workflow management systems represent a key technological infrastructure for advanced applications that is attracting a growing body of research, mainly focused on developing tools for workflow management that allow users both to specify the "static" aspects, like preconditions, precedences among activities, and rules for exception handling, and to control execution by scheduling the activities on the available resources. This paper deals with an aspect of workflows that has so far not received much attention even though it is crucial for the forthcoming scenarios of large-scale applications on the Web: providing facilities for the human system administrator to identify the choices performed more frequently in the past that had led to a desired final configuration. In this context, we formalize the problem of discovering the most frequent patterns of executions, i.e., the workflow substructures that have been scheduled more frequently by the system. We attack the problem by developing two data mining algorithms on the basis of an intuitive and original graph formalization of a workflow schema and its occurrences. The model is used both to prove some intractability results that strongly motivate the use of data mining techniques and to derive interesting structural properties for reducing the search space for frequent patterns. Indeed, the experiments we have carried out show that our algorithms outperform standard data mining algorithms adapted to discover frequent patterns of workflow executions.
Pages: 519-534
Page count: 16