Genetic process mining: an experimental evaluation

被引:249
作者
de Medeiros, A. K. A. [1 ]
Weijters, A. J. M. M. [1 ]
van der Aalst, W. M. P. [1 ]
机构
[1] Eindhoven Univ Technol, Dept Technol Management, NL-5600 MB Eindhoven, Netherlands
关键词
process mining; genetic mining; genetic algorithms; Petri nets; workflow nets;
D O I
10.1007/s10618-006-0061-7
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
One of the aims of process mining is to retrieve a process model from an event log. The discovered models can be used as objective starting points during the deployment of process-aware information systems (Dumas et al., eds., Process-Aware Information Systems: Bridging People and Software Through Process Technology. Wiley, New York, 2005) and/or as a feedback mechanism to check prescribed models against enacted ones. However, current techniques have problems when mining processes that contain non-trivial constructs and/or when dealing with the presence of noise in the logs. Most of the problems happen because many current techniques are based on local information in the event log. To overcome these problems, we try to use genetic algorithms to mine process models. The main motivation is to benefit from the global search performed by this kind of algorithms. The non-trivial constructs are tackled by choosing an internal representation that supports them. The problem of noise is naturally tackled by the genetic algorithm because, per definition, these algorithms are robust to noise. The main challenge in a genetic approach is the definition of a good fitness measure because it guides the global search performed by the genetic algorithm. This paper explains how the genetic algorithm works. Experiments with synthetic and real-life logs show that the fitness measure indeed leads to the mining of process models that are complete (can reproduce all the behavior in the log) and precise (do not allow for extra behavior that cannot be derived from the event log). The genetic algorithm is implemented as a plug-in in the ProM framework.
引用
收藏
页码:245 / 304
页数:60
相关论文
共 58 条
  • [1] Agrawal R, 1998, LECT NOTES COMPUT SC, V1377, P469
  • [2] INDUCTIVE INFERENCE - THEORY AND METHODS
    ANGLUIN, D
    SMITH, CH
    [J]. COMPUTING SURVEYS, 1983, 15 (03) : 237 - 269
  • [3] [Anonymous], 1998, LECT NOTES COMPUTER
  • [4] [Anonymous], 2000, EUR CONC ENG C SCS E
  • [5] Bourdeaud'huy T., 2002, P 2 IEEE INT C SYST, V1, P528
  • [6] Cook J. E., 1998, ACM Transactions on Software Engineering and Methodology, V7, P215, DOI 10.1145/287000.287001
  • [7] Discovering models of behavior for concurrent workflows
    Cook, JE
    Du, ZD
    Liu, CB
    Wolf, AL
    [J]. COMPUTERS IN INDUSTRY, 2004, 53 (03) : 297 - 319
  • [8] Software process validation: Quantitatively measuring the correspondence of a process to a model
    Cook, JE
    Wolf, AL
    [J]. ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 1999, 8 (02) : 147 - 176
  • [9] COOK JE, 1998, P 6 INT S FDN SOFTW, P35
  • [10] De Medeiros A.A., 2004, BETA WORKING PAPER S