Process mining: a two-step approach to balance between underfitting and overfitting

被引:293
作者
van der Aalst, W. M. P. [1 ]
Rubin, V. [2 ]
Verbeek, H. M. W. [1 ]
van Dongen, B. F. [1 ]
Kindler, E. [3 ]
Gunther, C. W. [1 ]
机构
[1] Eindhoven Univ Technol, NL-5600 MB Eindhoven, Netherlands
[2] Software Design & Management sd&m AG, Offenbach, Germany
[3] Tech Univ Denmark, DK-2800 Lyngby, Denmark
关键词
PROCESS MODELS; PETRI NETS; DISCOVERY;
D O I
10.1007/s10270-008-0106-z
中图分类号
TP31 [计算机软件];
学科分类号
081205 [计算机软件];
摘要
Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such "overfitting" by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are "overfitting" (allow only for what has actually been observed) while other parts may be "underfitting" (allow for much more behavior without strong support for it). None of the existing techniques enables the user to control the balance between "overfitting" and "underfitting". To address this, we propose a two-step approach. First, using a configurable approach, a transition system is constructed. Then, using the "theory of regions", the model is synthesized. The approach has been implemented in the context of ProM and overcomes many of the limitations of traditional approaches.
引用
收藏
页码:87 / 111
页数:25
相关论文
共 42 条
[1]
Agrawal R, 1998, LECT NOTES COMPUT SC, V1377, P469
[2]
Alves de Medeiros A.K., 2006, THESIS EINDHOVEN U T
[3]
[Anonymous], BPM0630
[4]
The synthesis problem for elementary net systems is NP-complete [J].
Badouel, E ;
Bernardinello, L ;
Darondeau, P .
THEORETICAL COMPUTER SCIENCE, 1997, 186 (1-2) :107-134
[5]
Badouel E., 1998, Lectures on Petri Nets I: Basic Models. Advances in Petri Nets, P529
[6]
Bergenthum R, 2007, LECT NOTES COMPUT SC, V4714, P375
[7]
Cook J. E., 1998, ACM Transactions on Software Engineering and Methodology, V7, P215, DOI 10.1145/287000.287001
[8]
Cortadella J, 1997, IEICE T INF SYST, VE80D, P315
[9]
Deriving Petri Nets from finite transition systems [J].
Cortadella, J ;
Kishinevsky, M ;
Lavagno, L ;
Yakovlev, A .
IEEE TRANSACTIONS ON COMPUTERS, 1998, 47 (08) :859-882
[10]
CORTADELLA J, 1995, P 1995 IEEE ACM INT, P164