To prevent spam users, you can only post on this forum after registration, which is by invitation. If you want to post on the forum, please send me a mail (h DOT m DOT w DOT verbeek AT tue DOT nl) and I'll send you an invitation in return for an account.
Q: How does Inductive Miner directly follows detect embedded optional sequences?
Again, "Inductive Miner  directly follows" in ProM 6.10 has me stumped. The implementation obviously has a way to detect optional sequences within sequences, which is nice. However, while reading Sander Leemans' Ph. D. thesis, I cannot find the specification for this behaviour. The specification in the dissertation seems to lead to a flat structure in those examples. This is surprising, as a relevant example showing the nested structure is contained in Sander's thesis itself. So the example and the behaviour of ProM 6.10 match, but I can't for the life of me find the section of the thesis where this is specified.
Can anyone point me to that section, please?
Here's the example:
Consider this log:
(a)
(a, b, c),
(a, c),
(a, e),
(b, c, d, e),
(c, d),
(c, e),
(e)
This log will lead to the DFG that is shown in Figure 5.20 of the thesis as the graph of process tree M61.
The graph has edges (a, b), (a, c), (a, e), (b, c), (c, d), (c, e), (d, e) where a, b, c, e are start nodes and a, c, d, e are end nodes.
As I understand the definition of sequence cut and the function sequenceDfgSplit, we should get a flat sequence in which every activity is optional (that is, has a tau activity as an alternative). This flat structure does not capture the facts that if b does not end the process, then c must occur afterwards, and if d occurs, then c must occur before it. ProM discovers a process tree that captures these facts by discovering toplevel clusters (a), (b,c,d), (e), each of which according to sequenceDfgSplit is optional, and then recursing on (b,c,d) to find a sequence cut in which c is not optional.
I do not understand what part of the specification allows the miner to derive that result, in particular, how it manages to group (b,c,d) into one cluster on the toplevel.
 Sebastian
Comments

Update: I have looked at the source code of CutFinderIMSequence, and there's the following passage:
* Optimisation 482015: do not greedily use the maximal cut, but
* choose the one that minimises the introduction of taus.So obviously the implementation has been enhanced and differs from the thesis. I'd still like to know if this is a spurofthemoment implementation decision, or if this is theoretically described in a paper somewhere.
In particular, there might be several alternative nonmaximal cuts to consider. Are there any results on how much this slows up the algorithm?
Howdy, Stranger!
Categories
 1.5K All Categories
 45 Announcements / News
 215 Process Mining
 6  BPI Challenge 2020
 9  BPI Challenge 2019
 24  BPI Challenge 2018
 27  BPI Challenge 2017
 8  BPI Challenge 2016
 67 Research
 974 ProM 6
 379  Usage
 286  Development
 8 RapidProM
 1  Usage
 6  Development
 54 ProM5
 19  Usage
 185 Event Logs
 30  ProMimport
 75  XESame