Q: Inductive Miner directly follows  weaker requirements loop footprints?
If an activity from a redo part has a connection to/from the body part, then it has connections to/from all start/end activities
I wonder if this requirement may not be relaxed to the following:
If an activity from a redo part has a connection to/from the body part, then
either it has connections to/from all start/end activities
or all those connections are to start or from end activities.
The difference is that we would still be able to find a loop cut if a redo part is not connected to all start/end activities, but in case it isn't, is not connected to anything but start/end activities. This would cover some cases of incompleteness.
Consider Sander's example in figure 6.8 (a) on p. 214 of his thesis. (This example is taken out of context, as it is intended to illustrate IMc, but a similar point can be made here about IMd.) Sander presents log99:
(a, b, c, d, e, f, a, c, b, d)
(a, c, d, b, e, f, a, b, c, d)
(c, a, b, d)
(c, a, d, b)
(c, d, a, b)
Inductive Miner directly follows returns a flower model after first applying the fallthrough strictDfgTauLoop. With my revision, we would be able to find a loop cut and derive the following process tree:
LOOP(PARA(SEQ(a,b), SEQ(c,d)), SEQ(e,f))
which would capture the facts that a always occurs before b, c before d and the redo part must contain e and f.
Here is another example. Consider the log
(a, b, c, a, b, c, f)
(a, b, d, e, a, b, d, f)
Intuitively, we might expect to get a tree from this log that has a loop which either ends in c and loops back with a tau, or ends in d and loops back with an e. However, original IMd would reason as follows: When c, d are identified as end nodes (by the split after the first sequence cut) we cannot put e into the redo part, because e is not connected from end node c. So no loop cut exists and strictDfgTauLoop applies, which removes the edge from c to a. We can then discover another sequence and are left with an almostloop, in which a similar problem applies: e is only connected from d, but not connected from the new end node b. In this case, there isn't even an edge from b or d to a, so strictDfgTauLoop does not apply, and we get a flower model over {a,b,d,e}.
Things change when we add the trace (a, b, c, e, a, b, c, f) to the log. This trace contains an edge from c to e. IMd then derives the following structure, which is closer to our expectations:
SEQ(LOOP(SEQ(LOOP(SEQ(a,b,XOR(c,tau)),tau),XOR(d,tau)),e),f)
With my revision, IMd would derive this identical structure from the original (incomplete) log.
Can anyone see anything wrong with my proposed improvement to IMd? (Sander, please?)
Comments

Here's another example, this time not involving incompleteness, but a loop where the redo part is connected to the body:(a, b, c, d, e, a, b, c, d)(a, b, c, d, e, b, c, d)(c, d)With the original requirement, no cut or fallthrough would apply in IMd and we would find a flower model over {a,b,c,d,e}. (In fact, ProM embeds that flower model in an XOR with a taubranch, thus including the empty trace, the reason for which escapes me.)With the relaxation, we can find a loop cut. We can place {e,b} into the redo part, because from that set node e is connected from all end nodes (d ), while the successors of b are start nodes (c ). So we can derive the following tree:LOOP(XOR(a,SEQ(c,d)),SEQ(XOR(e,tau),XOR(b,tau)))
Howdy, Stranger!
Categories
 1.6K All Categories
 45 Announcements / News
 221 Process Mining
 6  BPI Challenge 2020
 9  BPI Challenge 2019
 24  BPI Challenge 2018
 27  BPI Challenge 2017
 8  BPI Challenge 2016
 67 Research
 988 ProM 6
 384  Usage
 287  Development
 8 RapidProM
 1  Usage
 6  Development
 54 ProM5
 19  Usage
 185 Event Logs
 30  ProMimport
 75  XESame