Parallel task

nicogsch
edited June 2012 in Event Logs
Hi everyone!

I'm working on a project whose event logs contain parallel tasks (AND-split/AND-join), something like this:

CASE ID  TIMESTAMP   ACTIVITY  STATE
687431   01-07-2000  A         Rejected
687431   01-07-2000  B         Complete
687431   01-07-2000  C         Complete
687431   01-07-2000  D         Complete
687431   01-12-2000  A         Complete
687431   01-12-2000  E         Complete
687431   01-12-2000  F         Complete
...      ...         ...       ...

The process always starts with 4 parallel tasks, A || B || C || D (AND-split), and B->E, C->E (AND-join). The process has 44 different tasks.

So, I would like to know how XESame (or ProM) can recognize parallel tasks. Do I need to do something special in the mapping? At the moment I have made a normal mapping like the examples in the papers, but the ProM techniques only show a sequence pattern for the first tasks, like A->B->C->D.

Any ideas? I hope you can help me.

Best Regards!


Best Answer

  • JBuijs
    Accepted Answer
    Process discovery algorithms indeed use the order of events as they appear in the event log, where they should be ordered chronologically.

    I do think it would not be a good idea for process discovery algorithms to assume parallelism when they find several events that appear on the same day. There might be another source of information (key in the database for instance) that can be used to order events when the time does not provide details such as hour and minute of execution.

    Therefore I think that Thomas' solution is the best one for now by allowing the user to 'shuffle' the event log if he knows that events that occur on the same day actually occur in parallel.
    Joos Buijs

    Senior Data Scientist and process mining expert at APG (Dutch pension fund executor).
    Previously Assistant Professor in Process Mining at Eindhoven University of Technology
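
The idea of ordering same-day events by another source of information could be sketched as follows. This is only an illustration: the `db_key` column is hypothetical (it stands in for whatever extra ordering field the database might offer), and the day-month-year date format is an assumption about the log above.

```python
from datetime import datetime

# Hypothetical rows: (case_id, timestamp, activity, db_key).
# db_key is an assumed extra column, not something present in the original log.
events = [
    ("687431", "01-07-2000", "C", 12),
    ("687431", "01-07-2000", "A", 15),
    ("687431", "01-12-2000", "E", 3),
    ("687431", "01-07-2000", "B", 10),
]


def order_events(rows, date_format="%d-%m-%Y"):
    """Sort by case, then by date, then by the secondary key to break same-day ties."""
    return sorted(
        rows,
        key=lambda r: (r[0], datetime.strptime(r[1], date_format), r[3]),
    )


ordered = order_events(events)
print([r[2] for r in ordered])  # activities in the resulting order
```

If no such key exists, the order of same-day events carries no information and the sort above cannot recover it.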

Answers

  • Hi Nicogsch,

    XESame does not recognize anything, it just takes data and puts it in an event log format. So that should be just fine.

    All process mining/process discovery techniques in ProM should be able to detect parallelism between tasks, if it is frequent enough (if you have 100000 instances of ABCD and only 1 of each of the other combinations it might not be discovered).
    Something you could do rather quickly is to add an artificial start and end task to each trace (there are plugins that do this). Some algorithms require a unique start and end event for process discovery.

    Which algorithms did you try?
    Joos Buijs

  • nicogsch
    edited June 2012
    Hi Joos, 

    Thanks for your quick response.

    I added the artificial start and end tasks to each trace as you said, but I still get the same result. I tried the fuzzy model, heuristic miner, alpha... In all of them the model shown is Start->A->B->C->D, a sequence where 'A' is the first step, then 'B' is the second step, and so on, but all of these steps should be in parallel.

    What can I do? Which algorithm works best with parallel split/join, and what configuration would be better than the default?

    I'll be very grateful for your help.

  • All algorithms will be able to detect parallelism, if it is present clearly enough.
    The ILP miner might also be a good idea to try since it will ensure a perfect fitness and therefore will try to capture different orders of activities.

    Are you really sure that different traces have different orders of the mentioned activities??? It does not sound like it...
    Joos Buijs
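
A quick way to check whether different traces really contain different orders is to look at the directly-follows pairs. The sketch below is not ProM code; it only illustrates the relation the alpha algorithm uses for concurrency (a and b count as parallel when a is directly followed by b in some trace and b by a in another). The function names and example traces are made up for illustration.

```python
def directly_follows(traces):
    """Collect every pair (a, b) where b immediately follows a in some trace."""
    pairs = set()
    for trace in traces:
        pairs.update(zip(trace, trace[1:]))
    return pairs


def parallel_pairs(traces):
    """Return pairs of distinct activities that were seen in both orders."""
    df = directly_follows(traces)
    return {(a, b) for (a, b) in df if a < b and (b, a) in df}


# Every case logged in the same order: no concurrency can be found.
same_order = [["A", "B", "C", "D"]] * 3
print(parallel_pairs(same_order))

# Orders vary between cases: some pairs now appear in both directions.
mixed_order = [["A", "B", "C", "D"], ["B", "A", "C", "D"], ["A", "C", "B", "D"]]
print(parallel_pairs(mixed_order))
```

An empty result for the first log matches the purely sequential model A->B->C->D that the miners return.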

  • nicogsch
    edited June 2012
    I looked at the log and yes, you are right: the different traces have the same order of activities.
    The event log is about the process followed by (graduated) students, so activities were recorded per semester (12 semesters).
    Example (1st semester):

    CaseID  Timestamp   CourseID  Grade  State
    1       01-07-2000  A         75     Approved
    1       01-07-2000  B         92     Approved
    1       01-07-2000  C         40     Rejected
    1       01-07-2000  D         62     Approved
    2       01-07-2000  A         80     Approved
    2       01-07-2000  B         67     Approved
    2       01-07-2000  C         70     Approved
    2       01-07-2000  D         78     Approved

    So the log stores the activities in the same order, but in the model these activities are parallel.

    I read some papers about educational process mining and curriculum process mining, and also "Process Mining from Educational Data" (Handbook of Educational Data Mining). I looked at the examples, but I couldn't find any example event logs for them.


  • Hey guys,

    I have the same problem, but haven't had the time to really investigate it. As far as I know, process mining algorithms only detect concurrency if tasks occur in different orderings over different cases, as Joos says. They do not recognize it if the tasks always occur in the same ordering, even if they all share the same timestamp. A quick idea would be to write a small plug-in that shuffles events which have the same timestamp, over all cases of a log. Then mining algorithms will discover the concurrency. But it seems this is just a "quick and dirty" workaround. What do you think?

    Best regards
    Thomas
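
Outside ProM, the shuffling idea could be prototyped in a few lines before writing a plug-in. This is a minimal sketch under assumptions: events are plain (case_id, timestamp, activity) tuples already sorted by case and timestamp, and `shuffle_ties` is a hypothetical helper name, not an existing plug-in.

```python
import random
from itertools import groupby


def shuffle_ties(events, seed=None):
    """Randomly reorder events that share the same case and timestamp.

    `events` is a list of (case_id, timestamp, activity) tuples, sorted by
    case and then by timestamp. The order between different timestamps and
    between different cases is left untouched.
    """
    rng = random.Random(seed)
    result = []
    # groupby yields runs of consecutive events with the same (case, timestamp).
    for _, group in groupby(events, key=lambda e: (e[0], e[1])):
        block = list(group)
        rng.shuffle(block)
        result.extend(block)
    return result


log = (
    [(1, "01-07-2000", a) for a in "ABCD"]
    + [(1, "01-12-2000", "E")]
    + [(2, "01-07-2000", a) for a in "ABCD"]
)
for event in shuffle_ties(log, seed=42):
    print(event)
```

This only makes sense if same-timestamp events are known to be parallel; otherwise the shuffle invents orderings that never happened.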
  • OK Joos, thanks a lot for the response and explanation. I'll do what Thomas said.