Ideas for data transformation & analysis
Hello all,
I am new to process mining and need some help or directions on how to get value from the data I have collected.
Below is an example of how my data are currently structured, but I can transform them if needed.
I have "status from" and "status to" columns, which can easily be converted to a "transition" or to one entry per status arrival.
I have seen in exercises that there is usually one entry per case, listing all statuses (or transitions?) horizontally. Do I need to convert my data that way? Could I do it with ProM?
Please suggest the best way to visualize these data. Thanks.
Hi Michalis,
I think that you have 4 main columns:
- case (as case identifier)
- Status to (as activity)
- Department (as resource, or group)
- Date (as timestamp)
I think that status from is not directly relevant, at least not for a first analysis, mainly because it is implied by the previous event (except for the first event of course).
Similarly, duration is also derived.
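To illustrate why both columns are redundant, here is a minimal Python sketch that rebuilds "status from" and a duration from the remaining columns. The sample rows, the timestamp format, and the function name `derive_from_and_duration` are assumptions made up for this example, and it assumes that duration means the time between two consecutive events of the same case, which may not match how your duration column is defined.

```python
from datetime import datetime
from itertools import groupby

# Hypothetical rows: (case, status_to, department, timestamp).
events = [
    ("C1", "Open", "Helpdesk", "2020-01-01 09:00"),
    ("C1", "In analysis", "IT", "2020-01-01 11:30"),
    ("C1", "Closed", "IT", "2020-01-02 11:30"),
    ("C2", "Open", "Helpdesk", "2020-01-03 08:00"),
    ("C2", "Closed", "Helpdesk", "2020-01-03 08:45"),
]


def derive_from_and_duration(rows):
    """Return (case, status_from, status_to, hours_since_previous_event) tuples."""
    parsed = [
        (case, status, dept, datetime.strptime(ts, "%Y-%m-%d %H:%M"))
        for case, status, dept, ts in rows
    ]
    # Order events per case by time so that "previous event" is well defined.
    parsed.sort(key=lambda r: (r[0], r[3]))
    out = []
    for case, group in groupby(parsed, key=lambda r: r[0]):
        prev = None  # (status, timestamp) of the previous event in this case
        for _, status, _dept, ts in group:
            status_from = prev[0] if prev else None
            hours = (ts - prev[1]).total_seconds() / 3600 if prev else None
            out.append((case, status_from, status, hours))
            prev = (status, ts)
    return out


derived = derive_from_and_duration(events)
for row in derived:
    print(row)
```

The first event of each case has no predecessor, so its "status from" and duration stay empty (`None`).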
So, I would suggest importing your data with this column mapping and seeing what comes out. You can load your data as a CSV file into ProM and run the conversion algorithm.
We usually have data in this shape, where each row is one event (and not where each row is a case).
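The "one row per case" layout from the exercises is only a different view of the same events. As a rough sketch (plain Python, outside ProM; the sample rows and the function name `traces_per_case` are invented for illustration), the rows of an event-per-row table can be grouped into one ordered list of activities per case:

```python
from collections import defaultdict

# Hypothetical event rows: (case, activity, timestamp).
# ISO-style timestamps are assumed, so sorting them as strings sorts them by time.
events = [
    ("C2", "Closed", "2020-01-03T08:45"),
    ("C1", "Open", "2020-01-01T09:00"),
    ("C1", "Closed", "2020-01-02T11:30"),
    ("C2", "Open", "2020-01-03T08:00"),
    ("C1", "In analysis", "2020-01-01T11:30"),
]


def traces_per_case(rows):
    """Group event rows into one time-ordered list of activities per case."""
    by_case = defaultdict(list)
    for case, activity, _ts in sorted(rows, key=lambda r: (r[0], r[2])):
        by_case[case].append(activity)
    return dict(by_case)


traces = traces_per_case(events)
for case, trace in traces.items():
    print(case, " -> ".join(trace))
```

So there should be no need to reshape the data by hand: the event-per-row table already contains everything needed to reconstruct the per-case view.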
Joos Buijs
Senior Data Scientist and process mining expert at APG (Dutch pension fund executor).
Previously Assistant Professor in Process Mining at Eindhoven University of Technology
Hi Joos,
Thanks for your answer. The term "activity" is a bit confusing for me.
Should I consider the "status to" as the activity, or the transition? I get totally different results when loading the CSV.
For example, if I consider "In analysis" (a status) as an activity, I get about 25 different activities, while if I consider "Open -> In analysis" (a transition) as an activity, I get more than 100 different combinations.
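The difference in counts is what one would expect: with 25 statuses there are up to 25 × 24 = 600 possible transitions between distinct statuses, so the transition encoding grows much faster. A small sketch of the two encodings, with invented sample rows and a hypothetical helper name `distinct_activities`:

```python
# Hypothetical rows: (status_from, status_to); None marks the first event of a case.
rows = [
    (None, "Open"),
    ("Open", "In analysis"),
    ("In analysis", "Waiting"),
    ("Waiting", "In analysis"),
    ("In analysis", "Closed"),
    ("Open", "Closed"),
]


def distinct_activities(pairs):
    """Return the distinct activity labels under both encodings."""
    # Encoding 1: the status that is reached is the activity.
    statuses = {to for _frm, to in pairs}
    # Encoding 2: the transition "from -> to" is the activity.
    transitions = {f"{frm} -> {to}" for frm, to in pairs if frm is not None}
    return statuses, transitions


statuses, transitions = distinct_activities(rows)
print(len(statuses), "statuses:", sorted(statuses))
print(len(transitions), "transitions:", sorted(transitions))
```

Even in this tiny example the transition encoding produces more labels (5) than the status encoding (4).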
Is there a beginner's guide with best practices? (I followed the Coursera course two years ago.)
Hi Michalis,
I think your first option would work best, i.e. use the status as the activity.
If you would like to know more, I can recommend another process mining MOOC we developed, titled "Process mining with ProM":
This might already answer your questions in the first week.
Joos Buijs
Thanks. I will check it out.