Dataset conversion error from CSV to XES (ProM 6) - Duplicate events observed
As part of the BPI Challenge 2017, we joined the application log and the offer log that were provided. This gave us a consolidated CSV file of 1.21M rows. However, after converting the CSV to XES using ProM 6, we observed that the number of events is about double the number of rows (around 2.3M).
Below are the settings which we used in ProM.
We mapped the cases (Application IDs) to the events and added the timestamps (startTime and completeTime) in the required format.
The remaining settings are shown in the screenshot.
Kindly help, as the number of events should ideally be 1.2M.
Thank you
Akshar Solanki
Dear Akshar,
This makes sense. Each row in the table gets converted into a start event (based on the start event time) and an end event (based on the end event time). Hence there are twice as many events as rows.
If you want to remove events and focus only on the complete events, for example, you need to filter the log afterwards.
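To make the doubling concrete, here is a minimal Python sketch of the idea. It is not ProM's actual conversion code. The sample rows and the column names `case_id` and `activity` are made up for illustration; `startTime` and `completeTime` follow the column names mentioned in the question, and the attribute keys are the standard XES ones. Each row yields one start event and one complete event, and filtering on the lifecycle transition brings the count back to one event per row.

```python
import csv
import io

# Hypothetical two-row CSV; the rows and the case_id/activity column
# names are assumptions for illustration only.
SAMPLE_CSV = """case_id,activity,startTime,completeTime
App_1,A_Create Application,2016-01-01 09:00:00,2016-01-01 09:05:00
App_1,O_Create Offer,2016-01-01 10:00:00,2016-01-01 10:20:00
"""


def rows_to_events(rows):
    """Emit one 'start' and one 'complete' event for every CSV row."""
    events = []
    for row in rows:
        for transition, column in (("start", "startTime"),
                                   ("complete", "completeTime")):
            events.append({
                "case": row["case_id"],
                "concept:name": row["activity"],
                "lifecycle:transition": transition,
                "time:timestamp": row[column],
            })
    return events


def keep_complete(events):
    """Keep only the events whose lifecycle transition is 'complete'."""
    return [e for e in events if e["lifecycle:transition"] == "complete"]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
events = rows_to_events(rows)
complete_only = keep_complete(events)

print(len(rows), len(events), len(complete_only))  # prints: 2 4 2
```

In ProM itself the same effect is obtained by applying a filter to the imported log rather than by running a script; the sketch only shows why the event count is twice the row count.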
Dear bfvdonge,
Thank you for your reply. It really helps a lot.