
Cumulative net worth - different values between the .xes and .csv datasets

I have been using both the .xes and the .csv dataset, available on the challenge webpage, with different software.

The .xes file, when opened in ProM and filtered on event attributes, shows a maximum Net worth value of 28.405.633,00 EUR (about 28,4 million), that is 2,84*10exp7. It refers to Logistic services, Road Packed (Purch. Doc. 4507000684, item 10, Remove Payment Block).

The .csv file, when opened with various commercial tools (where the dataset is not already built in by the software vendor), shows a much bigger figure.

I checked the table connecting the .xes file to PowerBI and it shows, for the same trace, a value of 2,84*10exp14. The table needs a transformation, because the net worth figures are written with a decimal zero that gets converted into an integer digit, so one zero has to be taken away, but the value is still 2,84*10exp13.
Sorting the Net Worth column in descending order shows weird figures for the following purchase documents: 4507000684, 4507000430, 4507001930, 4507004994.
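The "decimal zero converted into integer" effect described above can be illustrated with a minimal Python sketch. It assumes the raw cell in the .csv is written with a dot as decimal separator (the string `"28405633.0"` is a hypothetical example, and `parse_amount` is a made-up helper, not part of any tool mentioned here). It only reproduces the factor of ten from the trailing zero; it does not explain the rest of the gap.

```python
def parse_amount(text, decimal_sep=".", thousands_sep=","):
    """Parse a numeric string using explicitly chosen separators."""
    # Drop grouping characters first, then normalise the decimal mark.
    cleaned = text.strip().replace(thousands_sep, "")
    if decimal_sep != ".":
        cleaned = cleaned.replace(decimal_sep, ".")
    return float(cleaned)


raw = "28405633.0"  # hypothetical raw cell value, dot as decimal separator

# Read with the separators the file is assumed to use.
as_written = parse_amount(raw, decimal_sep=".", thousands_sep=",")

# Read as a European-style locale would: the dot is treated as a
# thousands separator and removed, so the trailing zero joins the integer.
misread = parse_amount(raw, decimal_sep=",", thousands_sep=".")

print(as_written)            # value as intended
print(misread)               # ten times larger
print(misread / as_written)  # inflation factor caused by the misread
```

If the import tool lets you choose the locale or decimal separator for the column, setting it explicitly before the type conversion should avoid having to strip the extra zero afterwards.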

We could filter out those traces; nonetheless, it is advisable to double-check the .csv and .xes files.

Answers

  • I must correct myself: I checked the .csv file with PowerBI (not the .xes). Findings remain anyway.
  • Hi Lorenzo,
    I just checked both files for the maximum event Net worth value of Purch. Doc. 4507000684, and the values are identical: 2.8405633E7.
    Maybe double-check the transformations you mentioned.
    :)
  • Thank you for your answer. First, my fault: the maximum value is 2.89945303E7 (EUR, Purch. Doc. 4507004994). This is correct in ProM with the .xes file and in other software with a prebuilt dataset (Celonis and Disco), as well as in tools where I uploaded the .csv file (TimelinePI, Minit and Lana). That is for the process mining.
    Moving on to analytics, reading the .csv file in Alteryx gives the same result.
    If I read the .csv file with PowerBI, I get the weird figures mentioned in my first comment.
    It is not a big issue for the purpose of mining the process. It could become one if you use different tools to extend the range of visualizations you want to achieve.
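One way to double-check a given import, as suggested in this thread, is to compare the per-document maximum against a load that is known to be good. The sketch below is only an illustration: the reference values are the two maxima quoted above, while the "candidate" rows and the helper names `max_by_document` and `find_mismatches` are hypothetical and stand in for whatever a second tool returns.

```python
import math


def max_by_document(rows):
    """Return the maximum value seen for each purchase document."""
    result = {}
    for doc, value in rows:
        if doc not in result or value > result[doc]:
            result[doc] = value
    return result


def find_mismatches(reference, candidate, rel_tol=1e-9):
    """List documents whose maxima differ, or that are missing in candidate."""
    mismatches = []
    for doc, ref_value in reference.items():
        cand_value = candidate.get(doc)
        if cand_value is None or not math.isclose(ref_value, cand_value, rel_tol=rel_tol):
            mismatches.append(doc)
    return sorted(mismatches)


# Reference maxima as quoted in the thread.
reference_rows = [("4507000684", 2.8405633e7), ("4507004994", 2.89945303e7)]

# Hypothetical second load: one document inflated, one unchanged.
candidate_rows = [("4507000684", 2.8405633e13), ("4507004994", 2.89945303e7)]

suspects = find_mismatches(max_by_document(reference_rows),
                           max_by_document(candidate_rows))
print(suspects)  # documents whose maximum differs between the two loads
```

Comparing maxima rather than every row keeps the check cheap and points directly at the purchase documents that need a closer look.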