Different event labels yield different models.
We made this confusing observation and would be interested in any
idea on what's causing it: In the context of an analysis with the Fuzzy Miner (Mine for a Fuzzy Model) in ProM 6.1, we changed the event labels in our data from German to
English. This resulted in a different model for the data set
with the English labels, though nothing else was changed. We checked if
the event frequencies are identical between the two label sets: they
are. We even created a third data set with labels different from the
other two, and guess what, this yielded yet another
model, not identical to the first or the second. Our conclusion at this
stage is that the choice of labels affects the outcome of the mining.
Clearly, this should not be the case! Is there anything trivial (such as
the length of the labels) that might cause
this? We might have overlooked something embarrassingly simple, but
we'd be really interested in what this is. Thanks in advance.
We made the same observation in an older version of ProM (5.0): different event labels yield different fuzzy models.
Best Regards,
Christoph
Best Answer
Dear Christoph,
This is expected behavior given how the Fuzzy Miner is implemented, if you use the default settings. By default, the "data value correlation" binary metric is enabled. When this metric is selected, the activity names are compared by their string edit distance, and the resulting distance value is used when generating the model. Changing the labels therefore changes these distances, and with them the model.
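To illustrate the effect, here is a minimal Python sketch of a name-based similarity built on Levenshtein edit distance. It is not the Fuzzy Miner's actual code: the normalization in `name_similarity` and the example labels are assumptions made up for illustration. It only shows that the same pair of activities can get a different similarity value once the labels are translated.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def name_similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 for identical names, 0.0 for maximally different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# The same two activities under two label sets (labels invented for this example).
german = ("Antrag pruefen", "Antrag ablehnen")
english = ("Check application", "Reject application")

print("German labels: ", name_similarity(*german))
print("English labels:", name_similarity(*english))
```

The two printed values differ although the underlying events and their frequencies are the same, so any metric that feeds such a value into the model will react to relabelling. Disabling the name-based metric in the miner's settings should make the model independent of the labels.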
Hope this clarifies your question. Should you have any further questions, please let us know.
Kind Regards,
JC