Metrics for evaluation and ProDiGen

syjing628 · June 2019

Hey all,

I am currently developing a process mining algorithm and have run into two problems.

Problem one:

My algorithm consists of a number of loops. In each loop, I want to evaluate a candidate by four metrics: fitness, precision, generalization, and simplicity. Can anybody tell me how I can implement this?

Problem two:

I want to compare my algorithm with the ProDiGen algorithm, but I cannot find the package. Can anybody tell me where I can find it?

Thank you in advance.

Regards.

Si-Yuan Jing

hverbeek · June 2019

Dear Si-Yuan Jing,

For the metrics, you could have a look at the way it is done in the Evolutionary Tree Miner, see the sources on https://svn.win.tue.nl/trac/prom/browser/Packages/EvolutionaryTreeMiner/Trunk/src/org/processmining/plugins/etm/fitness/metrics. Note that these metrics assume the model to be a process tree; I'm not sure whether this fits your needs.

As far as I know, the ProDiGen algorithm is not available in any version of ProM that we distribute. I've found a link to a ProDiGen website (http://tec.citius.usc.es/SoftLearn/ProDiGen.html) but that results in an error message. Perhaps you could contact the authors of the paper (see for example https://link.springer.com/chapter/10.1007/978-3-319-10172-9_8) directly?

Kind regards,

Eric.

syjing628 · June 2019

Dear Eric,

Thanks for your help.

I also wonder whether there is a package that provides these metrics for Petri nets. Right now, my algorithm uses PNetReplayer, and I compute the fitness value with the following formula. Is it right?

total_fitness = Sigma(trace_fitness * trace_num) / total_trace_num
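
Read as a frequency-weighted average over distinct traces, the formula above could be sketched as follows. This is only an illustration in plain Python, not ProM or PNetReplayer code: the function name `average_trace_fitness` and the `(trace_fitness, trace_num)` pair layout are assumptions made for the example, with trace_num taken to be the number of times a distinct trace occurs in the log.

```python
def average_trace_fitness(variants):
    """Frequency-weighted average of per-trace fitness values.

    `variants` is a hypothetical input: an iterable of
    (trace_fitness, trace_num) pairs, one pair per distinct trace,
    where trace_num is how often that trace occurs in the log.
    """
    pairs = list(variants)  # allow generators; we iterate twice
    total_trace_num = sum(trace_num for _, trace_num in pairs)
    if total_trace_num == 0:
        raise ValueError("the log contains no traces")
    weighted_sum = sum(fitness * trace_num for fitness, trace_num in pairs)
    return weighted_sum / total_trace_num


# Example: one trace with fitness 1.0 seen 3 times,
# another with fitness 0.5 seen once.
example_fitness = average_trace_fitness([(1.0, 3), (0.5, 1)])
print(example_fitness)  # 0.875
```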

Kind regards,

Si-Yuan Jing

hverbeek · June 2019

Dear Si-Yuan Jing,

This depends on which fitness metric you want to have. Your formula computes the average trace fitness, which is not exactly the same as the log fitness as reported by the PNetReplayer. The difference between the two is that for the average trace fitness the fitness values are accumulated, whereas for the log fitness the replay costs are accumulated.

Kind regards,

Eric.

syjing628 · June 2019

Dear Eric,

In the above formula, "trace_fitness" is obtained from SyncReplayResult.getInfo(PNRepResult.TRACEFITNESS).

I studied the code of org.processmining.plugins.astar.petrinet.AbstractPetrinetReplayer.java.
In lines 382-386, the algorithm puts the trace fitness into a SyncReplayResult object. I am not sure whether it is the log fitness or not. If not, where can I find the code that calculates the log fitness?

Additionally, I also wonder where I can find a tool that calculates the precision, generalization, and simplicity of a Petri net model.

Regards,

Si-Yuan Jing

hverbeek · June 2019

Dear Si-Yuan Jing,

No, this is not the log fitness. Every SyncReplayResult corresponds to one alignment, and its fitness to a trace fitness. Unfortunately, the trace fitness values alone are not sufficient to compute the log fitness. In short, both the trace fitness and the log fitness are fractions. To compute the log fitness one needs to divide the sum of the trace fitness numerators by the sum of the trace fitness denominators. For example, if we have trace fitnesses 2/4 and 3/12, this would result in 5/16, which is not the same as for the trace fitnesses 1/2 (=2/4) and 1/4 (=3/12), which would result in 2/6 (>5/16).
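
The arithmetic in this example can be checked with a small sketch. It is plain Python rather than ProM code, and the function name `log_fitness` and the `(numerator, denominator)` pair layout are assumptions made for illustration; how the numerators and denominators would be extracted from a replay result is not shown here.

```python
from fractions import Fraction


def log_fitness(trace_fractions):
    """Sum of numerators divided by sum of denominators.

    `trace_fractions` is a hypothetical input: an iterable of
    (numerator, denominator) integer pairs, one per trace, kept
    unreduced so that the original magnitudes are preserved.
    """
    pairs = list(trace_fractions)  # allow generators; we iterate twice
    total_numerator = sum(num for num, _ in pairs)
    total_denominator = sum(den for _, den in pairs)
    if total_denominator == 0:
        raise ValueError("denominators sum to zero")
    return Fraction(total_numerator, total_denominator)


# Same trace fitness values (0.5 and 0.25), different underlying fractions.
from_unreduced = log_fitness([(2, 4), (3, 12)])
from_reduced = log_fitness([(1, 2), (1, 4)])

print(from_unreduced)  # 5/16
print(from_reduced)    # 1/3, i.e. 2/6
```

Both inputs describe the same two trace fitness values, yet the aggregated results differ, which is why the values alone do not determine the log fitness.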

As far as I know, the PNetReplayer does not compute the log fitness.

For the other metrics, you would have to check the literature. There exist different metrics for precision and simplicity. ProM includes some tools to compute these metrics, I guess. Perhaps https://svn.win.tue.nl/trac/prom/wiki/ProM69/Plugins can be of help, which lists all plug-ins in ProM 6.9. You could look for "precision" and "generalization"; for "simplicity" you could use the "Show Petri-net Metrics" plug-in.

Kind regards,

Eric.

syjing628 · June 2019

Dear Eric,

Thanks so much! It helps me a lot.

Regards,

Si-Yuan Jing
