This is Flux Capacitor, the company weblog of Fluxicon.

ProM Tips: Which Mining Algorithm Should You Use?

Probably the best-known and most popular process mining tool available is ProM, an open source toolkit developed at Eindhoven University of Technology. ProM is a good choice for exploring process mining because it has consistently been at the forefront of that technology [1].

If you start up ProM for the first time to try out process mining, the number of available plugins (almost 300) can be daunting. Count only the plugins that discover a process model, and you still end up with at least 16.

What not to do

Many people have read about the alpha-algorithm in some paper, or in the ProM tutorial, and just keep using that one. Don’t do this. The alpha-algorithm is beautiful from a scientific perspective, because it can be formalized in 8 lines (see page 83 in this presentation) and because interesting properties can be proven about it.

For real-life logs, however, the alpha-algorithm is almost never the right choice. It will give you a result, of course (it always does), but the result won’t be good. So, don’t use it.
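To see why noise is such a problem, here is a minimal sketch of the ordering relations that the alpha-algorithm builds on (the so-called footprint). It is not the full algorithm and not ProM code: the function names and the toy log are made up for illustration. It only shows that a single out-of-order trace is enough to turn a causal relation into a parallel one.

```python
from itertools import product


def directly_follows(log):
    """Return the set of pairs (a, b) where b directly follows a in some trace."""
    return {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}


def footprint(log):
    """Classify each ordered activity pair as '->', '<-', '||' or '#'."""
    df = directly_follows(log)
    activities = sorted({a for t in log for a in t})
    relations = {}
    for a, b in product(activities, repeat=2):
        if (a, b) in df and (b, a) in df:
            relations[(a, b)] = "||"  # seen in both orders: treated as parallel
        elif (a, b) in df:
            relations[(a, b)] = "->"  # seen only as a then b: treated as causal
        elif (b, a) in df:
            relations[(a, b)] = "<-"
        else:
            relations[(a, b)] = "#"   # never seen next to each other
    return relations


# Hypothetical toy log: 99 well-behaved traces plus one noisy trace.
clean_log = [("a", "b", "c")] * 99
noisy_log = clean_log + [("a", "c", "b")]

clean = footprint(clean_log)
noisy = footprint(noisy_log)
print(clean[("b", "c")])  # ->
print(noisy[("b", "c")])  # ||
```

Because the relations ignore frequencies, the one noisy trace counts just as much as the other 99, which is one reason the result degrades on real-life logs.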

The 3 recommended Mining Algorithms

So, which algorithm should you use? I recommend the following three process discovery plugins in ProM.

1. Heuristic miner

The Heuristic miner was the second process mining algorithm, closely following the alpha-algorithm. It was developed by Dr. Ton Weijters, who used a heuristic approach to address many problems with the alpha-algorithm, making this algorithm much more suitable in practice.

The Heuristic miner (previously Little Thumb) derives XOR and AND connectors from dependency relations. It can abstract from exceptional behavior and noise (by leaving out edges) and, therefore, is also suitable for many real-life logs.
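As a rough illustration of how frequency-based dependency relations can filter out noise, the sketch below computes the dependency measure usually given for the Heuristic miner, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), and keeps only the edges above a threshold. The function names, the toy log, and the threshold of 0.9 are assumptions for this example, self-loops are simply skipped, and the derivation of XOR and AND connectors is left out entirely.

```python
from collections import Counter


def directly_follows_counts(log):
    """Count how often activity b directly follows activity a."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Dependency measure between a and b, a value between -1 and 1."""
    ab = counts[(a, b)]
    ba = counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


def dependency_graph(log, threshold=0.9):
    """Keep only the edges whose dependency measure reaches the threshold."""
    counts = directly_follows_counts(log)
    return {
        (a, b)
        for (a, b) in counts
        if a != b and dependency(counts, a, b) >= threshold
    }


# Hypothetical toy log: 99 regular traces plus one noisy trace.
noisy_log = [("a", "b", "c")] * 99 + [("a", "c", "b")]
counts = directly_follows_counts(noisy_log)

print(round(dependency(counts, "b", "c"), 2))  # 0.97
print(round(dependency(counts, "a", "c"), 2))  # 0.5
print(sorted(dependency_graph(noisy_log)))     # [('a', 'b'), ('b', 'c')]
```

The edges that occur only in the noisy trace fall below the threshold and are left out, while the main flow survives.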

One of the advantages is that a Heuristic net can be converted to other types of process models, such as a Petri net for further analysis in ProM.

2. Fuzzy miner

The Fuzzy miner is one of the younger process discovery algorithms, and was developed by Fluxicon co-founder Christian W. Günther in 2007. It is the first algorithm to directly address the problems of large numbers of activities and highly unstructured behavior.

The Fuzzy miner uses significance/correlation metrics to interactively simplify the process model at the desired level of abstraction. Unlike the Heuristic miner, it can also leave out less important activities (or hide them in clusters) if you have hundreds of them.
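The idea behind this simplification can be sketched in a few lines. This is a heavily simplified illustration, not the actual metrics or rules of the Fuzzy miner: the significance and correlation values, the cutoffs, and all names below are invented for the example. Activities above the significance cutoff are kept, insignificant activities that are strongly correlated with a neighbor are hidden in a cluster, and the rest are left out.

```python
def simplify(node_significance, edge_correlation, node_cutoff, correlation_cutoff):
    """Split activities into kept, clustered, and removed sets."""
    kept, clustered, removed = set(), set(), set()
    for node, significance in node_significance.items():
        if significance >= node_cutoff:
            kept.add(node)
            continue
        # Strongest correlation between this activity and any neighbor.
        best = max(
            (c for (a, b), c in edge_correlation.items() if node in (a, b)),
            default=0.0,
        )
        if best >= correlation_cutoff:
            clustered.add(node)  # insignificant but correlated: hide in a cluster
        else:
            removed.add(node)    # insignificant and uncorrelated: leave out
    return kept, clustered, removed


# Hypothetical significance and correlation values between 0 and 1.
significance = {
    "register": 1.0,
    "check": 0.9,
    "log_note": 0.1,
    "print_label": 0.15,
    "archive": 0.05,
}
correlation = {
    ("register", "check"): 0.6,
    ("log_note", "print_label"): 0.8,
    ("check", "archive"): 0.2,
}

kept, clustered, removed = simplify(
    significance, correlation, node_cutoff=0.3, correlation_cutoff=0.5
)
print(sorted(kept))       # ['check', 'register']
print(sorted(clustered))  # ['log_note', 'print_label']
print(sorted(removed))    # ['archive']
```

Raising or lowering the two cutoffs corresponds to moving towards a more abstract or a more detailed model.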

The fuzzy model cannot be converted to other types of process modeling languages, but you can use it to animate the event log on top of the created model to get a feeling for the dynamic process behavior.

[Update: The process mining algorithm in Disco is based on the Fuzzy miner. You can read more about how the Disco miner has been further developed based on the Fuzzy miner in the Disco Tour here.]

3. Multi-phase miner

The Multi-phase miner was the first algorithm to explicitly use the OR split/join semantics, as found in EPCs, enabling it to express complex behavior in relatively well-structured models. It was developed by Dr. Boudewijn van Dongen, a process mining veteran and longtime leading developer of ProM.

The Multi-phase miner folds XOR, AND, and OR connectors from so-called runs and displays the resulting model as an EPC. The EPC can then be exported to ARIS (e.g., in ARIS graph format) and further processed from there.

One of the advantages of the Multi-phase miner is that it constructs a model that always “fits” the complete event log (more on that in a later post). However, it is seldom useful for more complex processes because the model becomes unreadable.

What do you think?


In this post, I have tried to give you a pragmatic recommendation for which mining algorithm you should use, and when. So, while there may be other plugins that are fascinating from a scientific standpoint, I have focused here on what works in practice.

Let me know if you disagree, and please share your experiences in the comments!


  1. Another reason that many people like ProM, obviously, is that it’s free. But that is a topic for an entirely different discussion… 

Comments (27)

Hi! That was a good explanation of the difference between the practical and theoretical use of the process mining plugins. I had some doubts about this because I have just started to study process mining, and in the beginning it’s difficult to know what is for what =P.
In my work (a bachelor’s degree monograph) I’m writing about process mining, and it will not be possible to use real logs to show how ProM works :(, so I’m going to simulate a process model in CPN Tools and explain the mining and analysis possibilities.
So, my main doubt was whether I’m doing the right thing by describing the alpha-algorithm and using it to mine a simulated event log. Since it is good for this type of (scientific) work, as you have said, I’ve realized that everything is OK. Thanks for the post!

Hi Samuel,

I definitely think that the alpha-algorithm is great for explaining process mining, and it may work on your simulated process data. Nevertheless, I would still recommend evaluating the quality (fitness) of the mined model. You can use the Conformance Checker, which is available from the analysis menu directly on the mined Petri net (see http://prom.win.tue.nl/research/wiki/online/conformance_checker).

Doing this will give you confidence in the result, and in case it is not good you could still try the Heuristic miner, for example.

Good luck with your project!

Hi Anne,

Thanks for the recommendation! I agree with you that it’s important to evaluate the quality of a mined process model =D. I think that showing it will improve my project.

best regards.

Thanks for the descriptions! I am starting to use ProM in my master’s project, and this helps to understand the purposes of some of the algorithms.

Great! Thanks, Rob

Hi Anne,
do you have a longer comparison of the algorithm plugins (short description, when to use, strengths and weaknesses)? Or is one available somewhere?

Thank you and best regards,

Wikan

Hi Wikan,

I recommend taking a look at this recent publication from Leuven, where you will find a comprehensive assessment of a number of algorithms based on artificial and real-life logs.

Best regards,
Anne

Hi Anne,
I’ve recently started to improve my knowledge of process mining algorithms, in particular the Fuzzy miner. I’ve read some publications about it and tried to use it in ProM. However, I cannot understand how the metrics aggregation works. Take the unary metrics, frequency and routing: after I have calculated them, what is the “formula” to aggregate them and obtain the value that is actually displayed on the node in the final graph? Could you suggest some material where this is explained? Thank you!

Hi Laura,

I recommend looking at Christian’s thesis here (and the source code in ProM).

Hi Anne,
thank you for your reply, this thesis looks very useful to me!

Best Regards,
Laura

[…] good mining tool to start with is the Fuzzy miner: Go to the menu and select Mining –> Raw ExampleLog.mxml.gz (unfiltered) –> Fuzzy Miner […]

Hi, I’m facing a problem with mining algorithms. I would like to know exactly what the 4 classes of process mining algorithms are, if possible with references, please.

Hi Cedrico,

What do you mean by 4 classes of process mining algorithms? Are you referring to some categories that you have seen? Which ones?

Hi, I have ProM 6 and mined a Fuzzy model, but now I can’t find the Animation option and cannot view the model as an animation. How can I do this? Thanks!

I have tried also the steps here https://www.google.ro/url?sa=t&rct=j&q=&esrc=s&source=web&cd=4&ved=0CDIQFjAD&url=https%3A%2F%2Fsvn.win.tue.nl%2Frepos%2Fprom%2FDocumentation%2FPackage%2520Fuzzy.docx&ei=uza2VOjaGYH1UumNhIAP&usg=AFQjCNHTCoAdGibWoL20sJYh7yB9n9vl7A&bvm=bv.83640239,d.d24&cad=rja, but when getting to the step: Animate Event Log in Fuzzy Instance, the Start button is inactive, I can’t click on it… Why is this happening? Thank you!

I found it, I had to select the appropriate log file to play on the animation, didn’t notice that it wasn’t selected! Silly me 😀

Hi Maria, Glad you found it!

Note that the best place to ask your questions about ProM is the ProM forum here: http://www.win.tue.nl/promforum/

Hi Anne,

I want to join the ProM forum, but I don’t have an invitation code. What can I do?

Kind Regards!

Theogene

Hi Theogene,

You can contact h DOT m DOT w DOT verbeek AT tue DOT nl to request an invitation.

Best,
Anne

I want to join ProM forum, but I don’t have invitation code. What can I do?

The address “h DOT m DOT w DOT verbeek AT tue DOT nl” in the “To” field was not recognized. Please make sure that all addresses are properly formed.

Is this e-mail address valid?

Hi Anne,

I am working on fraud detection using ProM. Recently I started exploring some of the conformance checking plugins in ProM. Could you suggest which algorithm I should learn first (for fraud detection)?

Thanks in advance,

regards,

Hi Evi, I recommend asking your question in the ProM forum here: http://www.win.tue.nl/promforum/

Hi. I need to know the precision formula in ProM.
How can I access the conformance formula in ProM?
Thank you

Hi janan, you can best ask your ProM questions in the ProM forum here: http://www.win.tue.nl/promforum/

Hello,

I am new to process mining, as I am working on this topic in my BSc project.

I wanted to know if there is a simplified article on the Model Repair plugin that provides a good overview of this plugin.

I also wanted to know if there is a tutorial on the Prediction plugin (Decision Trees) in ProM that operates on running instances.

A tutorial on Model Repair plugin would also be really helpful.

Much Appreciated.

Hi Menna, You can best ask your questions about ProM in the ProM forum here: http://www.win.tue.nl/promforum/

