You are reading Flux Capacitor, the company weblog of Fluxicon.
Here, we write about process intelligence, development, design, and everything that scratches our itch. Hope you like it!

How To Check Segregation of Duties with ProM

The segregation of duties, also called the 4-Eyes-Principle, is one way for organizations to reduce the risk of fraud. For example, it may not be allowed for the same person to initiate a purchase order and pay the invoice for the same item.

Segregation of duties is often controlled via role-based access management in the IT systems. However, there are situations in which after-the-fact verification (based on audit files) is needed.

Here are three examples:

  1. No preventive mechanisms are in place. Not every organization employs preventive mechanisms to ensure segregation of duties via IT controls. Sometimes there are simply not enough people to realize segregation of duties via separate roles. But auditors still have to prove that the 4-Eyes-Principle was obeyed in the operations.

  2. Changing roles create loopholes. Changing roles may create loopholes for bypassing segregation of duty IT controls and create a risk of fraud. For example, a person who initiated a purchase order in role A may over time obtain role B and thus be able to pay the open invoice after the role change. Even complex role management tools usually verify the risk of violation at a static point in time (not over time).

  3. Access management may have been circumvented. Processes often run across different systems, so the access controls of a single system may not cover the whole process. In today’s climate, auditors need more certainty than preventive controls and sampling alone can provide. By automatically checking 100% of the process log files for violations of segregation of duty constraints, auditors can provide a higher assurance.

In this post, I give you step-by-step instructions for checking segregation of duty constraints using Nitro and ProM.

1. Determine Segregation of Duty rules

Before you start, you need to know what the segregation of duty rules for your process are. For example, in a Purchase-to-Pay process it is most likely not allowed that the same person issues a purchase order and also approves it.

Here is an example from this ERP vendor blog. The matrix marks with an ‘X’ each pair of tasks that should be separated. The red marking highlights one of the task combinations that are not allowed:

In the rest of this post, I continue with the call center demo example used earlier. This way, even if you don’t have a log file that you want to check yourself, you can follow the steps using the demo file that comes with Nitro. (Download the free demo version of Nitro here.)

2. Import Audit File

Using Nitro the process log can be imported from a CSV or Excel file. The meaning of the columns is configured in the GUI.

You need to at least configure the following columns:

The other columns are optional. For example, you can configure the columns as shown in the screenshot above.

Now, the audit file can be exported in MXML format, which is needed for importing the data in ProM 5.2. (Download ProM here.)

3. Choose 2 Activities

After the import of the converted log file in ProM, start the LTL Checker by choosing ‘Analysis → Raw ExampleLog.mxml.gz (unfiltered) → LTL Checker‘ from the menu.

In the LTL Checker settings screen:

  1. Choose ‘exists_person_doing_task_A_and_B‘ from the list of pre-defined formulas. This is the formula that checks segregation of duties.
  2. Write down the names of the two activities that should not be performed by the same person for the same case.
  3. Click on ‘Check formula‘.
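
If you want to see what such a check does under the hood, here is a minimal Python sketch of the same idea. It is not the LTL Checker's implementation, just an illustration: the function name `find_sod_violations`, the tuple layout of the events, and the person names are all made up for this example. Only the two activity names come from the demo log.

```python
from collections import defaultdict


def find_sod_violations(events, task_a, task_b):
    """Return {case_id: [people]} for cases where one person did both tasks.

    `events` is an iterable of (case_id, activity, resource) tuples.
    This layout is an assumption for the sketch, not a Nitro or ProM format.
    """
    # For every case, remember who performed each of the two tasks.
    doers = defaultdict(lambda: {task_a: set(), task_b: set()})
    for case_id, activity, resource in events:
        if activity in (task_a, task_b):
            doers[case_id][activity].add(resource)

    # A case violates the rule if the two sets of people overlap.
    violations = {}
    for case_id, by_task in doers.items():
        both = by_task[task_a] & by_task[task_b]
        if both:
            violations[case_id] = sorted(both)
    return violations


# Hypothetical mini log: only case1 has the same person doing both tasks.
events = [
    ("case1", "Call Outbound", "Anna"),
    ("case1", "Email Outbound", "Anna"),
    ("case2", "Call Outbound", "Anna"),
    ("case2", "Email Outbound", "Ben"),
    ("case3", "Inbound Call", "Ben"),
]
violations = find_sod_violations(events, "Email Outbound", "Call Outbound")
print(violations)  # {'case1': ['Anna']}
```

Note that the check is done per case: Anna performs both activities overall, but only case1 counts as a violation because there she performed both for the same case.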

4. View and Export Violations

Now, potential violations are displayed and the details can be exported.

In the screenshot above you see the result for our segregation of duty check with respect to the activities ‘Email Outbound‘ and ‘Call Outbound‘.1

In total, there were 75 cases for which the segregation of duty rule was violated (‘Correct process instances‘ means that the formula could be matched) and 3810 cases were without problem (‘Incorrect process instances‘ means that the formula was not matched — so this is a bit counter-intuitive).

You can also switch between the “Correct” and “Incorrect” set of cases and inspect individual process instances. For example, in the screenshot above the case 3278 is visualized and the segregation of duty violation that was found is highlighted.

For further analysis in Excel, you can export the found violations by choosing ‘Exports → Correct instances → CSV for log Exporter‘ from the menu.

Discussion

Do you think checking segregation of duties after-the-fact makes sense? Have you needed it at some point in time? Which tools did you use, and what did you like or dislike about that solution?

Let us know in the comments.


  1. Granted, the example does not make any sense here. This call center process simply does not have any segregation of duty constraints. But I am sure you will have plenty of examples from your own processes.  


Process Mining for Usability Tests

You might have noticed: Products—and especially consumer electronics—are becoming more and more complex. As a result, people are not always able to deal with these complexities and usability becomes a distinguishing factor in brand reputation and customer satisfaction.

Process mining is a new technology that makes invisible process flows visible by analyzing existing log data in a bottom-up manner. Earlier, we saw how process mining can be applied to the test process of ASML and an HR process. But can process mining also help to improve the usability of products?

Usability Testing

In the context of the Master’s project of Pieter Hofstra, together with Jeroen Keijzers, Yuan Lu, and Ton Weijters, we investigated the applicability of process mining for usability tests1.

Usability tests, for example first-use consumer tests, can help to get early feedback from the field while there is still time to adapt the product before releasing it to the market. Traditional usability measures include mostly static information, for example:

The results typically do not reflect the temporal aspects of the test data. So, in this project, we looked at how process mining can be used to get insights into the actual user behavior.

Experimental Setup

For the project, a group of 29 Dutch volunteers (from age of 22 to 66) participated in a usability test for a new television. 19 participants were male and 10 were female. The usability test took place in a simulated living room to make them feel at home as much as possible.

The participants were asked to complete the following three tasks:

  1. Channel selection. After installation of the television, channel RTL 7 has been automatically programmed on channel 25. The participants were asked to put RTL 7 on channel 7.
  2. Dual screen. The ‘Dual screen’ function is innovative in comparison with previous versions of the product. It is one of the features promoted by marketing to sell the product. The participants were asked to watch the channels NEDERLAND 2 and NET 5 simultaneously.
  3. Digital picture. Another function that is new in comparison with previous versions of the television is the ‘Digital picture’ function, which lets users view digital pictures from a USB stick on the television screen.

The “correct procedures” to solve each of these three usability tasks are shown in the picture below. Further down, you can see the process models of the actual user behavior for the middle task (Dual Screen).



(Process models of the optimal user behavior for solving each of the three television usability tasks.)

The entire experiment took about 15-40 minutes per person, depending on the participants’ performance. During the experiment, the participants and the television screen were captured on a video camera. From these video recordings, an event log of the actual usage behavior was created semi-automatically. You can find more details about the event log creation in Pieter’s Master thesis.

Results

One of the goals of the study was to assess the effect of a consumer’s (product) knowledge on usability. People with high product knowledge are assumed to be more familiar with the product, and to have more experience in using it.

The test participants were divided into ‘High Knowledge’ (13 people) and ‘Low Knowledge’ (16 people) groups based on their knowledge ratings in the questionnaire2. Process mining was done on the usage logs of these two groups separately.



(Process model discovered for the ‘High Knowledge’ group performing the Dual Screen task. The numbers and coloring indicate frequencies.)

Look at the behavior of the ‘High Knowledge’ group performing the Dual Screen task above. One can nicely see the paths that were taken from the start to the end of the task.



(Process model discovered for the ‘Low Knowledge’ group performing the Dual Screen task. Compared to the ‘High Knowledge’ group this model is more complex showing more variability among the group of users.)

In the model above it is clearly visible that the people in the ‘Low Knowledge’ group were even further away from the optimal solution. One person even had to give up.

I find this a nice example of how visualizing actual user behavior can reveal usage patterns that provide both qualitative and quantitative feedback.

Not only for Consumers

Usage behavior can also be crucial in a business context. For example, in call centers there is an increasing use of analytics for operational performance management. Agents often have to switch between 4-6 different applications (Siebel, SAP, etc.) while handling a call. Desktop analysis tools can analyze the keystrokes of an agent and the resulting insight can be used to build an abstraction layer on top of the actual applications that matches the typical call flow.

Do you see other examples where understanding user behavior is important? Let us know in the comments.


  1. See details in P.P.H.J. Hofstra. Analysing the Effect of Consumer Knowledge on Product Usability Using Process Mining Techniques. Master’s thesis, Eindhoven University of Technology, Department of Industrial Design, Eindhoven, The Netherlands, 2009.  
  2. The knowledge level of each participant was measured by asking them questions assessing their “familiarity” and “expertise” with televisions and computers. For example, participants had to react to statements such as “Compared to most other people, I know less about televisions/computers” (familiarity) and “I usually talk with friends and colleagues about new developments regarding televisions/computers” (expertise) on a 5-point Likert scale (ranging from total disagreement to total agreement). For more details see Pieter’s thesis.  


Nitro 2.0.6

It has been more than two weeks since I told you about the 2.0.2 update for Nitro. In this time, we have steadily released further updates, the latest of which is version 2.0.6, released last Saturday.

For the most part, updates should not be something you have to worry about. We keep fixing bugs, improving performance, and introducing new features continually, and you will get them automatically via Nitro’s built-in auto-update feature. We believe that, instead of tracking and installing updates, your time is better spent on actually interesting stuff, hence our auto-update approach. Still, for some of you it may be interesting to learn about what we’ve been up to.

So we will keep you informed about the development of Nitro here on our blog, in irregular intervals. In this post I will highlight two noteworthy changes in Nitro 2.0.6. The technically inclined can find a more comprehensive list of changes below, and your copy of Nitro will auto-update to version 2.0.6 the next time you start it up (as described in my last post).

Avoid unsuitable attributes

Many event logs in CSV or Excel format have columns with information that is unsuitable for process mining. One example is a column which contains free-text comments. You cannot create a process model from that kind of data, since every event carries a unique value. Other examples are columns that have only one or two values over the whole data set. You may actually want to see that information in the converted log, but then again, maybe you’d rather not.

In version 2.0.3, we added a feature to Nitro which warns you about unsuitable attribute columns, such as the examples described earlier. In the above screenshot, you can see that Nitro now displays a warning badge in the column configuration panel when it thinks that column may be unsuitable. In this example, the column contains “more than 99%” unique values, which means that almost every event has a value in that column which is not repeated in any other row.

Nitro does not forbid you from using these columns for conversion. However, you may run into problems during conversion if unsuitable attributes are selected. Furthermore, these attributes almost always turn out to be unsuitable for analysis later on, or create problems in your analysis software. So, when you choose to use a column for an attribute and see that warning, we recommend that you remove it before converting the log, if possible.
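
To give you a feeling for what such a check might look like, here is a small Python sketch. It is only an illustration of the two heuristics described above and not Nitro's actual code: the function name `flag_unsuitable_columns` and the column names are hypothetical, and the "fewer than three distinct values" threshold is my reading of "only one or two values".

```python
def flag_unsuitable_columns(rows, unique_ratio=0.99, min_distinct=3):
    """Return {column: reason} for columns that look unsuitable.

    `rows` is a list of dicts (one per event), e.g. from csv.DictReader.
    Both thresholds are assumptions made for this sketch.
    """
    warnings = {}
    if not rows:
        return warnings
    for column in rows[0]:
        values = [row[column] for row in rows]
        distinct = len(set(values))
        if distinct / len(values) > unique_ratio:
            # E.g. free-text comments: nearly every row has its own value.
            warnings[column] = "almost all values unique"
        elif distinct < min_distinct:
            # E.g. a constant column: carries almost no information.
            warnings[column] = "only one or two distinct values"
    return warnings


# Hypothetical data: six events with an activity, a comment, and a constant.
rows = [
    {"activity": activity, "comment": f"note {i}", "system": "CRM"}
    for i, activity in enumerate(["Open", "Handle", "Close"] * 2)
]
print(flag_unsuitable_columns(rows))
```

In this example the "comment" and "system" columns are flagged, while "activity" passes because its three values repeat across events.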

Redesigned case analysis charts

With version 2.0 of Nitro we released a completely redesigned analysis view, complete with charts for analyzing the structure of your log. For version 2.0.5 we have redesigned two of these charts again, to be more useful.

In the overview section of the analysis view, the “Case duration” and “Events per case” charts now behave differently than the other histograms in the analysis view. Rather than featuring each value in a separate column, our redesigned charts actually take advantage of the fact that they display actual ranges of values, rather than a set of discrete values. Now, the horizontal axis corresponds to the range of values (i.e., duration or number of events), while the vertical axis shows the frequency of each value.

We think that this view better corresponds to typical statistical distribution charts, and that it gives you more actionable information about the structure of cases in your log.
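
The difference between the two chart styles is easy to see in code. The following Python sketch groups values into equal-width ranges instead of counting each value separately. It is only an illustration, not how Nitro draws its charts; the function name `binned_histogram` and the sample numbers are made up.

```python
from collections import Counter


def binned_histogram(values, bin_width):
    """Return {bin_start: count} for equal-width bins, including empty ones.

    Works for non-negative numbers such as "events per case" or case
    durations in whole days. The bin layout is an assumption of this sketch.
    """
    if not values:
        return {}
    counts = Counter(int(v // bin_width) for v in values)
    lowest, highest = min(counts), max(counts)
    # Walk over the whole range so that gaps show up as zero-height bars.
    return {i * bin_width: counts.get(i, 0) for i in range(lowest, highest + 1)}


# Hypothetical "events per case" values for seven cases.
events_per_case = [1, 2, 2, 3, 7, 8, 15]
print(binned_histogram(events_per_case, bin_width=5))
# {0: 4, 5: 2, 10: 0, 15: 1}
```

Keeping the empty bin at 10 is the point: on a range-based axis, the gap between the typical cases and the outlier with 15 events stays visible.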

Change log

For the sake of completeness, here you can find the list of changes per released version of Nitro:

Version 2.0.3
(10 March 2011)

Version 2.0.4
(11 March 2011)

Version 2.0.5
(17 March 2011)

Version 2.0.6
(19 March 2011)

That’s it

Thanks again for all your bug reports, feature requests, and general thoughts and ideas about Nitro! Please keep them coming, either through Nitro’s built-in feedback feature or by sending a mail to support@fluxicon.com!



How Process Mining compares to Standard Query Tools

If you have data and questions about your process, there are many powerful tools around that you can use to manipulate, query, and analyze the data to answer these questions. For example, SAS is often used by auditors to combine and filter data. Routines can be programmed and automated to a large extent.

So, how are these tools different from process mining?

The main difference is that you need to know what you are looking for if you use a query-based tool.

Process mining allows for a much more explorative analysis of your process, without the need to have all the questions in advance. Here are 2 examples.

Discovery

Process discovery takes the real execution logs of your process as input and generates a graphical model of what has been happening.
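
To make this a bit more tangible, here is a minimal Python sketch of the simplest building block behind many discovery techniques: counting which activity directly follows which other activity within a case. Real discovery algorithms, such as the ones in ProM, do considerably more than this. The function name `directly_follows`, the event layout, and the activity names are assumptions for the example.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count how often activity a is directly followed by activity b.

    `events` is an iterable of (case_id, timestamp, activity) tuples.
    Timestamps only need to be sortable (numbers or ISO strings).
    """
    # Rebuild the sequence of activities for every case, ordered by time.
    traces = defaultdict(list)
    for case_id, _, activity in sorted(events, key=lambda e: (e[0], e[1])):
        traces[case_id].append(activity)

    # Every pair of neighbouring activities becomes an arc in the graph.
    graph = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            graph[(a, b)] += 1
    return graph


# Hypothetical log with two cases; the second one skips the check.
events = [
    ("c1", 1, "Register"), ("c1", 2, "Check"), ("c1", 3, "Pay"),
    ("c2", 1, "Register"), ("c2", 2, "Pay"),
]
for (a, b), count in directly_follows(events).items():
    print(f"{a} -> {b}: {count}")
```

Nobody had to ask "are there cases that skip the check?" in advance. The arc from "Register" to "Pay" simply shows up in the result.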



Even if you are not sure what exactly you are looking for, process mining can provide you with an accurate picture of what your business process looks like in practice. This in turn may trigger questions that you would have never thought of in advance.

For example, in one of our customer projects we found out that advances that were made for some clients sometimes led to a double payment in the regular process. The advance payment process was manually managed and thought to be under control. But just from looking at the discovered process model it became clear that many more cases slipped through the manual control than people thought.

Conformance

Often, there already exists a description or a model of the process as it should be. By comparing the actual log data from the IT system with the ideal process, one can find out where deviations have occurred and how many.



Checking data against a complete process description is almost impossible to do in a standard query tool, because it is hard to capture the complete target process in a query.
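
For illustration, here is a deliberately simplified Python sketch of the comparison idea. It only checks whether each step in a case is an allowed successor of the previous step, which is much weaker than replaying the log on a complete process model as conformance checking techniques do. The function name `find_deviations`, the data layout, and the activity names are assumptions.

```python
def find_deviations(traces, allowed):
    """Return {case_id: [(a, b), ...]} with all steps not in `allowed`.

    `traces` maps a case ID to its ordered list of activities;
    `allowed` is the set of (activity, next_activity) pairs of the
    intended process. Start and end activities are not checked here.
    """
    deviations = {}
    for case_id, trace in traces.items():
        bad_steps = [(a, b) for a, b in zip(trace, trace[1:]) if (a, b) not in allowed]
        if bad_steps:
            deviations[case_id] = bad_steps
    return deviations


# Hypothetical target process: create, then approve, then pay.
allowed = {
    ("Create order", "Approve order"),
    ("Approve order", "Pay invoice"),
}
traces = {
    "c1": ["Create order", "Approve order", "Pay invoice"],
    "c2": ["Create order", "Pay invoice"],  # approval was skipped
}
print(find_deviations(traces, allowed))
# {'c2': [('Create order', 'Pay invoice')]}
```

Even this toy version tells you both where deviations occurred and, by counting the flagged cases, how many there are.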


Query tools are very powerful and can be best combined with process mining. Do you have any experience of using both? What are your observations?



Nitro 2.0.2

I just wanted to briefly let you know about two small updates we made to Nitro this week.

Fixing bugs…

On Tuesday, Joos alerted us to two bugs that were present in Nitro 2.0.0. The first one is especially annoying, since it led to the fact that you would see our “demo limitation” dialog every time after you exported a log — even if your log was below the demo limit (or the limit set by your ticket).

Export limit dialog

The only time you should see this dialog is of course in a situation like above, when you have actually exceeded the export limit.

The other bug was a problem where exporting a certain type of event logs to XES would result in files that could not be properly loaded by ProM 6. Both these bugs were fixed in Nitro 2.0.1, which was released yesterday.

…and adding features

Today we received a feature request from Martina, who asked whether it was possible to export the information shown in the analysis view to Excel. That is actually an awesome idea, and I wondered why I did not think of this in the first place. Sometimes viewing this information is only the first step, and you want to analyze it further with Excel or statistics software, or create some nice charts from it.

Export analysis data from Nitro

However, I did not want to clutter Nitro’s user interface with more buttons. In my experience, having a nice and clean user interface really helps to find your way around a software tool, and makes you more productive.

The solution I came up with is, in hindsight, rather obvious: When you right-click1 any table in Nitro’s analysis result, you can now choose to export this table’s data to a CSV2 file, which can be loaded in Excel and many other tools.

Get it while it’s hot

We have released Nitro 2.0.2 today, which incorporates both the bug fixes contained in 2.0.1 and the export of analysis data to CSV, as described above. How do you get the latest Nitro version?

A big thanks to all of you who have sent us your bug reports and suggestions for Nitro! We work hard to fix bugs as soon as we are aware of them, and we always try to implement your suggestions. Some of them just take a little longer, until we know how to do it right.

Thanks for your patience, and keep that feedback coming, either through Nitro’s built-in feedback or by sending us a mail to support@fluxicon.com!


  1. On Mac OS X, you can press the “control” button while you click. 
  2. Comma-separated values 


The Time is Ripe for Process Mining: Interview in I/O Magazine

From right: Eric, Wil, Boudewijn and myself

“Powerpoint reality” is what Wil van der Aalst calls the level of insight that organizations often have into their business processes. The processes that are modeled, communicated and put on Powerpoints are usually much more complex in reality than people think they are.

At the same time, IT systems record detailed information about the executed processes. These data can be automatically analyzed by process mining techniques to generate graphical process maps which bring the actual process reality into the picture.

In an interview for the Dutch I/O ICT magazine, Wil van der Aalst, Boudewijn van Dongen, Eric Verbeek and I talked to Karina Meerman about process mining. Here is the full article (in Dutch).

It is always a challenge to explain process mining to people outside the field. There is just so much context to be considered:

  1. There needs to be a realization of the importance of business processes and process thinking for improving the quality and efficiency of organizations. Otherwise the question is, “Why would you even want process models, be it in Powerpoint or not?”
  2. You need to have IT support in place for these processes, and the data that are collected by the systems need to fulfill certain minimum criteria. Also, the person you are talking to needs to be aware of the fact that this data is already available. Otherwise you would not believe that it is possible to automatically discover processes by looking at the data.

As for the latter point, Wil made it clear that there is more and more data, so this point is hardly an issue anymore1:

The amount of data grows exponentially. Earlier on, you went to a travel agent and you got a paper ticket. The number of transfer times was limited. Now, when you book a ticket online, that site contacts several airline companies and a payment system. All those events are recorded.

Wil explained that these data reflect the real world in an increasingly better way:

The time is ripe for process mining. The digital world is very close to the real world. The introduction of diagnosis-treatment combinations in hospitals — which mean that payments are only provided based on registered event logs — has triggered an explosion of data. And if you, for example, order a book at Bol.com then it does not matter whether the book really is in stock — that you can see it on a shelf — because if the information system says it is there, then it is there.

What about your experience? Do you find it difficult to explain process mining? And if so, which aspect or part of it do you find particularly hard to explain? Let us know in the comments.


  1. Freely translated by myself.  


Applying Process Mining to an HR Process

Last time, I showed you some results from a case study with ASML. Today, I want to talk about a process mining analysis that we performed for a customer’s internal HR process.

HR process

In Human Resources (HR), one of the typical processes is that the internal HR department reacts to requests and questions from employees of the company. For example, the employees may have questions about their contracts or training programs.

In the HR department we worked with, the service is delivered in a 3-line model: Easy cases can be resolved at the 1st line, while more complicated questions may need more activities and are either handled in the 2nd line or even with the help of an external specialist.

Goal of the analysis

The goal of the analysis was to get a clear picture of the current ‘As-is’ process because the company wants to deploy a new IT system in the HR department and use these insights to improve the process.

The event log

The screenshot below shows an anonymized fragment of the data that was extracted from the current HR system. More detailed information about the questions of the employees was available but has been removed here for confidentiality reasons.



Figure 1: The Input Data from the HR System contained information about the individual status changes for each handled case.

Using Nitro, we could easily convert these data into an event log that could then be analyzed with the process mining toolset ProM.

Furthermore, we could explore different views on the same data. For example, we chose to analyze the differences of the HR process for cases that are handled in the 1st line vs. those that need help from the 2nd line or the specialists.

Process mining results

Using process discovery, we could automatically discover an objective picture of the HR process in these 3 service lines (see below). One can see that cases in the 1st line are indeed directly handled, while in the 2nd line and with the specialist there are more steps necessary.



Figure 2: A process model of the HR process that was automatically discovered based on the event log of the HR system. The numbers and the coloring indicate the frequency of activities and followed paths.

Because the log data also contains information about the time of the status changes, we can dive deeper and analyze the timing behavior of the process.

For example, in the process fragment below one can see that there is considerable time lost when a case is scheduled for a specialist until it is actually picked up by the specialist.



Figure 3: A fragment of the discovered process model annotated with performance information at the arcs (in days).
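
If you wonder how such numbers on the arcs come about, the following Python sketch computes the average time between two consecutive status changes. It is an illustration only and not how ProM calculates its performance annotations. The function name `mean_transition_days`, the status names, and the dates are invented for this example.

```python
from collections import defaultdict
from datetime import datetime


def mean_transition_days(events):
    """Return {(status_a, status_b): mean days between the two}.

    `events` is an iterable of (case_id, iso_timestamp, status) tuples,
    e.g. ("c1", "2011-01-03T09:00:00", "Scheduled for specialist").
    """
    by_case = defaultdict(list)
    for case_id, timestamp, status in events:
        by_case[case_id].append((datetime.fromisoformat(timestamp), status))

    durations = defaultdict(list)
    for steps in by_case.values():
        steps.sort(key=lambda step: step[0])  # order status changes by time
        for (t1, a), (t2, b) in zip(steps, steps[1:]):
            durations[(a, b)].append((t2 - t1).total_seconds() / 86400)

    return {pair: sum(days) / len(days) for pair, days in durations.items()}


# Hypothetical data: two cases waiting 4 and 2 days for the specialist.
events = [
    ("c1", "2011-01-03T09:00:00", "Scheduled for specialist"),
    ("c1", "2011-01-07T09:00:00", "Picked up by specialist"),
    ("c2", "2011-01-10T09:00:00", "Scheduled for specialist"),
    ("c2", "2011-01-12T09:00:00", "Picked up by specialist"),
]
print(mean_transition_days(events))
# {('Scheduled for specialist', 'Picked up by specialist'): 3.0}
```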

Here are some further results from the analysis:

We also analyzed the root causes of overly long-lasting cases by comparing different topics asked by the employees to the HR department.

Bottom line

The main focus of the analysis was on the process flow and its variations, and on the target cycle time. While the cycle times did match the intended service level, the variation analysis was surprising: The top 3 process variants were followed in 80% of all cases, but there were 210 different process variants in total.
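
A variant is simply one distinct sequence of activities. To show how a statement like "the top 3 variants cover 80% of all cases" can be derived, here is a short Python sketch. The function name `variant_coverage` and the toy traces are assumptions; the numbers below are not the customer's data.

```python
from collections import Counter


def variant_coverage(traces, top_n=3):
    """Return (number of variants, share of cases covered by the top_n)."""
    # Two cases belong to the same variant if their activity sequences match.
    variants = Counter(tuple(trace) for trace in traces)
    covered = sum(count for _, count in variants.most_common(top_n))
    return len(variants), covered / len(traces)


# Hypothetical log with 10 cases and 5 variants.
traces = (
    [["A", "B"]] * 5
    + [["A", "C", "B"]] * 2
    + [["A", "B", "B"], ["A", "D"], ["A"]]
)
print(variant_coverage(traces))  # (5, 0.8)
```

The pattern is the same as in the HR process: a few variants account for most cases, followed by a long tail of rare ones.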

Based on our ‘As-Is’ process analysis, the company used their domain knowledge to identify suitable improvements. The results gave the process owner a solid, data-based foundation to understand the current process reality before making any improvements in the new system.



How Process Mining Compares to Data Mining

You may remember that in my last post I sketched the differences between process mining and business intelligence. Another way to position process mining is to compare it to data mining. There are lots of data mining tools that are used to support business decisions in specific areas (for example: which products should be placed together in the supermarket, or: where you should send your marketing flyer), but they do not work well for processes.

At the same time, organizations spend lots of money on modeling processes. Because the process modeling is done manually, these models quickly become outdated and out of touch with reality, and so they often end up as dead piles of paper that have no value.

In my opinion, process mining technology combines the strengths of both data mining and process modeling: By automatically creating process models based on existing IT log data, process mining yields live models that are connected to the business and can be updated easily at any point in time.

Huge amounts of data

Process mining has more in common with data mining than just the “mining” part: Just like data mining, process mining takes on the challenge to process large volumes of data that simply cannot be evaluated by hand anymore.

Enterprise IT systems collect more and more data about the business processes they support. These data usually reflect very closely what happened in “the real world” and can be a great source of insight for understanding and improving the business.

Process perspective

Unlike data mining, process mining focuses on the process perspective: It includes the temporal aspect and looks at a single process execution as a sequence of activities that have been performed.

Most data mining techniques extract abstract patterns in the form of, for example, rules or decision trees. In contrast, process mining creates complete process models, and then uses them to precisely highlight where the bottlenecks are.

Also exceptions are important

In data mining, generalization is very important to avoid what is called “overfitting the data”. This means that one wants to strip away all the examples that do not match the general rule.

In process mining, generalization is also necessary to deal with complex processes and understand the main process flows. However, understanding the exceptions is often important to discover inefficiencies and points of improvement.

Focus on discovery

In data mining, models are often trained to make predictions about future similar instances in the same space. Quite a few data mining and machine learning methods operate as a “black box” that spills out predictions without the possibility to trace back the “why”.

Because today’s business processes are so complex, accurate predictions are often unrealistic. The gained knowledge and deeper insights from the discovered patterns and processes help to deal with the complexity, which is where the true value is.

So, while process mining and data mining have a lot in common, there are also fundamental differences in what they do, and where they can be useful. Is there anything that I missed? Let me know in the comments.



Nitro 2.0

I am happy and proud to announce the release of Nitro 2.0.

It has been almost two months since our last update to Nitro 1.2.9, which had fixed almost all of the problems our users were experiencing, and delivered best-in-class performance for event log conversion. With Nitro 2.0 we are introducing a set of new features which will make working with log files more efficient and useful. And, as if that was not enough, we could dramatically improve Nitro’s performance and fix even more bugs.

You can download Nitro 2.0 for Windows and Mac OS X here. And while that download is running, you can read about what has changed in Nitro 2.0 below.

Redesigned Analysis UI

A lot of people have told us that, while they love Nitro and enjoy the ease of use in configuring their CSV or Excel files for conversion, they found the Analysis view to be somewhat lacking. We designed Nitro as a tool for getting your data from CSV or Excel into ProM1 as fast as possible. And, for that purpose, providing a lot of information about the log data itself did not seem to be a priority.

But it’s not like we didn’t see a problem with that, and we certainly listen to your feedback2. We have been working hard on providing a more useful analysis view for quite some iterations. But each of the solutions we came up with so far was just not good enough, and we had to discard it. But now we think we have nailed the problem, and we are proud to present you a completely reimagined and redesigned log analysis view in Nitro 2.0!

Once you have converted your log data, you enter the Overview screen, which shows you an executive summary of your data. The chart in the center of the screen shows the density of events over time, but you can easily switch it to show the density of cases, or to show the distribution of case duration and the number of events per case. Nitro’s charts are also no longer static, and you can drill into the information a little deeper by hovering over them with your mouse pointer.

For frequency distribution charts, Nitro is the first process mining tool to now also provide Pareto Charts as an alternative to plain histograms. Pareto charts are, effectively, histograms that additionally feature a curve for the cumulative percentage, which makes it easier for you to see what subset of, e.g., cases are responsible for most of the time spent in your process.
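
In case you have not worked with Pareto charts before, this small Python sketch shows the numbers behind one: the values sorted by descending frequency, each with its cumulative percentage. It illustrates the concept only and is not Nitro's charting code. The function name `pareto_table` and the frequencies are made up.

```python
def pareto_table(frequencies):
    """Return [(value, count, cumulative_percent)] sorted by descending count."""
    total = sum(frequencies.values())
    rows, running = [], 0
    for value, count in sorted(frequencies.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        # The cumulative percentage is what the curve in the chart plots.
        rows.append((value, count, round(100 * running / total, 1)))
    return rows


# Hypothetical activity frequencies in a call center log.
frequencies = {
    "Inbound Call": 50,
    "Email Outbound": 30,
    "Call Outbound": 15,
    "Handle Case": 5,
}
for value, count, cumulative in pareto_table(frequencies):
    print(f"{value:15} {count:4} {cumulative:6}%")
```

Reading down the last column, you can see right away that the two most frequent activities already account for 80% of all events.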

In a table at the bottom Nitro now shows you high-level information for each case in your log, including its start and end time, duration, and number of events. If you want to quickly find the longest-running case, or the one with the most events, you can easily discover those since we have included an in-line histogram of those metrics in the table.

On the left of the analysis view, you can navigate between the overview and more in-depth analysis views for activities, resources, and each attribute in your event log. The frequency distribution of the attribute’s values in the log is displayed as a Pareto chart or a regular histogram. At the bottom of the screen, a table explicitly lists all values for that attribute with their respective frequencies. As with the case overview table, we have also integrated an inline histogram which allows you to quickly spot the frequency distribution right in this table.

We have performed lots of process mining analysis projects with a number of tools, and getting a thorough understanding of the structure of your data is an essential task before you dive into in-depth analysis such as mining the control-flow graph. Back in the day, I designed the log dialog for ProM 5[3]. I still like that solution, and Nitro’s redesigned analysis view is nowhere near as comprehensive as ProM’s log summary. But I feel that Nitro’s new analysis view does a much better job at providing that essential, immediate feedback on your raw data.

We are of course very curious about what you think of our new analysis view! Please don’t hesitate to let us know in the comments, or drop us a mail — we are indeed listening!

Performance

A beautiful and useful result view is certainly a good thing to have, and we at Fluxicon always try to excel when it comes to usability and UI design. At the same time, we are also genuine speed freaks[4], and if there is one thing we really hate then it is waiting for the computer to do its job. In fact, we think that the faster you get your job done and can shut down Nitro, turning to analysis, the better we have done our job.

The 1.2.9 version of Nitro already provided best-in-class performance, in that no other tool allowed you to convert your data faster. Well, that was just not fast enough for us and we are proud to announce that, with Nitro 2.0, you will generally convert your logs more than twice as fast! In addition to that, the memory footprint of Nitro 2.0 is at least 60% smaller than that of 1.2.9. It’s now truly about time to attack those monster logs!

In the above chart, you can see the runtime performance for parsing a CSV file and for exporting that data to compressed MXML, compared between Nitro 1.2.9 and Nitro 2.0. When parsing, Nitro 2.0 is 1.85 x faster than 1.2.9, and for export the speedup is even 2.54 x.

I have worked on the problem of storing and managing event logs for more than five years now, resulting first in ProMimport, then the redesigned log layer of ProM 5.x, and finally the OpenXES library, which powers log storage and management for ProM 6. Step by step, I had addressed a number of performance bottlenecks, and the design implemented in OpenXES is pretty fast. However, OpenXES needs to provide a generic solution, since it needs to support mining and analysis algorithms with very diverse requirements.

Nitro 2.0 is powered by Octane, which is our completely re-engineered log storage and management layer. The design of Octane approaches the problem of managing logs from a completely different angle than previous solutions, which enables it to provide that outstanding performance. One of the reasons for that performance boost is that Octane takes full advantage of the specific set of requirements present in an application like Nitro.

The above chart shows the runtime performance of parsing a compressed MXML file for Nitro 2.0 (powered by Octane) compared to OpenXES (as used in, e.g., ProM 6), and you can see why we love Octane: Nitro 2.0 is more than 8.3 x faster than OpenXES[5].

For reasons that should be obvious, I cannot go into much detail of how Octane manages to speed past any competing solution we are aware of. Anyway, storing and managing event logs is quite a boring subject, since it is pure engineering: all the magic usually happens in analysis and mining algorithms. However, every major leap in that analysis and mining space has been enabled by introducing higher-performance log layers.

So now, more than ever: If you like your log conversion easy and fast, Nitro is your friend.

Full log format support

Starting with version 2.0, Nitro supports reading and writing every event log format used in practice. That means, in addition to reading event log data from CSV and Excel files, you can now also directly load MXML and XES files into Nitro. If you want to convert your old MXML logs to XES, or if you want to use Nitro’s new analysis view with your existing MXML or XES logs, no problem.

Also, most commercial process mining software unfortunately does not support the XES standard yet. And while we hope that this will change sooner rather than later, it makes no sense to wait, and we at Fluxicon like pragmatism. This is why Nitro now also supports writing logs to CSV, enabling you to analyze your XES and MXML logs with software that can only handle CSV data.
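To give a rough idea of what flattening an XES log into CSV involves, here is a small Python sketch using only the standard library. It is not Nitro's exporter: it reads only the `concept:name` and `time:timestamp` attributes, ignores nested attributes and extensions, and the sample log and the names `xes_to_rows` and `rows_to_csv` are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

def local(tag):
    """Strip an XML namespace prefix like '{http://...}trace' down to 'trace'."""
    return tag.rsplit("}", 1)[-1]

def xes_to_rows(xes_text):
    """Flatten an XES document into (case, activity, timestamp) rows."""
    root = ET.fromstring(xes_text)
    rows = []
    for trace in root:
        if local(trace.tag) != "trace":
            continue
        # The trace-level concept:name attribute identifies the case.
        case_id = None
        for child in trace:
            if local(child.tag) == "string" and child.get("key") == "concept:name":
                case_id = child.get("value")
        for event in trace:
            if local(event.tag) != "event":
                continue
            attrs = {a.get("key"): a.get("value") for a in event}
            rows.append((case_id, attrs.get("concept:name"), attrs.get("time:timestamp")))
    return rows

def rows_to_csv(rows):
    """Write the rows as CSV text with one line per event."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["case", "activity", "timestamp"])
    writer.writerows(rows)
    return buffer.getvalue()

# Hand-written, heavily simplified sample log (assumption, not real data)
SAMPLE = """<log xmlns="http://www.xes-standard.org/">
  <trace>
    <string key="concept:name" value="case-1"/>
    <event>
      <string key="concept:name" value="Register"/>
      <date key="time:timestamp" value="2011-01-03T09:00:00"/>
    </event>
    <event>
      <string key="concept:name" value="Pay"/>
      <date key="time:timestamp" value="2011-01-05T14:00:00"/>
    </event>
  </trace>
</log>"""

print(rows_to_csv(xes_to_rows(SAMPLE)))
```

The result has one row per event, with the case ID repeated on every row, which is the shape that CSV-only tools typically expect.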

Bug fixes

All bugs that users have reported with version 1.2.9 are now fixed in Nitro 2.0. Also, we have discovered a number of bugs that nobody has seen in use yet, and of course we have fixed them right away.

So, with Nitro 2.0 you will only get brand new bugs… :-) And if you run into them, please do us a favor and tell us about it, either via Nitro’s built-in feedback feature, or by sending us a mail to support@fluxicon.com. We hate bugs with a passion, and we do our best to fix them and update Nitro as soon as we can. Promised.

New pricing

You may have already noticed that we have adjusted our pricing for Nitro. Our previous pricing structure was aimed at power users in the enterprise space, and for some people who are professionals, but not enterprise users, that just didn’t sit right. We agree with you, and our new pricing structure includes a new plan for people like freelance consultants, who don’t need that same level of support, and whose requirements are generally more modest.

That being said, the new pricing structure looks as follows:

If you can’t find yourself in the above plans, please get in touch. We know that everybody’s requirements are different, and we will try our best to find a solution that works for both of us. But we don’t like to force you into the typical enterprise sale where, in order to get a quote or even look at the product, the seller forces you into lengthy discussions with their sales staff and consultants. So, if you like it simple, use our standard plans above. If your requirements are different, please get in touch with us at support@fluxicon.com!

Also, if you would like to use Nitro in education or for your academic research, you can get a Nitro community ticket free of charge. We also support selected non-commercial use and projects we like with free community tickets. Please get in touch with us at support@fluxicon.com, and we will get you going.


  1. Or, any other tool that can read MXML or XES logs, of course. 
  2. Please keep sending your suggestions to support@fluxicon.com, or simply send them directly from Nitro! 
  3. The default log visualization of ProM 6 is also closely based on that initial design. 
  4. Not in the “snorting amphetamines until the early morning” sense, mind you… 
  5. Just to be clear here, I am not saying that OpenXES’s performance is bad in any way. That would be kind of stupid since I wrote most parts of it myself, and not that long ago. But it shows that, when you don’t have to support a generic use case, and the many features needed for academic research, you can drastically improve performance. Which is what Octane obviously does. 


How Process Mining Compares to BI

You may have wondered what exactly the difference is between Process Mining and Business Intelligence (BI). I get this question all the time. Is Process Mining just old wine in new skins, or even about to replace the “old-fashioned” BI? Here is my take on the topic.


Source: ETL-Tools.Info

I find the above picture a little ugly yet informative in illustrating the different ingredients that usually play together in Business Intelligence technology.

First of all, the extraction of data plays a huge role. According to D. J. Power[1],

The term BI is a popularized, umbrella term coined and promoted by Howard Dresner of the Gartner Group in 1989. It describes a set of concepts and methods to improve business decision making by using fact-based support systems.

Second, to be able to support business decisions, the data from different source systems need to be consolidated. They are usually stored in some data warehouse, from which reports can be generated or queries can be answered (e.g., using OLAP).

How does Process Mining compare?

Process mining fits in on the analytics side of the whole BI landscape (so, on the right side of the picture above). It has no particular methods to offer that help with the extraction or management of data. Since the consolidation of different data sources is also crucial for process mining in order to analyze end-to-end processes, existing BI technology could be leveraged here.

In my view, the differentiation of process mining with respect to traditional BI is twofold:

No. 1: The added value of process mining over traditional BI reporting tools lies in the depth of the analysis.

Traditional BI reporting tools focus on the display of Key Performance Indicators (KPIs) for executives in the organization. For example, the cycle times of a customer-facing process may be key in meeting certain service levels that have been agreed.

If the cycle times are out of the acceptable bounds, dashboards can highlight this problem. However, they cannot do much to uncover the root causes for this problem. Process mining can help to provide much deeper insight into the actual processes by uncovering the process flows and bottlenecks based on existing IT logs in a bottom-up manner.

Essentially, BI assumes that the underlying processes are known. Process mining takes the stance that even well-defined processes usually don’t go as planned and need to be brought to light objectively.

No. 2: To be able to apply process mining to data-warehoused logs, certain requirements need to be fulfilled.

Put simply, to be able to apply process mining techniques, one needs more detailed information than to compute pre-defined KPI dashboards. Traditionally, data warehouses contained only aggregated data. For example, one would only store one data point for each process instance’s cycle time. In contrast, process mining requires at least one data point for each activity in the process and must keep track of the different process instances.
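The difference is easy to see in a few lines of Python. With one data point per activity and a case ID, you can reconstruct how cases actually flowed through the process, which a single aggregated cycle time per case can never tell you. The sketch below counts which activity directly follows which; this is just one simple illustration rather than the method of any particular tool, and the sample events and the name `directly_follows` are made up.

```python
from collections import Counter, defaultdict

def directly_follows(events):
    """Count how often one activity is directly followed by another within a case.

    events: iterable of (case_id, activity, timestamp) tuples.
    """
    traces = defaultdict(list)
    # Order the events by case and time to rebuild each case's activity sequence.
    for case_id, activity, _timestamp in sorted(events, key=lambda e: (e[0], e[2])):
        traces[case_id].append(activity)
    pairs = Counter()
    for activities in traces.values():
        pairs.update(zip(activities, activities[1:]))
    return pairs

# Hypothetical event-level data; timestamps are simply hours since process start
events = [
    ("case-1", "Register", 0), ("case-1", "Check", 2), ("case-1", "Pay", 5),
    ("case-2", "Register", 0), ("case-2", "Check", 1),
    ("case-2", "Check", 30), ("case-2", "Pay", 31),
]
for (source, target), count in directly_follows(events).most_common():
    print(f"{source} -> {target}: {count}")
```

An aggregated warehouse would only record that case-2 took 31 hours. The event-level data also shows the repeated "Check" step, which is exactly the kind of rework loop that explains a long cycle time.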

With an increased focus on continuous monitoring and advancements in data management technology, there now exist data warehouses that hold on to all the detailed, “raw” data points that are a prerequisite for process mining. In this case, process mining can be used as a complementary analysis tool on top of the data warehouse. Otherwise, more direct, native data extraction mechanisms need to be employed.

Makes sense? Please join the discussion and let me know what you think!


  1. See: Power, D.J. A Brief History of Decision Support Systems. DSSResources.COM, World Wide Web, http://DSSResources.COM/history/dsshistory.html, version 4.0, March 10, 2007.  

