The BPI Challenge is an annual process mining competition, which takes place for the fourth time this year. The goal of the challenge is to give both researchers and practitioners the opportunity to do process mining analyses on real-life data (read our interview with Boudewijn, where he tells the story of how the BPI Challenge came to life).
In this competition, anonymized but real data is provided and can be analyzed by anyone using any tools. Submissions can be handed in until July 12, 2014 and the winner will be awarded a prestigious prize!
As always, we make our process mining software Disco available for anyone for the purpose of this challenge. Read on to see what this year’s challenge is about and how you can get started.
This year’s data is provided by the Rabobank Group ICT.
Similar to other ICT companies, Rabobank Group ICT has to implement an increasing number of software releases while the time to market is decreasing. Rabobank Group ICT has implemented the ITIL processes and uses the Change process for implementing these so-called planned changes.
The data is provided for the following service desk processes: Interaction Management, Incident Management, Change Management (see below).
As you can see in the illustration, a problem reported to the service desk (for example, a slow internet connection) may evolve from an Interaction (an agent from the service desk troubleshoots the reported issue) to an Incident (the issue cannot be solved on the phone but someone has to look into it) up to a Change request (repeated problems of the same kind lead to a structural change that should prevent issues in the future).
The following detailed information is provided about each stage.
In order to manage calls or mails from customers (Rabobank colleagues) to the Service Desk concerning disruptions of ICT-services, a Service Desk Agent (SDA) logs calls/mails in an Interaction-record and relates it to an Affected Configuration Item (CI). The SDA can either resolve the issue for the customer directly (First Call Resolution) or create an Incident-record to assign the issue to an Assignment Group with more technical knowledge to resolve the service disruption.
If similar calls/mails are received by the Service Desk, a SDA can decide to relate multiple Interaction-records to one Incident-record. Further logging of Activities to resolve the service disruption will be done in the Incident-record.
Based on the Impact and Urgency estimated by the SDA, an Incident-record is prioritized and gets a deadline for resolving the service disruption. A team leader within the Assignment Group assigns the record to an Operator. The Operator resolves the issue for the customer, or reassigns the record to a colleague if other or more knowledge is needed. After solving the issue for the customer, the Operator relates the Incident-record to the Configuration Item (CausedBy CI) that caused the service disruption. After closing the Incident-record, the customer receives an email informing them that the issue is resolved.
If particular service disruptions reoccur more often than usual, a Problem investigation is started, which leads to an analysis and an improvement plan to prevent the service disruption from happening again. The improvement plan leads to a Request for Change (RfC) on the CausedBy CI. All CIs are related to a Service Component; a Risk Impact Analysis is done by an Implementation Manager assigned to changes related to the specific Service Component.
The Data Set
As in any process mining analysis, the data needs to be linked to a Case ID, Activity, and Timestamp.
The data that you will analyze in this BPI Challenge stems from the IT Service Management (ITSM) software that is used in the service desk to handle the processes described above. Activities and timestamps are recorded within the ITSM system for the processed interactions, incidents, and changes.
An additional difficulty this year is that the data is provided in four pieces for the different processes. For the process mining analysis, the data needs to be combined and this can be done in different ways. It is part of the analysis to understand and prepare the data according to the questions and goals of the analysis.
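To make this combination step concrete, here is a minimal Python sketch that normalizes two made-up CSV extracts to the common case/activity/timestamp format and merges them into one event log. The column names (`IncidentID`, `DateStamp`, and so on) are placeholders, not the actual field names from the challenge files.

```python
import csv
import io

# Hypothetical sample extracts; the real BPIC'14 files use different
# column names, so everything below is an illustrative assumption.
incidents_csv = io.StringIO(
    "IncidentID;Activity;DateStamp\n"
    "IM0001;Open;2013-10-01 08:00:00\n"
    "IM0001;Closed;2013-10-02 09:30:00\n"
)
activities_csv = io.StringIO(
    "IncidentID;IncidentActivity_Type;DateStamp\n"
    "IM0001;Assignment;2013-10-01 08:05:00\n"
)

def read_events(f, case_col, activity_col, time_col):
    """Normalize one CSV extract to (case, activity, timestamp) triples."""
    return [
        (row[case_col], row[activity_col], row[time_col])
        for row in csv.DictReader(f, delimiter=";")
    ]

# Combine the extracts into a single event log keyed on the incident number.
log = read_events(incidents_csv, "IncidentID", "Activity", "DateStamp")
log += read_events(activities_csv, "IncidentID", "IncidentActivity_Type", "DateStamp")
log.sort(key=lambda e: (e[0], e[2]))  # order by case, then timestamp

for case, activity, ts in log:
    print(case, activity, ts)
```

The same idea carries over to joining the Interaction, Incident, and Change extracts on their related record numbers; which key you join on depends on the question you want to answer.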
If you click on the picture below, you can see the data fields that are contained in the four files:
We have imported these four files and already created two additional views for you in a Disco project file that you can simply open with the freely available demo version of Disco.
You can download both the Disco project file and the raw data and data model explanation here:
Download the raw data files in a Zip file (CSV files and explanation reference about the data model)
Download the Disco project file that can be opened with the freely available demo version of Disco
We think that many of you will want to create additional views by combining or importing the data in different ways. The two views that we created (integrated incidents and a more detailed view on the change process) are just an example and a starting point.
If you have created another combination of the data that you want to analyze in Disco as well, you can send the file to firstname.lastname@example.org and we will add your view to the project file. You will be mentioned as the creator of the new data view and updates to the project file will be made available here on this blog post for the community.
New This Year: Student Challenge
Students all over the world are starting to learn about process mining. However, learning about it in theory and applying it in practice are two quite different things.
To give students the possibility to develop their process mining skills, this year’s BPI Challenge, for the first time, includes a separate student competition. Student groups are invited to participate in the challenge, and their submissions will be evaluated separately from the regular submissions.
Because students normally do not have any experience with process analysis and improvement work at companies, we decided to pair them with a mentor who is a practitioner and can guide them and be available for questions. This way, the student teams will learn more and deliver better analyses. More than a dozen practitioners in Europe, Scandinavia, the US, and South America have already volunteered to mentor a student team.
To be matched with a mentor, students can email email@example.com by 31 May 2014.
Update: Please apply to the mentorship program through the form at the BPIC 2014 Student Challenge website.
Do you want to be a mentor as well? This in no way hinders your own participation in this year’s BPI challenge (the student challenge is completely separate). Let us know and we will try to match you with a student team in your geographical region.
Now, there is one more bonus attached to the student competition: Two extra prizes are available for the winners.
Questions About the Process
One of the challenges of a process mining project is that you need a starting point to understand the process context and what business questions and goals are relevant for the analysis. Otherwise it is really easy to get lost in the data.
This is why the data in the BPI Challenge is not just dropped over the fence, but the data providers are encouraged to provide questions that can be the starting point for the people who are participating in the challenge.
Here is what the Rabobank ideally would like to know about the data:
Rabobank Group ICT is looking for fact-based insight into sub-questions concerning the impact of past changes, in order to predict the workload at the Service Desk and/or IT Operations after future changes.
The challenge is to design a (draft) predictive model that can be implemented in a BI environment. The purpose of this predictive model is to support Business Change Management in implementing software releases with less impact on the Service Desk and/or IT Operations.
We have prepared several case-files with anonymous information from Rabobank Netherlands Group ICT for this challenge. The files contain record details from an ITIL Service Management tool called HP Service Manager. We provide you with extracts in CSV with the Interaction-, Incident- or Change-number as case ID. Next to these case-files, we provide you with an Activity-log, related to the Incident-cases. There is also a document detailing the data in the CSV file and providing background to the Service Management tool.
- Identification of Impact-patterns: We expect there to be a correlation between the implementation of a change and the workload in the Service Desk (SD) and/or IT Operations (ITO), i.e. increased/decreased volume of Closed Interactions and/or increased/decreased volume of Closed Incidents. Rabobank Group ICT is interested in identifying any patterns that may be visible in the log for various service components to which a configuration item is related, in order to predict the workload at the SD and/or ITO after future changes.
- Parameters for every Impact-pattern: In order to be able to use the results of prior changes to predict the workload for the Service Desk directly after the implementation of future changes, we are interested in the following parameters for every impact-pattern investigated in sub question 1:
- What is the average period to return to a steady state?
- What is the average increase/decrease of Closed Interactions once a new steady state is reached?
- Change in Average Steps to Resolution: Since project managers are expected to deliver the same or better service levels after each change implementation, Rabobank Group ICT is looking for confirmation that this challenge is indeed being met for all or many Service Components.
- Creativity challenge: Finally, we challenge the creative minds, to surprise Rabobank Group ICT with new insights on the provided data to help change implementation teams to continuously improve their Standard Operation Procedures.
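To make the idea of an impact pattern and its parameters concrete, here is a small Python sketch that estimates the two parameters from sub-question 2 (time to return to a steady state, and the volume change once it is reached) from a daily series of closed interactions around a change date. The numbers are invented for illustration, not taken from the challenge data, and the 10% tolerance band is an arbitrary choice.

```python
from statistics import mean

# Invented daily counts of closed interactions around a change implemented
# on day 10; the real series would come from the Interaction extract.
counts = [20, 22, 21, 19, 20, 21, 20, 22, 19, 20,   # before the change
          35, 31, 28, 26, 25, 26, 25, 26, 25, 26]   # after the change
change_day = 10

baseline = mean(counts[:change_day])       # steady state before the change
new_steady = mean(counts[-5:])             # steady state well after it
tolerance = 0.10 * new_steady              # 10% band counts as "steady"

# Days after the change until the volume first falls inside the band.
days_to_steady = next(
    i for i, c in enumerate(counts[change_day:])
    if abs(c - new_steady) <= tolerance
)

print("baseline:", baseline)
print("new steady state:", new_steady)
print("days until steady state:", days_to_steady)
print("volume change:", new_steady - baseline)
```

A real model would of course have to separate the effect of one change from overlapping changes and seasonal effects, which is exactly what makes this sub-question challenging.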
What can you do if these questions are not interesting or not feasible for you? After all, you may need quite some data mining skills to fully address the questions above.
Don’t worry. As explained above, the questions from the process owner are intended to provide you with a starting point. The BPI Challenge gives you the chance to practice your process mining skills on real data, and there are many ways to do this. Think of a process question that would be relevant for you or for your clients, or – if you are a researcher – ask what insights your fantastic new algorithm would add in this situation.
What is important is that you clearly state the questions and the reasoning behind your analysis. Motivate why the question is relevant and describe how you approach the analysis in sufficient detail, so that others can understand what you did and why.
How to Submit
You can submit your challenge contribution through the EasyChair system at https://www.easychair.org/conferences/?conf=bpic2014.
A submission should contain a pdf report of at most 30 pages, including figures, using the LNCS/LNBIP format specified by Springer (available both as a Word and as LaTeX template). Appendices may be included, but should only support the main text.
Submission deadline: July 12, 2014, 23:59 CET
Announcement of winners: at the 10th Workshop on Business Process Intelligence (BPI 14), Haifa, Israel, 8th September 2014
Join us for a webinar, where we have invited both the challenge organizers and a process expert from the Rabobank. This is your chance to get all your questions about the challenge and about the data set answered.
The tentative date for the webinar is 15 May at 17:00 CET. Sign up now to make sure you don’t miss it!
This is a guest post by Vladimir Rubin. Vladimir shares his experience from applying process mining to software processes for a tourism company.
If you have a process mining case study that you would like to share as well, please contact us at firstname.lastname@example.org.
Why Software Process Mining?
Building flexible, adaptive software systems is becoming more and more important, because businesses need to be able to change rapidly. Especially agile methods and processes are becoming extremely popular, since they naturally deal with business change by decreasing the length of iteration lifecycles and getting quicker responses from the end-users. Additionally, concepts such as continuous integration and delivery support the dynamic rollout of software to customers and enable short user feedback loops.
By using these agile approaches, the end user becomes a part of the software development life cycle. Their experience and their way of working with the software become accessible and essential for subsequent iterations of software development. This is the point where process mining comes into play.
We have successfully applied process mining, which is normally used for the analysis of traditional business processes, to the area of software development. Both the user interaction and the system’s internal behavior can be analyzed with the help of process mining. The results of this analysis can significantly influence the architecture, design, testing, and development of the software system.
In this blog article, we discuss two main use cases:
- The interaction of the end-user (or of a Beta-Tester) with the software system can be logged and, therefore, analyzed with the help of mining tools. Then, the analysis results are given to the business analysts, testers, architects, and developers in order to improve the usability, reliability, efficiency, and other properties of the software system.
- The sequence of service calls (calls of interfaces between components) is usually traced in order to provide developers with information about system behavior and failures. This information can be imported into the process mining tool, which helps derive a view of the software processes from a technical perspective by analyzing the performance and frequency of calls.
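As a rough illustration of the second use case, the sketch below converts a few trace lines into the case/activity/timestamp CSV format that process mining tools expect. The trace format and the regular expression are assumptions made up for this example, since tracing output differs per framework.

```python
import csv
import io
import re

# Hypothetical trace format: "<timestamp> <session-id> CALL <service>".
trace = """\
2014-03-01T10:00:00 S1 CALL searchOffers
2014-03-01T10:00:02 S1 CALL getAvailability
2014-03-01T10:00:05 S2 CALL searchOffers
"""

line_re = re.compile(r"(\S+) (\S+) CALL (\S+)")

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["case_id", "activity", "timestamp"])
for line in trace.splitlines():
    ts, session, service = line_re.match(line).groups()
    # The session id becomes the case, the called service the activity.
    writer.writerow([session, service, ts])

print(out.getvalue())
```

The resulting CSV can be imported directly into a process mining tool by mapping the three columns to case ID, activity, and timestamp.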
Both use cases were inspired by concrete requirements from a large European enterprise tourism project:
- The team wanted to analyze the productive behavior of the users in order to see system failures and bottlenecks, and to gather statistics.
- Several critical performance challenges appeared with an increasing number of users; these had to be identified and solved.
To address these problems, we recorded the user logs and the system traces. Then, we imported them into the process mining tool Disco.
Here is a short overview of the results. The data has been anonymized to protect the confidentiality of the client.
Case 1: User Activity Analysis
In Figure 1 we show the positive behavior of the users – the cases that were successfully completed in the production system. This makes it convenient to track the production state and to identify the frequency of the paths that users take through the system.
Figure 1: User positive behavior (Frequency View)
In Figure 2 we show the performance view of the negative behavior, i.e. the cases containing failures, and the time wasted.
Figure 2: User negative behavior (Performance View)
In Figure 3 the cases are clustered per variant and the typical behavior is shown. It is helpful for analyzing the individual user behavior patterns and the variety of the business processes.
Figure 3: Variety of Cases (Types of Behavior)
Case 2: System Performance Analysis
For the second case, we have taken the trace of system calls in order to analyze the system behavior. We could identify the most frequent service calls, the distribution of calls, and also the loops, as you can see in Figure 4.
Figure 4: Frequency Analysis
Moreover, we could also see the detailed statistics of calls and, thus, the most critical services from the performance point of view, as shown in Figure 5.
Figure 5: Frequency Statistics
After switching to the performance view of Disco and looking at the total time statistics, we could effectively identify the most time-consuming calls in the system. Identifying these delays and increasing the performance had a high priority for the developer team, because a slow service would cause users to abandon the website and potentially leave for a competitor.
Figure 6: Performance Analysis
In this article, we have shown two successful applications of process mining in a concrete enterprise software project.
From our point of view, this is a very fruitful application domain, because productive software systems provide a large amount of data in the form of logs and traces. This data can and should be analyzed in order to improve software quality.
Last week, we published a new article about The Added Value of Process Mining at the respected BPM analyst platform BPTrends. You can find a short abstract and a link to the original article below.
Process mining, just like data mining, is a generic technology and can be applied in many different ways. This is an advantage but at the same time it makes it difficult for you to understand what exactly the added value would be for your situation. Should you be interested in process mining and learn more about it? Which kinds of processes can be analyzed with process mining? What benefits would it bring?
In this article, we give you a framework for the most common process mining use cases, so that you can see where you fit in.
Read the full article at the BPTrends website…
What do you think about the discussed use cases? Which are the ones you find most important? Which ones have we missed? Let us know in the comments.
Process mining can not only be used to analyze internal business processes, but also to understand how customers are experiencing the interaction with a company, and how they are using their products.
“Process Mining and Customer Journeys” was the topic of the first event of the new Special Interest Group (SIG) for Process Mining in the Dutch industry association Ngi-NGN. Fluxicon is on the board of this Ngi SIG group and was co-organizing the event, which took place yesterday on 25 March 2014 in Utrecht, at the Rabobank.
More than 50 people had signed up for the event and it went great. Below is a quick summary for everyone who could not be there.
Introduction Customer Journey
Jaap Rigter from VisionWaves first introduced the topic of customer journeys. He illustrated how customers interact with a company through multiple channels, and how understanding the customer experience across these different channels is critical in understanding the customer and improving her experience.
Introduction Process Mining
I then introduced process mining using the metaphor of sailing boat journeys from 150 years ago. For the people who were already familiar with process mining, I had brought along the first two applications of process mining to customer journeys, which are probably not what you might think (take a look at the slides to find out).
The centerpiece of the first part of the evening was the case study presentation by Ellen van Molle and Bram Vanschoenwinkel from AE. They presented the results from a process mining analysis at a temporary staffing (interim) company, where employers are matched with employees.
By understanding how potential employees were using the job search application, they could highlight the process areas where people dropped out. Furthermore, by enhancing the data in a second iteration, they were able to check business hypotheses such as “mostly elderly people have problems with the navigation in the system”.
The second part of the evening was an open discussion in small groups. As a starting point questions such as “What are the challenges of process mining for customer journeys?” and “What is the added value of process mining for customer journeys?” were provided to the groups. Afterwards, the results from the discussion were summarized.
Here are some of the discussion points I remember:
- One challenge is that the data need to be coupled across multiple channels / systems to get an integrated picture.
- Another challenge is that next to the analyst and the business one actually needs to involve the customer herself to understand the underlying root causes and motivations.
- While the rules for analyzing business processes are mostly well-defined, analyzing customer data is much more sensitive and privacy concerns play an important role.
- One potential benefit that was discussed is saving helpdesk costs by better adjusting the websites, so that customers can find what they need without calling.
- Another benefit mentioned was that by improving the customer experience, businesses can expect more revenue from their happy customers and more recommendations from them.
It was a well-attended and very lively event. Thank you all for coming! You can download all slides and more photos of the event at the Ngi-NGN event site here.
Photo Credit: 96dpi via Compfight cc
This is a guest post by Walter Vanherle from Crossroad Consulting in Belgium. Walter shares his experience from applying process mining to an operational process from a security provider.
If you have a process mining case study that you would like to share as well, please contact us at email@example.com.
Security Services companies are caught between the rising costs of operations and the downward price pressure due to direct and indirect competition. Further improvements in operational excellence together with service innovation are key in addressing these challenges.
Service delivery is always managed via agreements in the form of contractual obligations based on target performance. Not reaching pre-set targets has immediate financial implications. The service provider, therefore, actively manages these agreements in order to deliver the services efficiently, with costs/penalties managed in relation to the individual client expectation and priorities between clients.
The goal of the process mining project was to measure the performance of such a security services process and to create a reference base of Key Performance Indicators (KPIs).
The Intervention Management Process
Imagine a bank that is a customer of a security services company. Someone breaks a window and the security alarm goes off at the site of the security service provider: an intervention process is started.
The intervention management process has two stages (see also the picture below). The first stage starts with a client intervention service request (T0). Then, the dispatching unit covering the confirmation activates the service request (T1), identifies an available agent (T2), and the agent confirms the acceptance of the mission (T3).
The second stage is the intervention itself. The execution of the intervention consists of four subsequent steps: effective departure to the location of the intervention (T4), arrival at the location and start of the observations (T5), end of the observations and documentation of the intervention (T6), and end of mission (T7).
There are four KPIs that are relevant for this process. The most important one is the time from the initial client request to the arrival on site (T0-T5). Also important are the time from the client request to the confirmation (T0-T1), the time from the agent’s confirmation to the arrival on site (T3-T5), and the total time from the initial client request to the end of mission (T0-T7).
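To illustrate, here is a minimal Python sketch that computes these four KPIs for a single mission from its T0..T7 timestamps. The timestamps are made up for the example; the real analysis was done on the exported durations (see below).

```python
from datetime import datetime

# One invented mission with the eight registered timestamps T0..T7.
raw = {
    "T0": "2012-05-01 02:00:00", "T1": "2012-05-01 02:01:30",
    "T2": "2012-05-01 02:03:00", "T3": "2012-05-01 02:05:00",
    "T4": "2012-05-01 02:06:00", "T5": "2012-05-01 02:21:00",
    "T6": "2012-05-01 02:35:00", "T7": "2012-05-01 02:40:00",
}
ts = {k: datetime.strptime(v, "%Y-%m-%d %H:%M:%S") for k, v in raw.items()}

def minutes(a, b):
    """Duration between two registered steps, in minutes."""
    return (ts[b] - ts[a]).total_seconds() / 60

# The four KPIs described in the text.
kpis = {
    "T0-T5 (request to arrival)":        minutes("T0", "T5"),
    "T0-T1 (request to confirmation)":   minutes("T0", "T1"),
    "T3-T5 (acceptance to arrival)":     minutes("T3", "T5"),
    "T0-T7 (request to end of mission)": minutes("T0", "T7"),
}
for name, value in kpis.items():
    print(f"{name}: {value:.1f} min")
```

Computing these per case and then aggregating over all missions gives the KPI reference base the project was after.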
The service process execution is registered by special service management software for security service providers by Risk Matrix Resultants. The anonymized event log held data over a period of two years, containing all interventions for all clients. The dataset contained over 50,000 cases (missions) and 400,000 events.
The analysis below is based on the data from the missions for one client of the security service company over the timeframe of one year.
Process Mining Results
The expectation was that about 70% of the cases should follow the Straight Through Process (STP) flow with the 7 steps T0-T7 as explained above. Furthermore, the following four additional process variants were expected for the remaining 30% of the cases:
- T0-T1 (request is not confirmed)
- T0-T3 (solved, no intervention is needed)
- T0-T4 (aborted in the recording)
- T0-T7 but without T5 (no intervention is needed by accountable)
But what does the process look like in reality? Using the process mining software Disco, the real process flows could be discovered based on the data.
The process map below has been filtered to show the discovered process only for the five expected variants. What stands out is that, compared to the standard STP variant, which is followed by 1518 cases, the four additional variants are almost never taken: they only cover 31, 13, 2, and 20 cases, respectively.
The problem is that – unlike assumed – the process does not follow just these five expected paths. The STP variant covers 78% of the cases (actually slightly more than expected), but the five expected variants together only make up about 82% in total. So the question is: what is happening in the other 18% of the cases?
If we look at the full process, which has 58 variants (more than ten times as many as expected), then we get the following process map. The STP path is still visible, but there is a lot more variation. So, the question is: what are these other variants and why are they there?
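This kind of coverage arithmetic can be reproduced with a few lines of Python. The variant counts below are invented to mirror the reported shares, not taken from the actual log.

```python
from collections import Counter

# Invented variant counts; the real log has 58 variants.
# STP is the full sequence T0..T7.
stp = ("T0", "T1", "T2", "T3", "T4", "T5", "T6", "T7")
no_t5 = tuple(s for s in stp if s != "T5")

variant_counts = Counter({
    stp: 78,                          # the STP flow
    ("T0", "T1"): 2,                  # request not confirmed
    ("T0", "T1", "T2", "T3"): 1,      # solved, no intervention needed
    no_t5: 1,                         # T0-T7 without T5
    ("T0", "T1", "T2"): 18,           # standing in for all unexpected variants
})
expected = {stp, ("T0", "T1"), ("T0", "T1", "T2", "T3"), no_t5}

total = sum(variant_counts.values())
stp_share = variant_counts[stp] / total
expected_share = sum(n for v, n in variant_counts.items() if v in expected) / total

print(f"STP coverage: {stp_share:.0%}")
print(f"expected variants: {expected_share:.0%}")
print(f"unexpected: {1 - expected_share:.0%}")
```

In practice, a tool like Disco does this variant bookkeeping for you; the sketch just makes the percentages in the text traceable.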
If we look at the unexpected variants, then it turns out that there are two types of root causes:
- actual variation in the process
- data quality problems that affect timestamps
For example, if we look at the expected variant “T0-T7 but without T5”, then we see that in addition to the sequence T0,T1,T2,T3,T4,T6,T7 (which occurred 20 times), there are some additional patterns in the process (see process map below):
- 28 times the process went from T3 straight to T6 without T4 (no departure)
- 21 times the process went from T4 directly to T7 without T6 (no arrival)
- 34 times T3 was directly followed by T7 (no observations at all)
At the same time, there were many variations that were caused by so-called “clock drift”. In this process, many different parties were recording events on different devices (which had different clocks). As a consequence, there were often confused orderings in the process step sequence that were purely caused by such a clock difference (that is, the steps were actually performed in the right order, but due to the different clocks they appeared inverted).
One example case where this happens is shown in the picture below. It seems as if T3 was performed before T2, but actually there is just a 5-second time difference that is caused by the different clocks of the registering parties.
Such data quality problems not only make the process variant analysis difficult, but also pose the risk of distorting your KPI measurements. For example, one of the KPIs was defined as the time from T0 to T1. What happens if T1 has an earlier timestamp than T0 due to clock drift? If you just measure the time between them in Excel, you get a negative duration that reduces the average duration between these steps, which of course is not true.
In the intervention management process, clock drift can occur for the transactions generated by the hand-held devices (PDAs) used by the field service agents, or between the alarm-generating system (T0) and the dispatching/intervention management system (T1). When the system clocks of the devices are not synchronized, the recorded timestamps can shift by seconds or even minutes, influencing the effective SLA timings. Using the case monitoring capabilities of Disco, with process filtering and visualization techniques, we were able to visualize outliers quickly and suggest corrections to the transaction file to compensate for the irregular observations. We suppressed or eliminated the most prominent outliers from the final process mining file for more accurate performance statistics.
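Here is a minimal sketch of the negative-duration problem and one possible mitigation: clamping suspected drift to zero instead of letting it lower the average. The timestamps are invented, and the actual corrections in this project were made via filtering in Disco rather than in code.

```python
from datetime import datetime, timedelta

fmt = "%Y-%m-%d %H:%M:%S"
# Invented T0/T1 pairs; in the second case T1 was recorded *before* T0
# because the alarm system and the dispatching system had different clocks.
pairs = [
    ("2012-05-01 02:00:00", "2012-05-01 02:01:30"),
    ("2012-05-01 03:00:10", "2012-05-01 03:00:05"),  # clock drift: -5 s
]

durations = []
for t0, t1 in pairs:
    d = datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)
    if d < timedelta(0):
        # A small negative duration signals clock drift rather than a real
        # ordering; clamp it to zero so it cannot distort the average.
        print(f"drift suspected ({d.total_seconds():.0f} s), clamping to 0")
        d = timedelta(0)
    durations.append(d)

avg = sum(durations, timedelta(0)) / len(durations)
print("average T0-T1 duration:", avg)
```

Whether clamping, suppressing, or correcting the affected cases is appropriate depends on how the KPI is contractually defined.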
After cleaning the data, the SLA analysis for the KPIs (see above) was performed. We exported the durations from Disco and used a template-based Tableau Software Visualization to produce a cumulative SLA spectrum analysis. You can see such an SLA spectrum analysis for the time from T0-T5 for the year 2012 below.
SLA spectrum analysis for a partial data set
The KPIs T0-T5 and T0-T1 are particularly important because they are linked with financial compensation. For cost optimization and predictive analytics, the process sequences T3-T5 and T0-T7 were analyzed. We also filtered out groups of clients with similar or different execution patterns based on their type of service contract.
Furthermore, the following process analyses were performed:
- priority accounts treatment,
- work handover patterns (preferential treatment of agents),
- correct treatment of the intervention priority classes.
Benefits and Lessons Learned
The registration process is both machine- and people-driven. Our experience shows that service tracking is subject to involuntary and voluntary errors and requires ongoing, critical management attention. However, after overcoming these data quality challenges, we could generate many important benefits for the security services company:
- Insight into the process variants helped to focus the communication with the operations teams on more accurate recording of the activities.
- Both conformance and performance analysis showed immediate money on the table (value leakage).
- The provided insight is instrumental input for business strategy and tactics corrections such as adaptations in client segmentation (priority services) and the possibility for more granular time based SLA service pricing.
- More accurate information for better planning, and a recommendation for geolocation-based research: the process step sequence T3-T5 is the critical path in reaching the target SLAs.
- More and better information in preparation and planning for client acquisition tactics. The analyses are used in pre-sales and sales campaigns.
If you want to know more about this case study, you can get in touch directly with Walter Vanherle at firstname.lastname@example.org.
Get process mining news plus an extra practitioner article straight into your inbox
Every 1-2 months, we create this list of collected process mining web links and events in the process mining news (now also on the blog, with extra material in the e-mail edition).
Process Mining on the Web
Here are some pointers to new process mining discussions and articles on the web, in no particular order:
Process Mining Videos
There are also many new videos. You can now watch:
To make sure you are not missing anything, here is a list of the upcoming process mining events we are aware of.
The people who have participated in our process mining trainings so far really liked what they got. Sign up now to learn from the experts and quick-start your own process mining initiatives.
Here are the next training dates:
Would you like to share a process mining-related pointer to an article, event, or discussion? Let us know about it!
Should the manager approve her own travel request? Usually, the answer is “no” and there are many other examples of where it is not desirable that the same person performs two or more activities in a process.
For example, in a Dutch government process, where citizens can ask for special support based on their income and expenses that they incurred because of illnesses and other circumstances, the employee handling the payout should not be able to change the bank account number and make the transfer at the same time. Otherwise, it would be too easy to give your own bank account number and transfer money to it.
These kinds of rules are common in all companies and they are called “Segregation of Duties” (SoD), or the “four-eyes principle”. The idea is to reduce the risk of fraud by putting systems in place that help to keep people honest. These systems are called “controls”, and while some controls are realised in IT, others just exist in the process documentation, as business rules.
One of the main tasks of an auditor is to check whether the controls that are defined are actually working.
Process mining can be used to check compliance rules like the segregation of duties. The advantage is this: While most IT-based SoD controls are implemented at the level of authorisations (for example, employees who can change the bank account number cannot transfer the money), managing authorisations is a complex task and people change roles all the time. What happens if a person first had the role in which the bank account number could be changed, and later moved into the role with the ability to transfer money? Cases that were started under the first role could be completed by the same person in the second role.
So, while an auditor will review the IT-based authorization controls, it is also interesting to check the actual process executions to see whether the controls were effective (that is, whether SoD violations did occur or not).
Three years ago, we already showed you how to check segregation of duties with ProM. With Disco, it has actually been possible to check segregation of duties from the beginning. In this post, we want to show you how.
If you want to follow along with the instructions, you can do that by simply downloading the demo version of Disco from our website here and repeating the steps that are shown below. Let’s get started!
Get the Sandbox project
You can use the sandbox example that comes with Disco. After the installation, you will be presented with the following blank screen. Click on the Sandbox… button and …
… then double-click the second data set called Process map 100% detail (or press the View details button).
This is the discovered process map of a purchasing process and you are now in the analysis view. Here, you can look at the actual process flows, use the sliders to simplify the process and change the metrics that are displayed in the process map, all based on the data that was extracted from the IT system.
Add filter for segregation of duty violations
To check for segregation of duty violations, you can add a Follower filter. This filter can be added directly by clicking the filter symbol in the lower left corner, or via a shortcut through the process map.
Imagine that this purchasing process has the SoD constraint that the activities Release Supplier’s Invoice and Authorize Supplier’s Invoice Payment should not be performed by the same person for the same case. You want “four eyes” (two different people) to look over it to make sure this is a real invoice that should be paid.
To add a follower pattern filter, you can simply click on the arc going from Release Supplier’s Invoice to Authorize Supplier’s Invoice Payment as shown below. Once you press Filter this path…
… a new Follower filter will be added to your data set. You can now further customize the Follower filter.
To check for segregation of duties in this example, make these two changes to the Follower filter:
- Tick the box Require the same value of Resource for each pair of events matched above to enable the SoD constraint. Of course, you actually want different people to perform these two tasks. However, here we are checking for violations, so we want to see whether there are cases where the person was the same.
- Change the follower pattern from directly followed to eventually followed. Because we came in through the process map shortcut, the direct path is pre-selected. However, we want to catch all violations, regardless of whether these two activities were performed directly after one another or whether, for example, a dispute was settled in between.
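To make the “eventually followed, same resource” idea concrete, here is a minimal sketch of such a check in plain Python. This is not how Disco works internally; the log format (a list of `(case_id, activity, resource)` tuples in timestamp order) and the example cases are made up for illustration.

```python
# Sketch of an "eventually followed, same resource" SoD check on a
# hypothetical event log of (case_id, activity, resource) tuples,
# assumed to be sorted by timestamp within each case.
from collections import defaultdict

ACT_A = "Release Supplier's Invoice"
ACT_B = "Authorize Supplier's Invoice Payment"

def sod_violations(event_log):
    """Return the case IDs where the same resource performed ACT_A and,
    at any later point in the same case, also performed ACT_B."""
    events_by_case = defaultdict(list)
    for case_id, activity, resource in event_log:
        events_by_case[case_id].append((activity, resource))

    violating_cases = set()
    for case_id, events in events_by_case.items():
        for i, (activity, resource) in enumerate(events):
            if activity != ACT_A:
                continue
            # "Eventually followed": any later event in the same case counts,
            # even if other activities happened in between.
            for later_activity, later_resource in events[i + 1:]:
                if later_activity == ACT_B and later_resource == resource:
                    violating_cases.add(case_id)
    return violating_cases

log = [
    (1, "Release Supplier's Invoice", "Karalda Nimwada"),
    (1, "Settle Dispute With Supplier", "Karalda Nimwada"),
    (1, "Authorize Supplier's Invoice Payment", "Karalda Nimwada"),  # violation
    (2, "Release Supplier's Invoice", "Karalda Nimwada"),
    (2, "Authorize Supplier's Invoice Payment", "Miu Hanwan"),  # four eyes OK
]
print(sod_violations(log))  # -> {1}
```

Note that a directly-followed check would miss case 1 here, because the dispute settlement sits between the two activities; that is exactly why the filter is switched to eventually followed.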
You could now directly apply the filter, but let us preserve the results in a new bookmark in your project, so that you can refer back to them later on.
This can be done by using the Copy and filter rather than the Apply filter button. With Copy and filter you can give a meaningful name and apply the filter to a new copy of the data set, leaving the current data set as it is. Press Create.
Inspect the results
Now it is time to inspect the results. You will see that almost 40% of the cases are violating this segregation of duties rule! To look at some concrete examples, change from the Map view to the Cases view on the top.
You can see that there are exactly 242 cases that violate the four eyes principle here. One case, with case ID 15, is shown below, and you can see that, indeed, Karalda Nimwada performed both the Release Supplier’s Invoice and the Authorize Supplier’s Invoice Payment step in this case.
You can browse through to see more examples and export all of them to Excel.
But who exactly is violating the rule most often?
To find out, you can refine the results to focus on just the two activities involved in the SoD constraint in the following way: Click on the filter symbol in the lower left corner to add another filter and add an Attribute filter from the list as shown below.
Then, only keep the two activities we are interested in at the moment (Release Supplier’s Invoice and Authorize Supplier’s Invoice Payment). Press Apply filter.
Now you see that all the other activities have been removed and you can change to the Statistics view to look at the most frequent resources.
In the Resource statistics overview, we see that just two users are involved in the SoD violations.
To take action, we can now check their authorizations or give a targeted training.
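The “who violates most often” tally above can also be sketched in a few lines of Python, again on the same hypothetical `(case_id, activity, resource)` log format. For brevity, this version ignores the ordering of the two activities and simply counts the cases in which one resource performed both of them; it is an illustration, not Disco’s implementation.

```python
# Count, per resource, in how many cases they performed BOTH SoD
# activities, using a hypothetical (case_id, activity, resource) log.
from collections import Counter, defaultdict

ACT_A = "Release Supplier's Invoice"
ACT_B = "Authorize Supplier's Invoice Payment"

def violations_per_resource(event_log):
    """Return a Counter mapping resource -> number of cases in which
    that resource performed both ACT_A and ACT_B."""
    acts_done = defaultdict(set)  # (case_id, resource) -> set of activities
    for case_id, activity, resource in event_log:
        if activity in (ACT_A, ACT_B):
            acts_done[(case_id, resource)].add(activity)

    counts = Counter()
    for (case_id, resource), activities in acts_done.items():
        if activities == {ACT_A, ACT_B}:
            counts[resource] += 1
    return counts

log = [
    (1, ACT_A, "Karalda Nimwada"),
    (1, ACT_B, "Karalda Nimwada"),  # same person in case 1
    (2, ACT_A, "Karalda Nimwada"),
    (2, ACT_B, "Miu Hanwan"),       # four eyes OK in case 2
    (3, ACT_A, "Miu Hanwan"),
    (3, ACT_B, "Miu Hanwan"),       # same person in case 3
]
print(violations_per_resource(log))
```

In Disco, the Attribute filter plus the Statistics view give you this ranking without writing any code, which is the whole point of the walkthrough above.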
Illustration by Wil van der Aalst
We humans are great at deriving concepts. It starts when we are kids and learn about categories like cats and dogs, or tables and chairs, just by being exposed to many examples. So, when we encounter something new, we try to put it into perspective with what we already know.
People who hear about process mining for the first time need to understand how it is different from what they are already familiar with. This is why we have frequently written about how process mining relates to other technologies like data mining or BI on this blog.
However, over time these old blog posts are a little more difficult to find, and since I keep getting these questions I thought it would make sense to revisit them in an overview article.
Above, you see a picture that Wil van der Aalst uses to explain what process mining is: In essence, process mining bridges traditional process analysis techniques like modeling (which are not based on data) and data-driven techniques like data mining (which are not process-oriented).
Read on to learn in more detail how process mining relates to specific techniques and technologies.
1. Business Process Management
The BPM life-cycle shown above is often used to describe how BPM iterates through multiple phases in designing, implementing, analyzing, and then re-designing the processes.
Process mining clearly fits into the analysis phase of the BPM life-cycle. While traditional BPM approaches start with modeling the process, process mining starts by understanding the processes that are already there by discovering the actual processes from data.
Note that BPM is not about implementing BPM systems. Instead, BPM is an activity, a practice, aimed at improving processes, and it might not involve any technical system implementation at all.
2. Lean Six Sigma
Process mining as a technology is agnostic to the specific process improvement (or auditing, risk management, etc.) methodology that is used with it. One popular method that is used in many organizations today is Lean Six Sigma and one of the common approaches is the DMAIC (Define, Measure, Analyze, Improve, Control) approach illustrated above.
Process mining can be used in the ‘As-is’ analysis phase to identify waste and improvement opportunities much faster and more accurately than would be possible with a manual process mapping approach. But it also provides the opportunity to repeat the process analysis and to help with controlling and sustaining the change (something that is otherwise rarely feasible today).
3. Data Mining
Data mining is much older than process mining and, as shown in Wil’s picture above, rarely focuses on processes. A typical data mining algorithm can be used to derive rules from data about, for example, what people buy together in a supermarket, or to predict in which suburb a marketing campaign would be most effective.
Process mining, like data mining, uses data, but it discovers and analyzes process models to understand what the ‘As-is’ processes look like. Research-wise, process mining comes more out of the BPM community than the data mining community. There are many possibilities to combine the two areas, as they are largely complementary.
4. Business Intelligence
Business intelligence tools and dashboards have to deal with many of the same challenges that process mining has to deal with when analyzing end-to-end processes. For example, data must be pulled together from multiple IT systems and process mining can often benefit from these existing data preparation routines (many of our customers use Disco based on data from the same data warehouse that also feeds their BI dashboards).
These dashboards then analyze specific, pre-programmed Key Performance Indicators (KPIs), but they do not show how the processes work. Process mining is a complementary tool that allows you to analyze the processes and find the root causes of why the KPIs are out of bounds. That’s why we sometimes compare BI to a fever thermometer (showing you whether you are sick), while process mining is like an x-ray (looking inside to see what is actually going on).
5. Simulation
While people sometimes mistake our process mining animation for simulation, you could almost say that process mining is the opposite of simulation: Process mining starts with the current behavior and automatically discovers a model to show what the process really looks like. Simulation starts with a model and lets you explore alternative ‘what-if’ scenarios.
One big challenge for business process simulation is, of course, that you need a good model of reality to start with. Here, combining process mining and simulation (taking the output from a process mining tool and using it as input for a simulation tool) provides powerful opportunities to get a more accurate starting point for the simulations.
6. Big Data
There is still a lot of buzz around Big Data, and one of the challenges that I see people having with Big Data is to extract value out of it. Process mining, as a data analysis technique focused on processes, provides clear benefits for anyone who seeks to understand their underlying business processes in the pile of data (big or small) that accumulates.
One thing that Big Data has achieved is that it has raised the awareness about how much data there is today. Ten years ago, nobody believed that they had the data to do process mining. Nowadays, when you show process mining to someone they almost instantly recognise the opportunities and think of the data they could analyze with it.
7. Excel
Excel and other query tools (or auditing tools like ACL and IDEA) can be powerful at answering questions that you already have. But they do not allow you to discover new things that you would never have thought of. Furthermore, just like data mining and BI tools, they do not provide you with a process perspective.
For concrete examples see this article on Why Process Mining is better than Excel for Process Analysis.
What other techniques or tools would you like to see compared to process mining? Let us know in the comments!
Last year we did three webinars to help you get started with process mining — one in English, one in Dutch, and one in German.
You can now watch video recordings of all three webinars on YouTube.
The videos are about one hour long and a great starting point for everyone who is new to process mining. We cover the following topics:
- What is process mining, and why do I need it?
- How does it work?
- Process mining with Disco
- Case studies
Watch the recordings below.
Watch on YouTube.
Watch on YouTube.
Watch on YouTube.
Even people who already knew about process mining before told me that, after seeing this talk, they finally understood why it is useful.
Feel free to embed the video on your website or company wiki to help us spread the word about process mining!
The date has been set! This year’s Process Mining Camp will take place on Wednesday 18 June 2014, again in Eindhoven, the Netherlands.
Mark the day in your agenda and check out the new website at www.processminingcamp.com
This is the third year of Process Mining Camp, the process mining conference for practitioners, and this year we are taking it to the next level: More and more companies have gathered experience with process mining by now and the state of the art is advancing. We are busy putting together an exciting program of practice talks, and we are working hard to make the workshops even better than last year.
Last year’s camp was attended by just over 100 process miners, and since that is about the number of people we can fit in the Zwarte Doos, we expect this year’s Process Mining Camp to sell out quickly. To make sure you don’t miss out, leave your email address on the camp website and you will be the first to know when registration opens.
See you at camp!