Time Capsule — Lalit Wangikar at Process Mining Camp 2013

To celebrate our brand-new home for camp talks we are starting to release the talks from Process Mining Camp 2013 and 2014 for the first time. Grab a snack, sit back, and enjoy the journey through time, back to the early stages of our process mining community from seven years ago!

Lalit Wangikar at CKM Advisors: Process Mining and Data Science in the Financial Industry (United States)

Lalit Wangikar, a partner at CKM Advisors, is an experienced strategic consultant and analytics expert. He started looking for data-driven ways of conducting process discovery workshops. When he first read about process mining, about two years ago, his first thought was: “I wish I knew of this while doing the last several projects!”

Interviews are subject to all the whims of human recollection: recency, simplification, and self-preservation. Interview-based process discovery therefore leaves out a lot of “outliers” that usually end up being among the biggest opportunity areas. Process mining, in contrast, provides an unbiased, fact-based, and very comprehensive understanding of the actual process execution.

Watch Lalit's talk now!

Time Capsule — Tijn van der Heijden at Process Mining Camp 2013

To celebrate our brand-new home for camp talks we are starting to release the talks from Process Mining Camp 2013 and 2014 for the first time.

These talks come with their original descriptions, just as they were presented back then. Of course, some speakers do not work in the same roles anymore, and they may have changed companies. Still, we think you'll agree that the presentations hold up perfectly! Watch this first video straight from our time capsule, and let it teleport you back to the early stages of our process mining community. Grab a snack, sit back, and enjoy the talks in this series while we wait for this year's camp to draw closer!

Tijn van der Heijden at Deloitte: A Framework for Process Mining Projects (Netherlands)

Tijn van der Heijden is a business analyst with Deloitte. He learned about process mining during his studies in a BPM course at Eindhoven University of Technology and became fascinated with the fact that it was possible to get a process model and so much performance information out of automatically logged events of an information system.

Tijn successfully introduced process mining as a new standard for achieving continuous improvement at Rabobank during his Master's project. In his work at Deloitte, Tijn is now using this framework in client projects.

Watch Tijn's talk now!

Process Mining Camp on 16 & 17 June — Save the Date!

The date has been set — This year's Process Mining Camp will take place on 16 & 17 June in Eindhoven!1

For the ninth time, process mining enthusiasts from all around the world will come together in the place where process mining was born. Once again, this year's Process Mining Camp will run for two full days. The first day (16 June) will be a day full of inspiring practice talks from a diverse set of companies. The second day (17 June) will be a hands-on workshop day where smaller groups of participants dive into various process mining topics in depth.

Process Mining Camp is not your run-of-the-mill, corporate conference but a community meet-up with a unique flair. We are very proud of the fact that Process Mining Camp is just as international as the process mining community itself. Over the past years, people from 34 different countries have come to camp to listen to their peers, to share their ideas and experiences, and to make new friends.

You know what to do: Open up your calendar, mark 16 & 17 June as busy, and notify your boss that you will be attending a conference in week 25 this year. If you need a visa to travel to the Netherlands, it is a good idea to apply for that now as well. Also: Sign up for the camp mailing list here to be notified when tickets go on sale. And even if you cannot make it to camp this year, you should sign up to receive the presentations and video recordings as soon as they become available.

In the meantime, we are excited to share two things with you:

We made a brand-new home for the camp talks

We record all talks at Process Mining Camp so that every one of you can learn from the experiences our speakers have shared at camp — regardless of whether you have been there in person or not.

Our campers are honest and forthcoming folks who do not just brag about their successes but also share their pitfalls and failures. Quite often, you can learn much more from those experiences than from “success stories”, where everything just went really great, and everyone got a cookie.

To make it easy for you to dive in and explore the wealth of process mining knowledge that has been accumulated over the years, we have put together a brand-new camp website that already contains all the talks from Process Mining Camp 2012, 2015, 2016, 2017, and 2018, complete with slides and all the bells and whistles. This is, of course, also the place where we will put all the rest of our talks going forward, so it's a good idea to save a bookmark.

Now, while we are all patiently waiting for this year's camp to roll around, why don't you check out some of the camp talks that are already online from the previous years? You will get lots of new ideas about approaches and use cases that you might not have considered so far.

Time capsule: Process Mining Camp 2013 & 2014

While we are on the topic of previous camp talks, some of you may have noticed that the talks for 2013 and 2014 have never been released.

Initially, it appeared that only quite bad recordings existed from those two years. But we went looking once more and, fortunately, we did find the “good” recordings that were long thought to be lost. We polished them up — and they turned out to be pretty great!

We are going to share those talks with all of you in the coming weeks, right here on our blog. It is going to be quite a journey through time, teleporting us back to the early stages of our process mining community from six and even seven years ago, so strap in!

See you in Eindhoven on 16 June!


  1. Eindhoven is located in the south of the Netherlands. Besides its local airport, it can also be reached easily from Amsterdam's Schiphol Airport (direct connection from Schiphol every 30 minutes; the journey takes about 1 h 20 min). ↩︎

Data Quality Problems In Process Mining And What To Do About Them — Part 14: Unwanted Parallelism

This is the 14th article in our series on data quality problems for process mining. You can find an overview of all articles in the series here.

Disco detects parallelism when two activities of the same case overlap in time. Usually, this is exactly what you want: if your process contains parallel activities, these activities cannot be displayed in sequence, because that is simply not what happened.

However, sometimes activities overlap in time due to data quality problems. For example, if you look at ‘Case 1’ below, you see that activities B and C overlap by just 1 second. The process mining tool sees that both activities are (partially) going on at the same time and shows them in parallel.

Unwanted Parallelism in Process Mining (click to enlarge)
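To see how this kind of overlap detection works conceptually, here is a minimal sketch in Python/pandas. Disco's actual implementation is not public, so this is an illustration rather than Fluxicon's algorithm, and the event log data and column names are hypothetical, mirroring the ‘Case 1’ example above:

    import pandas as pd

    # Hypothetical event log mirroring the 'Case 1' example above: activities
    # B and C overlap by exactly 1 second.
    log = pd.DataFrame({
        "case_id":  ["Case 1"] * 4,
        "activity": ["A", "B", "C", "D"],
        "start":    pd.to_datetime(["2024-05-01 09:00:00", "2024-05-01 10:00:00",
                                    "2024-05-01 11:29:59", "2024-05-01 12:00:00"]),
        "complete": pd.to_datetime(["2024-05-01 09:30:00", "2024-05-01 11:30:00",
                                    "2024-05-01 11:50:00", "2024-05-01 12:30:00"]),
    })

    def find_overlaps(df):
        """List all pairs of activities of the same case that overlap in time."""
        rows = []
        for case_id, events in df.groupby("case_id"):
            ev = events.sort_values("start").reset_index(drop=True)
            for i in range(len(ev)):
                for j in range(i + 1, len(ev)):
                    # Two intervals overlap if each starts before the other completes.
                    if (ev.at[j, "start"] < ev.at[i, "complete"]
                            and ev.at[i, "start"] < ev.at[j, "complete"]):
                        overlap = (min(ev.at[i, "complete"], ev.at[j, "complete"])
                                   - max(ev.at[i, "start"], ev.at[j, "start"]))
                        rows.append({"case_id": case_id,
                                     "pair": (ev.at[i, "activity"], ev.at[j, "activity"]),
                                     "overlap": overlap})
        return pd.DataFrame(rows)

    print(find_overlaps(log))  # -> one row: ('B', 'C') with a 1-second overlap

A process map built from this log would therefore render B and C as parallel branches, exactly as described above.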

If you expect your process to be sequential, parallelism may be an artifact of how the data was recorded. For example, for ‘Case 1’ above, the Complete Timestamp of activity B may have been meant to be the same as the Start Timestamp of activity C, but the logging mechanism wrote them one after the other, or the writing of the Complete Timestamp of activity B may have been delayed by the network in a distributed system.

So, if the process that you have discovered is different from what you expected it to be, it is worth investigating whether you are dealing with parallel activities.

One way to recognize that you are dealing with parallelism is that the frequency numbers in the process map add up to more than 100%. For example, in the process map above, activity A has frequency 1, but the frequencies on the outgoing arcs add up to 2 (because both parallel branches are entered). Another way is to switch to the Graph view in the Cases tab, which does not show any waiting time between activities that overlap in time (if there is zero time between them, the waiting time is shown as ‘instant’).

How to fix:

  1. First, investigate individual parallel cases to understand the extent and the nature of the parallelism in your process (see below). For example, it could be that parts of your process are actually happening in parallel while you expected the process to be sequential.

  2. If you are sure that the parallelism is due to a data quality problem, you can create a sequential view of your process by choosing just one timestamp column during the import step (see below).

  3. To fully resolve unintended overlapping timestamps for activities that should be recorded in sequence, you need to go back to the data source and correct the timestamps before you import the data into Disco again with both timestamps (a pre-processing sketch follows below). Ultimately, the logging mechanism needs to be fixed to prevent the problem from recurring in the future.
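If the overlaps are tiny and systematic (such as the 1-second overlap in ‘Case 1’), you can also repair them in a pre-processing step before re-importing the data. The following is a minimal sketch in Python/pandas, assuming an event log with ‘case_id’, ‘start’, and ‘complete’ columns like the one in the sketch further above; the 5-second tolerance is an assumption that you should base on your own logging infrastructure. This is a workaround, not a substitute for fixing the logging mechanism itself.

    import pandas as pd

    TOLERANCE = pd.Timedelta(seconds=5)  # assumed threshold for "accidental" overlaps

    def snap_small_overlaps(df):
        """If an activity's Complete Timestamp overruns the next activity's Start
        Timestamp by at most TOLERANCE, snap it back so the two become sequential."""
        df = df.sort_values(["case_id", "start"]).copy()
        for _, labels in df.groupby("case_id").groups.items():
            labels = list(labels)
            for i in range(len(labels) - 1):
                overlap = df.at[labels[i], "complete"] - df.at[labels[i + 1], "start"]
                if pd.Timedelta(0) < overlap <= TOLERANCE:
                    # Assume the two timestamps were meant to coincide.
                    df.at[labels[i], "complete"] = df.at[labels[i + 1], "start"]
        return df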

Both of the following strategies can also be useful for better understanding processes that actually have legitimate parallel parts in them. Parallel processes are often much more complicated to understand, and it can help to look at example cases and at sequential views to better understand the (correct but still complicated) parallel process views.

1. Explore individual parallel cases

When you look at an individual case in the process map for a sequential process, the result is quite boring because you will see just a sequence of activities (unless there is a loop in the process). However, in parallel processes each case can have activities that are performed independently of each other. So, the process map for even a single case can become quite complex.

To fully understand this, it helps to look at an individual case in isolation. For example, in a project management process that contains a lot of parallel activities, we might choose to look at the fastest case to get an idea of what the process flow looked like in the best case. To do this, we sort the cases by duration and use the shortcut ‘Filter for case …’ via right-click to automatically add a pre-configured Attribute filter for this case. We can then save this view under the name ‘Fastest case’ (see the screenshot below; click on the image to see a larger version).

Filtering the fastest case (click to enlarge)
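Conceptually, sorting the cases by duration and filtering for the fastest one corresponds to the following sketch (Python/pandas, using the same assumed ‘case_id’/‘start’/‘complete’ columns as in the sketches above); in Disco itself, the ‘Filter for case …’ shortcut does all of this for you:

    import pandas as pd

    def fastest_case(df):
        """Return the events of the case with the shortest overall duration."""
        spans = df.groupby("case_id").agg(first_start=("start", "min"),
                                          last_complete=("complete", "max"))
        durations = spans["last_complete"] - spans["first_start"]
        return df[df["case_id"] == durations.idxmin()]

    # fastest_case(log) keeps only the events of the quickest case,
    # ready to be inspected (or exported and re-imported) in isolation.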

After applying the filter, we can see the process map for just this one case. Although we are looking at a single case, the process map does not just show a sequential flow but a process with parallel activities in several phases (see below).

Process map for the fastest case (click to enlarge)

When we look at the performance view, we can see that, out of the 5 activities performed in parallel between the ‘Manage Test plans’ and ‘Install in Test Environment’ milestones, the ‘Quality Plan Approval’ and ‘CSR Plan Approval’ steps take the longest time (see below).

Performance view for the fastest case (click to enlarge)

When we animate this single case, we can see the parallel flows represented by individual tokens as well. For example, in the screenshot below, we can see that on Sunday 20 May the ‘Quality Plan Approval’ and ‘CSR Plan Approval’ activities were still ongoing (see the blue tokens in the activities) while the other three activities had already finished (see the yellow tokens between activities).

Animation for the fastest case (click to enlarge)

In contrast, when we look at the slowest case in this project management process, we observe that most time is spent in a later part of the process, namely in the ‘Run Tests’ activity (see below).

Performance view for the slowest case (click to enlarge)

Tip: To view multiple parallel cases next to each other (still isolated from one another), you can duplicate the case ID column in the source data and import your data again with the duplicated column configured as part of the activity name. You will then see the process maps of all individual cases next to each other. Of course, this will be too big for all cases, but you can again focus on a subset by filtering, e.g., 2 or 3 cases, and look at them together.

This gives you a relative comparison of the performance views. Furthermore, you can use the synchronized animation to compare the dynamic flow across the selected cases with a relative start time in the animation.
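In terms of data preparation, the tip above boils down to the following sketch (Python/pandas, with the same assumed column names as before); in Disco itself you would achieve the same by configuring the duplicated column as part of the activity name during import:

    import pandas as pd

    def side_by_side(df, case_ids):
        """Prefix each activity with its case ID so that every case forms its
        own, isolated sub-map when the data is imported again."""
        subset = df[df["case_id"].isin(case_ids)].copy()
        subset["activity"] = subset["case_id"] + " | " + subset["activity"]
        return subset

    # Example (hypothetical case IDs):
    # side_by_side(log, ["Case 1", "Case 2"]).to_csv("side_by_side.csv", index=False)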

2. Import sequential view of the process

To completely “turn off” the parallelism in the process map, you can simply import your data set again and configure only one of your timestamps as a ‘Timestamp’ column. If you have only one timestamp configured, Disco always shows you a sequential view of your process. Even if two activities have the same timestamp they are shown in sequence with ‘instant’ time between them.

Looking at a sequential view of your process can be a great way to investigate the process map and the process variants without being distracted by parallel process parts. Furthermore, taking a sequential view can be a quick fix for a data set that has unwanted parallelism due to a data quality problem as shown above.

If we want to take a sequential view on the example Case 1 from the beginning of this article, we can choose either the ‘Start Timestamp’ or the ‘Complete Timestamp’ as a timestamp during the import step. Keep in mind that the meaning of the waiting times in the process map changes depending on which of the timestamps you choose.

For example, if only the ‘Start Timestamp’ column is configured as ‘Timestamp’ during the import step, then the resulting process map shows a sequential view of the process for Case 1. Because there is only one timestamp per activity, the activities themselves have no duration (shown as ‘instant’). The waiting times reflect the time from the start of the previous activity to the start of the following activity (see below).

Sequential view of Case 1 based on start timestamps (click to enlarge)

In contrast, when the ‘Complete Timestamp’ column is chosen as the timestamp, the waiting times are shown as the durations between the completion times of the activities (see below).

Sequential view of Case 1 based on end timestamps (click to enlarge)
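The difference between the two sequential views can be made explicit with a small sketch (Python/pandas, assumed column names as before): with a single timestamp per event, each waiting time is simply the difference between consecutive events' timestamps, so the choice of column changes every number in the map.

    import pandas as pd

    def sequential_waiting_times(df, timestamp_column):
        """Waiting times in a single-timestamp (sequential) view of one case."""
        ordered = df.sort_values(timestamp_column)
        return ordered[timestamp_column].diff().dropna()

    # For 'Case 1' from the first sketch above:
    # sequential_waiting_times(case1, "start")     -> start-to-start times
    # sequential_waiting_times(case1, "complete")  -> complete-to-complete times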

So, keep this in mind when you interpret the performance information in your sequential process map after importing the data again with a single timestamp.1

To sum up, parallel processes can quickly cause headaches due to their increased complexity. Try the two strategies above, and don't forget to also apply the regular simplification strategies to find which gives you the most understandable view. For example, looking at different phases of the process in isolation, or taking a step back by focusing on the milestone activities, can be really useful. All these strategies can also be combined with each other.


  1. Note that the order of the events in a case - and, therefore, the variants - might change when you change the import perspective from ‘start’ timestamp to ‘end’ timestamp (or the other way around) as well. ↩︎

Process Mining in The Assurance Practice — Applications and Requirements

This is a guest article by Suzanne Stoof and Nils Schuijt from KPMG and Bas van Beek from PGGM based on an article that has previously appeared in Compact magazine. If you have a guest article or process mining case study that you would like to share as well, please contact us via anne@fluxicon.com.

PGGM provides Assurance Standard 3402 and Assurance Standard 3000 reports that are specific for each customer. Within PGGM, process mining is used to show that a number of processes can also be tested for multiple clients at once because these processes are generic for multiple pension funds.

We describe the experiences of PGGM with process mining based on a practical example. Specifically, we describe the impact on the auditor's work for the Assurance Standard 3402 and Standard 3000 reports, as well as the conditions that need to be met. We also outline how process mining can be deployed to perform the audit more efficiently and with higher quality in the future.

Introduction

PGGM is one of the largest pension administration organizations in the Netherlands. It is responsible for managing the pension administration of multiple pension funds, including the Pension Fund Care and Welfare (PFZW). To demonstrate to its customers that its processes are properly controlled, PGGM provides Service Organization Control (SOC) reports in accordance with the Assurance Standard 3402 and the Assurance Standard 3000. These Assurance Standard 3402 and Standard 3000 reports are provided specifically for each pension fund.

PGGM and their auditors have discussed the options that may exist to make the testing of the internal control measures for the SOC reporting more efficient. PGGM wants to keep providing separate Assurance Standard 3402 and Standard 3000 reports per pension fund. To be able to test a process in a multi-client fashion, it must be demonstrated that the process and its corresponding control measures are performed in a generic way for all pension funds. In this context, process mining can help by showing that certain processes are indeed performed in the same way for multiple pension funds. That is why PGGM started to experiment with process mining. Their aim was to achieve both more efficiency and higher quality for their audits.

Process mining in the audit practice

Within the audit practice [Rama16]1, process mining can be deployed during multiple phases in the audit process:

  1. During walkthroughs. For this, process mining is used to visualize the walkthrough based on the event data. The advantage of this is that not only the happy flow but all possible paths within a process are mapped.
  2. As a basis for sampling or partial observations. By doing this, it is possible to audit only items with a higher risk, for example, because they do not follow the happy flow, but go through an alternative path.
  3. For compliance checking. With this, control measures such as a four-eyes principle can be tested for the entire population of a process, for example (see the sketch after this list).
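As an illustration of such a population-wide compliance check, here is a minimal sketch of a four-eyes test in Python/pandas. The column names, activity names, and data are hypothetical, not PGGM's actual process steps:

    import pandas as pd

    # Hypothetical event data: who performed which step in which case.
    events = pd.DataFrame({
        "case_id":  ["c1", "c1", "c2", "c2"],
        "activity": ["Prepare", "Approve", "Prepare", "Approve"],
        "resource": ["alice", "bob", "carol", "carol"],  # c2 violates four-eyes
    })

    # A case violates the four-eyes principle if the same person both
    # prepared and approved it. Checked for the entire population at once.
    pivot = events.pivot(index="case_id", columns="activity", values="resource")
    violations = pivot[pivot["Prepare"] == pivot["Approve"]]
    print(violations)  # -> only c2: carol prepared and approved her own change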

Process mining was initially deployed to perform a line audit of the processes at four PGGM customers. Subsequently, these four process flows were put next to each other to demonstrate that each of the four pension funds follows exactly the same steps within the process.

Experiences with process mining at PGGM

PGGM has established a multidisciplinary process mining project team with expertise both in the domain of pension processes and in process analysis and data analysis.

The first phase of the experiment focused on exploring the possibilities of process mining and the tooling. The added value of process mining quickly became visible, as it provided insight into the actual execution of the processes, including the bottlenecks. For instance, it became clear that some activities were forwarded many times without any need, and that the waiting times at hand-overs between departments were long. PGGM was able to resolve these bottlenecks by redesigning the process flow. Other examples of initiated process improvements are:

  • reduction of the lead time and the creation of customer value by the elimination of activities that do not provide added value to the process;
  • realization of better process control through insight into first-time-right performance;
  • design of a multi-client process execution instead of a fund-specific implementation;
  • application of Robotic Process Automation in processes. This means that repetitive human activities within administrative processes are performed by software robots.

The next step was to examine how process mining can be deployed to obtain insight into process controls. This was done based on the premise that process mining results in:

  1. a more efficient implementation of the controls;
  2. time savings in the audit work at the second and third level;
  3. in the long term, probably greater assurance, because entire populations are checked instead of partial observations.

Process mining can provide additional certainty because it is based on a comprehensive analysis of the entire population. The selection of partial observations, which is often the current methodology, therefore becomes superfluous. Instead, all activities and their underlying relations in the entire population are shown. One example of applying process mining to an entire population is confirming that all letters sent to participants were checked by an employee (see the sketch below). Another example is checking whether a segregation-of-duties rule was followed for each change.
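A sketch of the first check, with hypothetical activity names standing in for the real process steps, could look as follows in Python/pandas:

    import pandas as pd

    # Hypothetical event log: one row per activity execution per letter.
    events = pd.DataFrame({
        "case_id":  ["l1", "l1", "l2"],
        "activity": ["Send letter", "Check letter", "Send letter"],  # l2 unchecked
    })

    # Population-wide completeness check: every sent letter must also
    # have been checked by an employee.
    sent      = set(events.loc[events["activity"] == "Send letter",  "case_id"])
    checked   = set(events.loc[events["activity"] == "Check letter", "case_id"])
    unchecked = sorted(sent - checked)
    print(f"{len(unchecked)} letter(s) sent without a check: {unchecked}")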

A limiting factor of process mining is often (as PGGM experienced as well) that the data architecture is not designed for easy use of process mining. The data preparation takes a lot of time, because the required information is stored in different systems. Furthermore, not all manual activities within the workflow system are logged, which means that not all processes can be covered by the data. A well-structured data architecture is essential to make optimal use of a process mining tool.

The case of the ‘Disbursement’ process

We explain the application of process mining at PGGM in more detail based on a practical example: the Disbursement process.

The starting point for the process mining analysis was a consultation with all parties involved in the disbursement process within PGGM. The purpose of this consultation was to determine the viability of a multi-client execution of the audit work. As a result, it was concluded that the Disbursement process would be eligible to be performed multi-client. The actual viability should, inter alia, be demonstrated by process mining.

In the disbursement process, the pension rights and awards of participants are converted into an actual disbursement. An important part of this is the conversion of the awarded gross amount into the net disbursement rights: the gross/net calculation. Furthermore, the process includes various checks and authorizations that are necessary due to the nature of the process. The disbursement process includes three main activities (see Figure 1).

Figure 1: The disbursement process

The first step in the analysis was the creation of an event log. The payment and financial systems were used as data sources. Subsequently, the data was loaded into the process mining tool.

The first results based on the event log were not yet satisfactory, so the event log was enriched with data from other sources, in such a way that the auditor was able to follow the data trail. In the end, the final event log resulted in the overview shown in Figure 2.
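In terms of data preparation, building and enriching such an event log typically amounts to combining per-system extracts on a shared case identifier. A minimal sketch in Python/pandas, with hypothetical file and column names (the actual PGGM systems and schemas are not public):

    import pandas as pd

    # Hypothetical extracts; each must contain at least a case ID, an activity
    # name, and a timestamp so that the events can be interleaved per case.
    payments  = pd.read_csv("payment_system.csv",   parse_dates=["timestamp"])
    financial = pd.read_csv("financial_system.csv", parse_dates=["timestamp"])

    event_log = (pd.concat([payments, financial], ignore_index=True)
                   .sort_values(["case_id", "timestamp"]))
    event_log.to_csv("event_log.csv", index=False)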

Figure 2: Outcome process mining

The outcome of the analysis in Figure 2 shows that the process flows of the four pension funds A, B, C, and D work identically. First, a gross file is generated in the system where the pension rights are administered (process step: ‘Gross file’). In the gross file, the gross pension rights are recorded. In the next step, the conversion of the gross pension rights into the net payment rights takes place. This calculation is performed by an external party (process step: ‘Gross net calc’). Subsequently, the net disbursement file is received back (process step: ‘Net file’). Hereafter, it is verified that the gross/net calculation was done properly, after which the authorization and approval of the net disbursement file take place (process step: ‘Authorization’). Finally, the disbursement is handed over to the payment department, which executes the payment via the bank (process step: ‘Disbursement’).

Another approach is to let process mining show the entire process flow. The cases included in the ‘happy flow’ are considered to be in control. What is interesting are the exceptions that become visible. These non-‘happy flow’ paths have to be analyzed and explained, because they are undesirable in the context of process control. As Figure 2 shows, no exceptions existed in this process.

By means of the analysis and its outcome, as shown in Figure 2, it is demonstrated that the processes are performed identically for multiple pension funds. Using process mining, it has been demonstrated that all activities in the process follow the same process flow, regardless of the pension fund. For the documentation of this conclusion, a description of the log and of the way the data was extracted from the workflow tool is included. It is also described which filters were used in the process mining tool, and the controls are plotted on the process map. In addition to the use of process mining, the analysis is further substantiated by interviews with subject matter experts, a walkthrough, and the inspection of, inter alia, operational instructions, policies, and manuals.

Based on the experiences of PGGM, the following ‘lessons learned’ were derived:

  • Ensure an appropriate design of the data architecture;
  • Take advantage of the existing knowledge in the organization and activate it. Think of data analysts, SQL specialists, process analysts, and auditors;
  • Do not only focus on process mining but make use of a combination of data analysis techniques;
  • Experiment and be receptive to new insights and techniques.

Impact on the audit work of the auditor and requirements

During the preliminary stage, PGGM and their auditors talked a lot about the conditions and opportunities to apply process mining in the context of the Assurance Standard 3402/3000-audit, to show that a certain process is generically applied for multiple pension funds.

PGGM wishes to keep the Assurance Standard 3402/3000-reports specific per pension fund. In the event that several processes will be tested multi-client, it is essential that it can be demonstrated that these processes and corresponding control measures actually take place in a generic way for all pension funds.

For this, a number of matters are important from the auditor's point of view:

  • Scoping. Beforehand, consideration should be given to the scoping, i.e. which pension funds, processes, process steps, etcetera, belong to the audit object;
  • Being able to demonstrate the reliability of the data that is used is important. For instance, not all systems are yet able to provide the data in a form that can be used for process mining;
  • Procedures other than process mining provide additional audit evidence to determine if the process and the control measures are generic, including the review of process descriptions;
  • Explanation of this approach in the Assurance Standard 3402/3000 report.

Because PGGM uses two different applications to perform the pension administrations, it was decided that, for this reason, a generic methodology cannot be followed for all pension funds. For the four pension funds whose pension administration is performed within one application, it was decided to investigate this further.

With the help of process mining, it can be demonstrated that the processes follow the same flow for all four pension funds. This shows that the processes and corresponding control measures in the application are performed in a generic way. To the auditor, it was important that PGGM had clearly documented how it came to this conclusion. This means, inter alia, that PGGM had to show the auditor how it had performed the analyses using process mining, and which conclusions were drawn. The analysis and the explanation of the exceptions were then repeated by the auditor. It was also important that the reliability of the data, including the population on which the process mining was based, could be determined. This includes that it must be traceable how the data (the so-called ‘information produced by the entity’) was obtained from the system, and that it is correct and complete. For this, among other things, it must be guaranteed that no manual adjustments were made after the data was downloaded from the pension administration.

To the auditor, it is also important to confirm that the processes that will be treated as multi-client are carried out by one team, instead of by customer-specific teams. Customer-specific teams would imply the risk that certain activities could still be performed in a different way. Based on process descriptions, we have established that there is one Shared Service Center that performs the processes in a generic way for all pension funds.

From the auditor's point of view, it is also important that the Standard 3402/3000 report clearly explains to the users that not all processes were individually tested for that specific user, but that a number of processes were tested based on a multi-client approach. Both PGGM and the auditor clearly explain this in the report. Process mining can thus generate added value for the user of the Assurance Standard 3402/3000 report. In addition to the written explanation, it is recommended to inform the pension funds about this approach in time, also orally during periodic discussions.

Future

Currently, we are also looking into the future and investigating, inter alia, the possibilities of integrating process mining into the control measures. An example would be that an employee of the pension administration determines, based on process mining, whether any exceptions to the standard process exist for an entire population over a certain period. In case there are exceptions, they analyze them. An advantage of this method is that the complete population is considered in the execution of the control measure. Furthermore, the auditors also base themselves on entire populations, instead of drawing a conclusion from a number of selected partial observations.

In this way, assurance can be provided based on the complete population in an efficient way, which can also generate added value for the user of the Assurance Standard 3402/3000-report. Additionally, process mining could be deployed as a continuous monitoring tool, where the data could be loaded repeatedly to directly detect deviations within the process.

Conclusion

During the audit of its Assurance Standard 3402 reports, PGGM deployed process mining in consultation with KPMG. It was thereby demonstrated that four of the pension funds follow the same process and that they also make use of the same controls within the process. Process mining provides insight into the entire population, while the auditor usually relies on partial observations. The next steps in the implementation of process mining at PGGM concern both the combination with other processes and the introduction of process mining as an audit tool within the Assurance Standard 3402/3000 reporting. Through the deployment of process mining as a control, continuous monitoring also comes a step closer.


  1. [Rama16] E. Ramezani Taghiabadi, P.N.M. Kromhout, and M. Nagelkerke, Process mining: Let DATA describe your process, Compact 2016/4, https://www.compact.nl/articles/process-mining/, 2016. ↩︎