Combining Lean Six Sigma and Process Mining — Part I: Define Phase

This is the second article in our series on combining Lean Six Sigma and process mining. It focuses on how process mining can be applied in the Define phase of the DMAIC improvement cycle. You can find an overview of all articles in the series here.

Imagine you are working with the director of a loan provider to address the impact of digitalization on the loan process. Due to falling interest rates, the margins of the loan products are under pressure. Therefore, the yield of these loans cannot be guaranteed in the long term, leading to limitations in funding from the shareholders. The director’s strategy is to grow the portfolio by at least 10% to achieve the same return on investment with a lower interest rate. In addition, the costs have to be reduced by 30% within two years to remain competitive in the short term.

You meet with both the customer contact manager and the credit manager. You immediately sense that they are under pressure, as they are knee-deep in a big IT change program. They question whether this is the right time to take on yet another project. You know that without their commitment, the project will not even get off the ground. You need an approach that gets their attention without putting even more pressure on the workforce.

Therefore, you request access to the data from the primary loan application system, which processes each loan. A couple of days later, you get your hands on a dataset with all the transactions of the loan applications of the past year (see Figure 1).

Figure 1: Data sample from the transactions in the primary loan application system

The data contains the following information:

  • Application ID: The unique identification number for each application

  • Status: The name of the step in the process that was completed

  • Timestamp: The moment at which the step was completed

  • Resource: The anonymized[1] name of the user who completed the step in the process

  • Department: The department of the user who completed the process step

  • Amount: The amount of the loan application

  • Product: The type of product (application for either a credit or a personal loan)

To start making sense of this data, you quickly get some high-level statistics. For example, using Excel and Minitab[2], you can find out that:

  • On average, 5,099 applications are coming in every month. 22% of them are converted to a loan.
  • You can see the frequency of each of the activities.
  • The average lead time from online application to payout is 15 days.
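Statistics like these can also be computed with a few lines of code. The following is a minimal sketch in Python, assuming the data is available as rows of application ID, status, and timestamp. The sample rows and the status names (such as "Application submitted" and "Loan paid out") are hypothetical and are not taken from the actual dataset.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical sample rows: (application_id, status, timestamp).
# The status names are assumptions made for illustration only.
EVENTS = [
    ("A1", "Application submitted", "2023-01-02 09:00"),
    ("A1", "Offer sent", "2023-01-03 10:00"),
    ("A1", "Loan paid out", "2023-01-12 09:00"),
    ("A2", "Application submitted", "2023-01-05 14:00"),
    ("A2", "Application rejected", "2023-01-05 14:01"),
    ("A3", "Application submitted", "2023-02-01 08:00"),
    ("A3", "Offer sent", "2023-02-01 15:00"),
    ("A3", "Loan paid out", "2023-02-21 08:00"),
    ("A4", "Application submitted", "2023-02-10 11:00"),
    ("A4", "Offer sent", "2023-02-11 11:00"),
]


def summarize(events, paid_status="Loan paid out"):
    """Compute a few high-level statistics from a list of event rows."""
    # Group the events by application and sort each group by time.
    cases = defaultdict(list)
    for case_id, status, ts in events:
        cases[case_id].append((datetime.strptime(ts, "%Y-%m-%d %H:%M"), status))
    for trace in cases.values():
        trace.sort()

    # An application is counted in the month of its first event.
    per_month = Counter(trace[0][0].strftime("%Y-%m") for trace in cases.values())

    # Lead time in days from the first event to the payout, for converted cases.
    lead_times = []
    for trace in cases.values():
        payouts = [ts for ts, status in trace if status == paid_status]
        if payouts:
            lead_times.append((payouts[0] - trace[0][0]).total_seconds() / 86400)

    return {
        "avg_applications_per_month": len(cases) / len(per_month),
        "conversion_rate": len(lead_times) / len(cases),
        "activity_frequency": Counter(status for _, status, _ in events),
        "avg_lead_time_days": sum(lead_times) / len(lead_times) if lead_times else None,
    }


stats = summarize(EVENTS)
print(stats)
```

The sketch counts an application as converted once it reaches the assumed payout status. In a real dataset, the status that marks a conversion would have to be confirmed with the process owners.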

However, you are still missing some context to understand what is going on. It would help if you had a deeper insight into the process itself.

Understanding the process

Value Stream Mapping seems like a good next step to better understand the process. However, this would require involvement from domain experts from operational teams. You would need at least a half-day workshop to map the process and another half-day to identify the waste. Therefore, you take a different approach and use process mining to visualize and understand the process.

The transactional data from the loan application system fulfills the minimum requirements for process mining: each row has a case ID, an activity name, and a timestamp. So, you can simply import the dataset into the process mining software Disco.

During the import step, the ‘Application ID’ column is configured as the Case ID, the ‘Status’ column as the Activity name, and the ‘Timestamp’ column as the Timestamp. All remaining columns are included as Resource or Other attributes (see Figure 2).
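To make the role of these three columns concrete, here is a small sketch in Python of what such a column mapping amounts to: rows are grouped by case and ordered by timestamp. This is not how Disco works internally. The CSV content, the user names, and the helper `load_event_log` are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Column roles as described in the text.
COLUMN_ROLES = {
    "case_id": "Application ID",
    "activity": "Status",
    "timestamp": "Timestamp",
}

# Made-up sample data; the rows are deliberately not in time order.
SAMPLE_CSV = """Application ID,Status,Timestamp,Resource,Department,Amount,Product
1001,Offer sent,2023-01-03 10:00,User_7,Sales,5000,Personal loan
1001,Application submitted,2023-01-02 09:00,User_1,Online,5000,Personal loan
1002,Application submitted,2023-01-05 14:00,User_1,Online,12000,Credit
"""


def load_event_log(csv_text, roles=COLUMN_ROLES, ts_format="%Y-%m-%d %H:%M"):
    """Group rows by case ID and order the events of each case by timestamp."""
    role_columns = set(roles.values())
    traces = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        event = {
            "activity": row[roles["activity"]],
            "timestamp": datetime.strptime(row[roles["timestamp"]], ts_format),
            # Columns without a role are kept as additional attributes.
            "attributes": {k: v for k, v in row.items() if k not in role_columns},
        }
        traces[row[roles["case_id"]]].append(event)
    for events in traces.values():
        events.sort(key=lambda e: e["timestamp"])
    return dict(traces)


log = load_event_log(SAMPLE_CSV)
for case_id, events in log.items():
    print(case_id, [e["activity"] for e in events])
```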

Figure 2: Importing the transactional data from the loan application system into the process mining tool Disco

After pressing the Start import button, the process mining tool automatically discovers a process map that shows how the process actually happened based on the data (see Figure 3).

Figure 3: The resulting process map, automatically discovered by the process mining tool
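The discovery algorithm that Disco uses is not described here. As a rough intuition for what the numbers in such a map represent, the following Python sketch counts how often one activity directly follows another, which is the simplest form of a process map. The traces and activity names are hypothetical.

```python
from collections import Counter

# Hypothetical traces: the time-ordered statuses of each application.
TRACES = {
    "A1": ["Online application", "Pre-approval check", "Offer sent", "Approved"],
    "A2": ["Online application", "Pre-approval check", "Rejected"],
    "A3": ["Call center application", "Offer sent", "Canceled"],
}


def directly_follows(traces):
    """Count start activities, end activities, and directly-follows pairs."""
    starts, edges, ends = Counter(), Counter(), Counter()
    for activities in traces.values():
        if not activities:
            continue
        starts[activities[0]] += 1
        ends[activities[-1]] += 1
        # Pair each activity with its direct successor in the same case.
        edges.update(zip(activities, activities[1:]))
    return starts, edges, ends


starts, edges, ends = directly_follows(TRACES)
for (source, target), count in sorted(edges.items()):
    print(f"{source} -> {target}: {count}")
```

The start counts correspond to the entry points of the map (for example, online versus call center), and each edge count corresponds to the number shown on an arrow between two activities.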

From the discovered process map, you can see that 16,846 customers apply for a loan online (see 1a in Figure 3), while 3,886 customers prefer to apply by calling the call center directly (see 1b in Figure 3). You can also see that the automated pre-approval credit check rejected 7,036 customers online (see 2 in Figure 3).

For each application that is not automatically rejected, the customer is called back to establish a personal relationship. During the call, it is checked whether the requested loan fits the customer’s income. An offer package is then sent by post to the customer (see 3 in Figure 3). The customer must provide the required information, such as bank statements and income statements, and return a signed copy of the contract using the return envelope (see 4 in Figure 3).

The underwriter checks if the application is complete and approves or rejects the application (see 5a and 5b in Figure 3). If needed, additional information is requested (see 6 and 7 in Figure 3). If the customer has not accepted the offer, the application is canceled (see 8 in Figure 3).

Don’t forget to take a step back

One advantage of process mining compared to the traditional Value Stream Mapping approach is that you don’t need to create the process map manually. It is automatically created by the process mining tool based on the data you import. But before you can draw any conclusions from these visualizations, you need to make sure that you fully understand the process to explain your observations.

You feel that you are missing some domain expertise and decide to meet with the managers to share your findings so far. You show them an animation of the process map (see Figure 4). The animation brings the process to life by visually moving each case (here, each loan application) as a yellow token through the process based on the actual timestamps in the data.

Figure 4: The animation brings the process to life and shows how it flows

The managers immediately start to raise questions about the conversion rates at certain stages and the number of incomplete cases circulating in the underwriting stage. You are happy to have their attention, but you realize that it is too early to discuss the internal process. First, you need to focus on the essentials from a customer perspective.

Luckily, there is an extensive customer satisfaction report, including the Customer Effort Score (CES) and Net Promoter Score (NPS). This report identifies the price (i.e., the interest) and the speed and ease of the application process as the most important factors for the customer. The latter point has been translated into the ambition to provide a loan within a week.

Together, you plot this ambition on the process and translate it into the following Critical to Quality (CTQ) requirements:

  1. 90% of the offers should be provided within 8 hours after the first customer contact.
  2. 80% of the customers should have certainty about their loan application within three business days after receiving the signed contract.
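A CTQ of this form can be checked against the event data by measuring, per case, the time between two activities and comparing the share of cases within the limit to the target. The following Python sketch illustrates this for the first CTQ. The activity names "First contact" and "Offer sent", the sample cases, and the helper `share_within` are assumptions. The sketch uses plain elapsed time, so a limit in business days (as in the second CTQ) would additionally need a working calendar.

```python
from datetime import datetime, timedelta

# Hypothetical cases: (activity, timestamp) pairs per application.
CASES = {
    "A1": [("First contact", datetime(2023, 1, 2, 9, 0)),
           ("Offer sent", datetime(2023, 1, 2, 15, 0))],   # 6 hours
    "A2": [("First contact", datetime(2023, 1, 3, 9, 0)),
           ("Offer sent", datetime(2023, 1, 4, 9, 0))],    # 24 hours
    "A3": [("First contact", datetime(2023, 1, 4, 9, 0)),
           ("Offer sent", datetime(2023, 1, 4, 17, 0))],   # exactly 8 hours
    "A4": [("First contact", datetime(2023, 1, 5, 9, 0))],  # no offer yet
}


def share_within(cases, start_activity, end_activity, limit):
    """Share of measurable cases where end follows start within the limit.

    Cases that lack either activity are skipped. Returns None if no case
    can be measured.
    """
    measured = within = 0
    for events in cases.values():
        first_seen = {}
        for activity, ts in sorted(events, key=lambda e: e[1]):
            first_seen.setdefault(activity, ts)  # keep the first occurrence
        if start_activity in first_seen and end_activity in first_seen:
            measured += 1
            elapsed = first_seen[end_activity] - first_seen[start_activity]
            if timedelta(0) <= elapsed <= limit:
                within += 1
    return within / measured if measured else None


share = share_within(CASES, "First contact", "Offer sent", timedelta(hours=8))
meets_target = share is not None and share >= 0.90
print(f"Offers within 8 hours: {share:.0%}, target met: {meets_target}")
```

Whether open cases (such as A4 above) should be excluded or counted as misses is a design choice that changes the result, so it should be agreed on when the CTQ is defined.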

Stay tuned to learn how process mining can be applied in the following phases of the DMAIC improvement cycle! If you don’t want to miss anything, use this RSS feed, or subscribe to get an email when we post new articles.


  1. Anonymizing parts of your dataset can be a good way to hide sensitive information while preserving the overall characteristics of your process for your analysis. Read our privacy and ethics guide to learn which data fields you typically would want to anonymize for your process mining project (and which types of analysis you can and cannot still perform afterward).

  2. Minitab is a statistical software package frequently used by Lean Six Sigma professionals. Website: http://www.minitab.com/en-us/products/minitab/

Anne Rozinat

Market, customers, and everything else

Anne knows how to mine a process like no other. She has conducted a large number of process mining projects with companies such as Philips Healthcare, Océ, ASML, Philips Consumer Lifestyle, and many others.