Which Process Mining Project Should You Start With?

When you are new to process mining, it is often hard to know where to begin. Which process should you pick first? And which processes are less suitable for a first process mining project?

Data availability and process awareness

One way to look at multiple candidates is to assess the data availability and the level of process awareness for each potential project (see the two axes in the image at the top of this article).

Data availability refers to whether there is an IT system that supports the process and collects data about the process steps that were performed. Process awareness refers to the degree to which people know what the process is and how much they follow it.

In the picture above, we describe the four main situations that you can encounter.

1. Bottom-Left: Low data availability and low process awareness

For some processes, little data about the performed process steps can be found in the IT systems, and the process itself is not well defined either.

For example, an annual planning process that determines the strategy for the coming year may be repeated every year. However, the process is executed via a series of meetings and the outcome is documented in meeting notes and an updated strategy brief for the company.

For processes that should be moved out of this ad-hoc stage into a more structured approach, the focus is on defining the process and the involved process steps first. This definition is a prerequisite to either make the process more repeatable (moving it upwards) or to introduce an IT system that supports the steps of the process in a more direct way (moving it to the right).

Suitability for process mining: Ad-hoc processes are less suitable for process mining. They typically lack not only the data needed for a process mining analysis but also the case volume and the process understanding that would make such an analysis worthwhile.

2. Top-Left: Low data availability and high process awareness

Some processes are well-defined already but are still happening in a manual way.

For example, imagine a building permit process at a municipality that has not been digitized yet. The process is clear, documented, and people follow it. However, there is no data available in an IT system that reflects the steps that were taken by the municipality.

The focus for processes in this stage is often on digitization, which means that an IT system (e.g., a workflow system, a case handling system, or an ERP system) is introduced to support the execution of the process.

Suitability for process mining: Because of the lack of data, manual processes are not directly suitable for process mining analyses. However, because of the high process awareness, it is worthwhile to:

  1. Already collect some data manually. For example, in his Process Mining Camp presentation, Jan Vermeulen from Dimension Data talked about the sales process and how they temporarily collected data in a manual way to gain more visibility into this process.
  2. Make sure that in its digitization trajectory the right data will be collected, so that process mining analyses are possible in the future.
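To make the idea of "the right data" a bit more concrete: process mining tools generally need at least a case ID, an activity name, and a timestamp for every recorded step. The following Python sketch shows one way such a manually collected log could be written down as CSV. The column names, the helper functions, and the building permit steps are hypothetical illustrations, not taken from any particular tool or from the presentation mentioned above.

```python
import csv
import io
from datetime import datetime

# Hypothetical column names; most tools let you map your own columns on import.
FIELDS = ["case_id", "activity", "timestamp"]


def record_step(log, case_id, activity, when):
    """Append one manually observed process step to the in-memory log."""
    log.append({
        "case_id": case_id,
        "activity": activity,
        "timestamp": when.isoformat(sep=" "),
    })


def to_csv(log):
    """Return the log as CSV text, sorted by case and then by time."""
    rows = sorted(log, key=lambda r: (r["case_id"], r["timestamp"]))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up example entries for a building permit process.
log = []
record_step(log, "permit-002", "Application received", datetime(2024, 3, 2, 10, 0))
record_step(log, "permit-001", "Application received", datetime(2024, 3, 1, 9, 15))
record_step(log, "permit-001", "Completeness check", datetime(2024, 3, 4, 14, 30))

csv_text = to_csv(log)
print(csv_text)
```

Even a small sample collected this way for a few weeks can be imported into a process mining tool, and the same three columns are a useful minimum requirement to hand to the team that selects or configures the future IT system.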

3. Bottom-Right: High data availability and low process awareness

Some processes are already supported by IT systems (i.e., they are digitized), but they are quite unstructured and the system allows for a lot of flexibility in the process.[1]

For example, many hospitals have introduced electronic healthcare systems (e.g., an EHR system) that facilitate and document the individual steps that are performed by the medical and administrative staff in the hospital. However, the patient flows themselves are highly individual and the doctors have full flexibility in this process.

If the interest is there to better understand and improve such unstructured processes, then process mining is an excellent way to discover the actual processes that are performed. This discovery is then the basis to take the process and further define and standardize parts of it.

Suitability for process mining: Unstructured processes are a very interesting application area for process mining. The data is already available, but it is crucial that people with the right domain knowledge are involved to help deal with the complexity and to separate the different types of processes that exist within the overall process domain.

4. Top-Right: High data availability and high process awareness

Once a process is fully defined and supported by an IT system, you can consider it to be an automated process. However, there are different degrees of automation that you will find here.

For example, a newly digitized ERP process may still consist of mostly manual activities (performed by an employee via the IT system), while the routing that determines which step comes next is defined in a relatively fixed way. Another scenario is a Straight Through Processing (STP) workflow process that fully automates the handling of travel expense reimbursements for most employees but has manual steps that are performed for some of the cases (e.g., based on random sampling or particular process rules). Finally, a completely automated production process might not involve any manual activities at all anymore.

The focus in this stage is on the (further) optimization of the process. Typically, this is an iterative process that happens in multiple improvement cycles.

Suitability for process mining: Process mining is very applicable for such processes, because both the data and the process awareness needed to define new goals and further improvements are available. However, the more automated a process already is, the less interesting the process mining analysis often becomes, due to the diminishing returns in the actual process improvement. Once the process is fully automated, the focus shifts to monitoring and away from understanding and improving the process, which is what process mining supports.
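The four situations described above can be summarized as a small lookup from the two axes to a process type and its suitability. The Python sketch below is only an illustration: the 0-to-1 scores, the 0.5 threshold, and the candidate names are assumptions made for the example and are not part of the framework itself, which is meant as a qualitative assessment.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    data_availability: float  # hypothetical self-assessed score from 0 to 1
    process_awareness: float  # hypothetical self-assessed score from 0 to 1


# Key: (high data availability?, high process awareness?)
QUADRANTS = {
    (False, False): ("ad-hoc", "less suitable: define the process first"),
    (False, True): ("manual", "not directly suitable: collect data manually or plan the digitization"),
    (True, False): ("unstructured", "very interesting: involve people with domain knowledge"),
    (True, True): ("automated", "very applicable: returns diminish with full automation"),
}


def classify(candidate, threshold=0.5):
    """Map a candidate to its quadrant; the threshold is an arbitrary assumption."""
    key = (
        candidate.data_availability >= threshold,
        candidate.process_awareness >= threshold,
    )
    return QUADRANTS[key]


# Made-up candidates that mirror the examples used in this article.
candidates = [
    Candidate("Annual planning", 0.1, 0.2),
    Candidate("Building permits (paper-based)", 0.1, 0.9),
    Candidate("Patient flows", 0.8, 0.3),
    Candidate("Travel expense reimbursement", 0.9, 0.9),
]

for c in candidates:
    process_type, suitability = classify(c)
    print(f"{c.name}: {process_type} ({suitability})")
```

The scores are deliberately coarse. The point is to compare candidates side by side, not to compute a precise ranking.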

Other factors that are important

As explained in our data extraction checklist, champion support is really critical for any process mining project (see the full overview of all the skills and roles needed for your process mining team here).

In addition, you can ask yourself the following questions:

  • Why do we need to do something now? Find a project that has a strong sense of urgency. This will help to get the buy-in of the people who are involved and, therefore, can help you to get the necessary resources.
  • What are the process-related questions? Define early what the most important questions are that should be answered about each process. This will help to make sure that process mining is actually the right tool to answer these questions.
  • What is the smallest scope that still makes a significant impact? The first project should not be too big (see also this list of process mining success factors here). However, the scope should also not be so small that it is no longer relevant. Pick a project and a scope that allow you to create a success story that paves the way for your future projects.

Do you already have some ideas about which processes you could pick for your first process mining project? Start now and download the worksheet here to plot your candidates!


  [1] Note that the fact that a process is supported by an IT system does not mean that it is fully automated. There are many systems that record data about what is happening, but the process itself is fully driven by the people performing the process steps in it.