Process Mining Café, Part III: Customer Journey Mining


Our next Process Mining Café is going to be all about customer journeys! Make sure to tune in on Wednesday 20 January at 16:00 CET (no registration required) to join us. On this site, you can watch the live stream of our café session and talk with us and other process mining aficionados in our exclusive café chatroom.

Analyzing Customer Journeys with Process Mining

This time, Daisy Wain from GOV.UK and our very own Rudi Niks will be joining us in the café. We will talk about why customer journey processes are often more complex than, for example, administrative processes. To analyze them successfully, it is important that you formulate concrete questions and filter down the data to the subset that relates to your question.

Rudi is no stranger to customer journey analysis, as you may remember from his workshop at Process Mining Camp 2018. He will walk us through the challenges that you can expect when embarking on a customer journey analysis with process mining. Daisy is a Senior Performance Analyst at the Government Digital Service in the U.K. She will share her experiences from mapping the “Start a Business” Journey on GOV.UK.

Join us!

Tune in live on Wednesday 20 January at 16:00 CET (check your own timezone here) to watch our stream, and to give us your thoughts and questions while we are on the air.

Can’t wait? To pass the time, why don’t you:

See you all in the Process Mining Café next week!

Mapping the “Start a Business” Journey on GOV.UK

Process Mining Camp 2020

This is a guest article by Daisy Wain based on an article that has previously appeared on the Inside GOV.UK blog. If you have a guest article or process mining case study that you would like to share as well, please contact us via

Earlier last year, we began work to make it easier to start a business in the UK. Rather than just looking at individual parts of the process, we are trying to use new data techniques to think about end-to-end journeys through both content and services. As our understanding of user behaviour becomes more detailed, we can evaluate how effectively GOV.UK is meeting users’ needs and apply this knowledge to our wider work. Our goal is to make it quicker and easier to start a business, supporting new entrepreneurs during a challenging time for the economy.

Using data to visualise and map the journeys

The ‘Start a Business’ team on GOV.UK has been working with data scientists and engineers on the GOV.UK Data Labs team to make better use of end-to-end user journey data. Part of this work was to find out whether any available software can visualise these end-to-end journeys, and how we could use it to analyse the journeys in more depth.

The traditional work of a performance analyst in GDS relies on the analytics tool Google Analytics (GA). It gives us a lot of functionality in terms of anonymised user data (including the user’s device and their geo-location) and sessions (what pages they visit or elements they click on). However, the interface can only show a two-step journey (the page a user is on now and the page they came from), which is too basic for this work.

Instead, we’ve used the underlying data from Google Analytics to see and analyse entire end-to-end user journeys.
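As a rough illustration of the difference between a two-step view and an end-to-end journey, the sketch below groups hit-level rows into one ordered page sequence per session. The row layout, session IDs and page names are hypothetical and are not the actual Google Analytics export used by the team.

```python
from collections import defaultdict

# Hypothetical hit-level rows: (session_id, timestamp, page).
# All values are made up for illustration.
hits = [
    ("s1", 3, "limited-company-sbs"),
    ("s1", 1, "companies-house"),
    ("s1", 2, "start-a-company"),
    ("s2", 1, "sole-trader-sbs"),
    ("s2", 2, "limited-company-sbs"),
]


def build_journeys(rows):
    """Group hits by session and order each session's pages by timestamp."""
    by_session = defaultdict(list)
    for session_id, timestamp, page in rows:
        by_session[session_id].append((timestamp, page))
    return {
        session_id: [page for _, page in sorted(events)]
        for session_id, events in by_session.items()
    }


def two_step_view(journey):
    """The (previous page, current page) pairs that a two-step report shows."""
    return list(zip(journey, journey[1:]))


journeys = build_journeys(hits)
print(journeys["s1"])                 # full end-to-end path of one session
print(two_step_view(journeys["s1"]))  # the same path seen only as pairs
```

The full sequence keeps the order of every page in the session, whereas the pairs on their own no longer say which pairs belonged to the same visit.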

Looking at end-to-end journeys, we were interested in all user journeys that went to or through the ‘Set up a [insert business type]’ step by steps. Step by steps on GOV.UK show the logical navigation needed to complete a process on GOV.UK.

To start, we looked at users’ journeys when going through the ‘Set up a limited company: step by step’ and the ‘Set up as self-employed (a ‘sole trader’): step by step’.


Using different tools and approaches to understand journeys

We’ve combined two approaches to using data to better understand user journeys. The first approach is to visualise the most popular journeys. The second is to define a particular journey that we’re interested in, and to see how many users took that journey.

Both approaches examine large datasets to generate new information. We’ve been using Disco for the first approach and a tool called MAQUI for the second approach.

The first approach uses process mining to create an abstract model of the most popular journeys and interactions, and then visualises it. It shows clear routes into the step by step, as well as which pages users visit and what they do on those pages, for example which links they click on or which elements they interact with (such as opening the step by step accordion).

This visualisation allows us to begin to identify problem behaviours such as circuitous journeys and bottlenecks.


The image above is an example of the visualisation approach using Disco, showing the most popular user journeys on GOV.UK involving a chosen step-by-step.
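To give a feel for what such a map is built from, here is a minimal sketch that counts complete journey variants and page-to-page transitions (a "directly-follows" count). This is a common simplification used to explain process maps and is not necessarily how Disco computes its visualisation; the session data and page names are invented.

```python
from collections import Counter

# Hypothetical journeys: one ordered list of pages per session.
journeys = {
    "s1": ["org-page", "topic-page", "step-1", "sbs-home"],
    "s2": ["org-page", "topic-page", "step-1", "sbs-home"],
    "s3": ["search", "sbs-home", "step-1"],
}


def variant_counts(journeys):
    """Count how many sessions followed each exact page sequence."""
    return Counter(tuple(path) for path in journeys.values())


def directly_follows_counts(journeys):
    """Count how often one page is visited immediately after another."""
    edges = Counter()
    for path in journeys.values():
        edges.update(zip(path, path[1:]))
    return edges


variants = variant_counts(journeys)
edges = directly_follows_counts(journeys)

print(variants.most_common(1))  # the most popular complete journey
print(edges.most_common(3))     # the most frequent page-to-page steps
```

The transition counts correspond to the arrows of a map and the variant counts to the most popular journeys.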

The second approach involves defining a journey we’re interested in and seeing how many people take that journey, using the open-source tool MAQUI. This tool was developed by Terrance Law as part of their PhD ‘Automated yet Transparent Data Insights’; the code is available on GitHub.


The image above is a visualisation of an example of the definition approach, using the tool MAQUI.
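The idea behind the definition approach can be sketched in a few lines of plain Python: define an ordered list of pages and count the sessions whose journey contains them in that order, allowing other pages in between. This is only a stand-in for the concept and does not use MAQUI or its query interface; the function names and data are hypothetical.

```python
def follows_pattern(path, pattern):
    """True if the pages in `pattern` occur in `path` in order (gaps allowed)."""
    remaining = iter(path)
    # `page in remaining` consumes the iterator up to the first match,
    # so each pattern page must appear after the previous one.
    return all(page in remaining for page in pattern)


def share_matching(journeys, pattern):
    """Return the matching session IDs and their share of all sessions."""
    matches = [sid for sid, path in journeys.items() if follows_pattern(path, pattern)]
    share = len(matches) / len(journeys) if journeys else 0.0
    return matches, share


# Hypothetical journeys, for illustration only.
journeys = {
    "s1": ["org-page", "topic-page", "step-1", "sbs-home"],
    "s2": ["search", "sbs-home"],
    "s3": ["org-page", "help", "sbs-home"],
    "s4": ["sbs-home", "org-page"],
}

matches, share = share_matching(journeys, ["org-page", "sbs-home"])
print(matches, share)
```

Session `s4` visits both pages but in the opposite order, so it does not count as a match.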

These tools have enabled us to answer much more detailed questions about user behaviour, such as:

  • What are the most popular routes into the content?
  • What pages do users visit between point A and point B?
  • How do journeys vary depending on the device used?

A simple example of a question these tools can answer is ‘how do users get to the step by step?’, followed by ‘can this journey be improved?’.

Using both tools, we were able to identify that 20% of users travel through the same three pages before getting to the limited company step by step: first the Companies House organisation page, then the start a company specialist topic page, and then the limited companies page (the first step of the limited companies step by step). From there, users finally use the breadcrumb to navigate to the main step by step homepage for ‘Set up a limited company’.

A simple answer to ‘how can we improve these journeys’ would be to add a direct link to the step by step. This data-driven design change could then be assessed with A/B testing.


The image above is a visualisation of a common user journey, which starts on the Companies House organisation page, moving first to the Start a company page, then the first step of the limited companies step by step and then the main step by step homepage for setting up a limited company.
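One common way to assess an A/B test of a change like the direct link is a two-proportion z-test on the share of sessions that reach the step by step. The sketch below assumes this test and uses invented counts; the article does not say which statistical method or numbers the team would use.

```python
import math


def two_proportion_z_test(successes_a, total_a, successes_b, total_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a = successes_a / total_a
    p_b = successes_b / total_b
    pooled = (successes_a + successes_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        # Both groups converted at exactly 0% or 100%: no measurable difference.
        return 0.0, 1.0
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: sessions reaching the step by step homepage.
# A = current page, B = page with an added direct link.
z, p_value = two_proportion_z_test(200, 1000, 260, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value would suggest that the difference between the two versions is unlikely to be due to chance alone. Sample sizes and the significance threshold should be fixed before the test starts.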

Additional insights we have gathered include:

  • users re-using the same sidebar navigation for the ‘Register your company’ step, indicating that the start button for the service was not easy to find
  • users bouncing back and forth from the limited company content to the sole trader content - indicating confusion over how business naming rules work for different types of businesses
  • virtually no users (only 3%) using the whole step by step format for the sole trader content, meaning this format is not working effectively. The sole trader step by step has a large number of circular journeys back into multiple competing start pages. The page pointing to self-assessment has about half the click-through rate we’d usually expect from a page of this type.
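Behaviours such as bouncing back and forth or circular journeys can also be flagged programmatically once journeys are available as page sequences. The following is a minimal sketch with hypothetical function names and page labels, not the analysis the team actually ran.

```python
def count_ping_pongs(path):
    """Count A -> B -> A patterns, where a user returns to the page they just left."""
    return sum(
        1
        for a, b, c in zip(path, path[1:], path[2:])
        if a == c and a != b
    )


def revisited_pages(path):
    """Pages visited more than once, in the order of their first revisit."""
    seen, repeated = set(), []
    for page in path:
        if page in seen and page not in repeated:
            repeated.append(page)
        seen.add(page)
    return repeated


# Hypothetical journey that alternates between two types of content.
path = ["limited-company", "sole-trader", "limited-company", "sole-trader", "register"]

print(count_ping_pongs(path))
print(revisited_pages(path))
```

Sessions with a high ping-pong count between two pieces of content are candidates for a closer look at what users are trying to compare.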

What we’ve learnt about user journeys

This new way of visualising and conducting performance analysis has proven to be a real game changer for GOV.UK because it has allowed us to make better use of an existing dataset and to be able to analyse a whole problem space like never before. The next step is to see how well this fits with other teams, projects and content in GDS. We want to start applying this type of analysis to other areas, so that we can understand how users are reaching different types of content and evaluate if we are providing them with the best possible service.

Recently we looked at how users were reaching a set of specific content pages. By comparing them to the step by steps used in the first analysis, we have started to see clustering of different user behaviour depending on their routes into GOV.UK. This means that there are users who are using GOV.UK differently depending on the content they are looking for.

As a result, content, tooling and journeys can be better optimised based on the most popular journeys we know users are taking.

The next step for this work is to continue to work with the data scientists and engineers to create reusable code. This will eventually allow the performance analysis community to self-serve: analysts will be able to access the BigQuery data required for their team or project and then use this new tooling, in combination with the more traditional Google Analytics, to offer a whole new layer of data-driven analysis, insights and recommendations.

By bringing together deep data-driven insights with the skills of multi-disciplinary teams, we can really start to apply this knowledge to make high-quality, informed changes to user journeys and better meet users’ needs.

Process Mining Trainings 2021


Is 2021 the year that you want to get serious about process mining? Join one of our small-group trainings online!

Disco makes it very easy to get started with process mining: You import some data and Disco produces a process map. That is a great start, but your journey has only just begun. There are a lot of important topics around process mining that you really need to know, so that you can apply it productively: How do you prepare the data? How can you ensure data quality? How can you interpret your results? And also, what kinds of analyses can you even do in the first place?

We have heard all the questions that process mining newcomers ask, and we know the things they often miss when they start out. In this course, we have put all our experience together to give you the essentials that you need to know to be ready for using process mining in practice. Skip the learning by trial and error, and put the documentation and theory books aside for a minute. This training will give you the fundamental knowledge and skills to hit the ground running and use the full potential of process mining in your work.


We give a new training every two months. Our training takes place over a series of interactive web training sessions. This means that you can jump in and ask questions at any point in time, just like you would do in a classroom setting.

In each training, there are four sessions that run from 15:00 to 17:00 CEST each day (see your own timezone here). Between the sessions, you have a few days to process the materials and practice with additional exercises, on your own time. This online training covers all the practical process mining topics of our popular two-day on-site training.

The registration for all the 2021 trainings is now open:

See further details on our training website and reserve your spot for the training now!

From Research to Practice with Hajo Reijers

We had great fun hanging out with Hajo Reijers, Professor for Business Process Management & Analytics at Utrecht University, in last week’s Process Mining Café. You can now watch the recording here.

We started out with a short follow-up to our previous Process Mining Café, discussing utilization as another metric that is only available if you have start and complete timestamps.
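To illustrate why both timestamps are needed, here is a small sketch that computes utilization as the share of an observation window during which at least one activity was in progress. This definition (busy time divided by available time, for a single resource) is an assumption made for the example and may differ from the one discussed in the café; the timestamps are invented.

```python
from datetime import datetime


def utilization(intervals, window_start, window_end):
    """Share of the window in which at least one activity was in progress."""
    # Clip each (start, complete) interval to the window and sort by start time.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in intervals
        if start < window_end and end > window_start
    )
    busy_seconds = 0.0
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            # A gap: close the previous busy period and open a new one.
            if current_end is not None:
                busy_seconds += (current_end - current_start).total_seconds()
            current_start, current_end = start, end
        else:
            # Overlapping work only counts once, so extend the busy period.
            current_end = max(current_end, end)
    if current_end is not None:
        busy_seconds += (current_end - current_start).total_seconds()
    return busy_seconds / (window_end - window_start).total_seconds()


def at(hour):
    """Hypothetical helper: a timestamp on one example day."""
    return datetime(2021, 1, 20, hour)


# Activities as (start, complete) pairs for one resource.
work = [(at(9), at(11)), (at(10), at(12)), (at(14), at(16))]
print(utilization(work, at(9), at(17)))
```

With only a single timestamp per activity, the durations are unknown, so the busy time in the numerator cannot be computed.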

The conversation then turned to how processes are made of people, and that process mining needs to be aware of that. With his focus on human-centric processes, Hajo is also interested in empirical research methods. He told us about the “think-aloud” protocol as one of the methods and shared some tips for researchers who are new to empirical research.

In the end, we picked two new process mining papers and discussed what we found interesting about them. One paper provides a framework to categorize the root causes of data quality problems. The other one uses statistical methods to uncover dependency relations between activities in a process and desired/undesired effects.

Here are the links to the two papers that we discussed and to the two main conferences publishing process mining research each year:

Enjoy the holidays, everyone! We’ll see you again in the new year!

Disco 2.10

Software Update

We are happy to announce that we have just released Disco 2.10.

In this release, we have improved the auto-detection of imported file types, along with detailed feedback on parsing errors for MXML and XES. Furthermore, this update fixes a couple of minor issues around setting and applying filters.

We recommend that you update at your earliest convenience. Like every release of Disco, this update fixes a number of bugs and improves the general performance and stability.

Thank you for using Disco! We shipped quite a few improvements for our little app this year, and we hope that some of those have made your life a little easier and more productive. Let us know what you think, keep the bug reports coming, and we’ll take it from here next year!

How to update

Disco will automatically download and install this update the next time you run it, if you are connected to the internet.

If you prefer to install this update of Disco manually, you can download and run the updated installer packages from


  • Filter:
    • Fixed an issue with filtering for explicit case IDs and variants.
    • Properly reflect data set selection after empty filter result.
  • XES Import: More detailed feedback for malformed files.
  • MXML Import: More detailed feedback for malformed files.
  • XES Export: Updated XES version declaration.
  • User Feedback: Improved debug logging.