Process Mining at The Dutch Railways — Process Mining Camp 2017

Are you getting ready for this year’s Process Mining Camp? If you haven’t registered yet, make sure to secure your ticket for 19 and 20 June now!

To get us all into the proper camp spirit, we will be releasing the videos from last year’s camp over the coming weeks. The first speakers at Process Mining Camp 2017 were Remco Bunder and Jacco Vogelsang, two pioneering information analysts at Nederlandse Spoorwegen (Dutch Railways). Remco and Jacco had joined camp in 2016 and got inspired by the talk of Paul Kooij. As innovators, they were eager to get started and find a good use case to show the value of process mining within their organisation. But where to start and how to do it?

They decided to simply apply process mining to every dataset they could get their hands on. By doing this, they gained knowledge and experience and started making unexpected observations. One of the first experiments was to track the OV-Bike (a bike rental service) from ‘rented’ to ‘return’. They saw that a lot of the bikes seemed to be reported as stolen upon return. This was unexpected, and further investigation revealed that many of the bikes were reported as stolen because the ‘report stolen’ button was too close to the ‘return’ button. This minor mistake led to the ordering of too many new bikes.

Eager to find more, they started looking at how the lockers at the stations were being used. If a locker was abandoned for more than 94 hours, it had to be checked and its contents emptied. With process mining they were able to show that it would be better to wait an additional 48 hours before emptying the locker.

They continued to analyse even more complex processes such as the reporting and resolution of broken windows, escalators, elevators, etc. by benchmarking the quality and efficiency of resolving these failures for different channels, stations, and vendors. The experiments from Remco and Jacco are a fantastic example of how enthusiasm and persistence can help you grow from a process mining novice to achieving great results within just 1 year.

Do you want to start experimenting with process mining within your organisation? Watch Remco’s and Jacco’s talk now!

———-

If you can’t attend Process Mining Camp this year, you should sign up for the Camp mailing list to receive the presentations and video recordings afterwards.

Process Mining Camp 2018 — Get Your Ticket Now!

Have you always wanted to meet other process miners in person? Perhaps you followed the MOOC and would like to share your experiences with people who are also just starting out. Or you have already worked with process mining for several years and now you want to learn from other organizations about how they made the next step?

Get your ticket for Process Mining Camp on 19 & 20 June now!

For the seventh time, process mining enthusiasts from all around the world are going to come together in the birthplace of process mining [1]. Last year, more than 220 people from 24 different countries came to camp to listen to their peers, share their ideas and experiences, and make new friends in the global process mining community.

Like last year, this year’s Process Mining Camp will run for two days:

Day 1: Practice Talks on 19 June

The first day (Tue 19 June) will be a day full of inspiring practice talks from different companies, as you have seen at previous camps.

While we are still putting the finishing touches on this year’s camp program, we are excited to share with you the first five speakers for our practice talks:

Fran Batchelor — UW Health, United States

UW Health is a large academic medical center associated with the University of Wisconsin-Madison located in Midwestern United States. More than 600,000 patients are served annually at 7 hospitals and 87 outpatient clinics.

Fran Batchelor is a Nursing Informatics Specialist at UW Health supporting surgical services at 3 of its hospitals. Fran will share the challenges and successes of introducing process mining to UW Health. She will also demonstrate how process mining was used to analyze the flow of urgent and emergent surgical cases added to the schedule and how this technology provided a new way of using the data.

Niyi Ogunbiyi — Deutsche Bank, United Kingdom

Deutsche Bank is Germany’s leading bank with a strong presence in Europe and a significant presence in the Americas and Asia Pacific.

Niyi Ogunbiyi is a Six Sigma Master Black Belt in the Chief Regulatory Office (CRegO) Operational Excellence Team. In his talk, he discusses how the bank has fared in its process mining journey and which lessons they have learnt along the way. One of the things he will show is how they balanced the exploratory and the targeted parts of their process mining analyses.

Marc Tollens — KLM, The Netherlands

Founded in 1919, KLM Royal Dutch Airlines is the oldest scheduled airline in the world still operating under its original name. In 2016, the KLM Group operated worldwide flights with over 200 aircraft, generating €10 billion in revenue and employing 32,000 staff from its Amsterdam base.

Marc Tollens is a digital product owner. He leads the development teams to develop online services to create an optimal journey for their customers. In his talk Marc will share how he used process mining to help his teams to learn how to get the most out of each sprint.

Dinesh Das — Microsoft, United States

Microsoft is the worldwide leader in software, services, devices and solutions that help people and businesses realize their full potential.

Dinesh Das is the Data Science manager in Microsoft’s Core Services Engineering and Operations organization. The convergence of digital technologies with machine learning and cognitive solutions gives him the opportunity to reimagine everything every day. He believes that process mining can be a silver bullet to accelerate the digital transformation and is eager to share his experience.

Olga Gazina — Euroclear, Belgium

Euroclear is one of the largest Financial Market Infrastructure providers in the world. Many of Euroclear’s business processes rely on sophisticated IT services developing a large variety of reliable, scalable, and secured solutions.

Olga Gazina is working for the Internal Audit department as a Data Analyst. With the goal to make internal controls more efficient, she has applied process mining to the code testing process of the Component and Data Management IT division. Olga will share the main steps of dealing with complex data and tips for finding the most useful angles from which the process should be looked at.

Day 2: Workshops on 20 June

On the second day (Wed 20 June), we will have a hands-on workshop day. Here, smaller groups of participants will get the chance to dive into various process mining topics in depth, guided by an experienced expert.

Participation in workshops is of course optional, but if you want to hone your craft and focus on your topic of choice with a group of like-minded process miners, you will fit right in! The workshops take place in the morning and all four workshops will run in parallel (so you need to pick one).

You can choose between the following four workshops:

Workshop 1 · How can I use process mining in my internal control system?

Marc Gittler & Patrick Greifzu, Deutsche Post DHL Group

The benefits of process mining during internal control and compliance audits have been discussed often in recent days. The main objective of these reviews is to get an overview of high-risk processes and to identify gaps within the internal control system. In the past, traditional data analysis techniques did not produce sufficient results, because they are time-consuming, technically challenging, and not free of bias.

During this workshop, you will have the opportunity to get in touch with experienced auditors who have used data analytics and process mining techniques for different use cases. You will get an overview of the different phases of an audit process and how process mining fits into these phases, from preparation to reporting. It does not matter if you work as an auditor, compliance officer, or risk manager: process mining can be used by all control units (1st to 3rd line of defense) within a company to ensure efficiency and effectiveness of the internal control system.

Marc and Patrick have over ten years of experience as auditors. Before they joined Deutsche Post DHL Group Audit, they worked as auditors in the banking sector. During that time, they applied data analytics and process mining techniques to make their audit work more efficient and target-oriented, especially for operational postal processes, to reduce the risk of losing revenue.

Workshop 2 · How can I prepare bigger data sets for process mining?

Eddy van der Geest, Tata Steel

Preparing your data for process mining can be a complex and time-consuming task. Especially when the data size exceeds the capacity of your beloved Excel application, you need to find other ways to transform your data into the right format for process mining.

ETL (Extract Transform Load) tools make it possible to combine and transform large data sets without any programming skills. In this workshop, you will learn how to perform some common data transformation tasks for process mining with an ETL tool. We will use the data analysis tool KNIME for the hands-on exercises. Even if you are not too tech savvy, you will see that you can prepare your data yourself in minutes, by just dragging and dropping and connecting the dots.

Eddy van der Geest is a senior auditor at Tata Steel. He has more than 25 years of experience as an auditor and believes that data is the key to innovating his work. He frequently gives seminars about how to use data analysis tools such as KNIME to help others become more data driven.

Workshop 3 · How can I discover the real customer journey?

Rudi Niks, Fluxicon

Over the past years, many organizations have adopted new channels to interact with their customers. One of the challenges is to give the customer a seamless experience across these channels. Discovering customer journeys with process mining is one of the approaches that has become very successful to understand the real, cross-channel customer experience. Rudi has seen many examples over the years and is finally ready to share some best practices.

In this workshop we will focus on the typical challenges that you will face when analyzing customer journeys with process mining. For example, combining data from multiple sources, and working with the sheer amount of click stream data, can be overwhelming. Furthermore, you can get lost very quickly because the resulting process maps become very complex. To avoid this, you really need to know what you are looking for. Join this workshop and learn how you can bypass these common pitfalls when applying customer journey mining.

Rudi Niks was one of the first process mining practitioners. He has over ten years of experience in creating value with process mining. At Fluxicon he ensures that Disco miners are the best process miners in the world.

Workshop 4 · What questions can I answer with process mining?

Anne Rozinat, Fluxicon

When you start out with process mining, it is often a bit of a chicken-and-egg problem: You are supposed to start with questions about your process, but which kinds of questions can you actually answer with process mining?

We will give you 20 typical process mining questions as a starting point and show you how to answer them. In this workshop, you will work hands-on with multiple data sets to understand the different approaches for measuring your process performance, analyzing compliance, and answering other process mining questions.

Anne Rozinat is the co-founder of Fluxicon and works with process mining every day. She obtained her PhD cum laude in the process mining group at Eindhoven University of Technology and has given more than 100 process mining trainings over the past years.

Get your ticket now!

Process Mining Camp is not your run-of-the-mill, corporate conference but a community meet-up with a unique flair. The campers are really nice people who do not just brag about their successes but also share their pitfalls and failures, from which you can learn even more than from stories that go well. In addition, you will get lots of ideas about new approaches and use cases that you have not considered before.

Tickets for both the camp day and for the workshops are limited. To avoid disappointment, reserve your seat right away.

We can’t wait to see you in Eindhoven on 19 June!

———

Even if you can’t attend Process Mining Camp this year, you should sign up for the Camp mailing list to receive the presentations and video recordings afterwards.


  1. Eindhoven is located in the south of the Netherlands. Besides its local airport, it can also be reached easily from Amsterdam’s Schiphol airport (direct connection from Schiphol every 15 minutes, the journey takes about 1h 20 min).

Joris Keizers Wins the Title of Logistics Manager of the Year

Joris Keizers has received the title of Logistics Manager of the Year (see WDP and Logistiek Profs). Joris was chosen for this award due to his work as a process mining pioneer. He has been one of the first to introduce process mining in the logistics world.

To learn more about Joris’ work, watch his presentation at Process Mining Camp 2015 here and read the case study based on which he received the first Process Miner of the Year award in 2016.

Congratulations, Joris!

Process Mining Transformations — Part 2: Unfold Loops for Activity Repetitions

This is the 2nd article in our series on typical process mining data preparation tasks. You can find an overview of all articles in the series here.

In the previous article, we showed how loops can be split up into individual cases. The same principle can also be useful when looking at looping activities.

For example, let’s take a look at the purchasing process in Figure 1. When we analyze the performance of this process we can see that some cases do not fulfill the SLA of 21 days throughput time. It seems that the two ‘Amend’ activities could be an important factor in these delays. Not only because of the long average waiting times but also because some of the cases go through the ‘Amend’ step multiple times: At least one case went through the ‘Amend Request for Quotation Requester’ step 12 times!

Figure 1: Fragment of the process map for the purchasing process. The primary metric that is shown in the map is ‘Mean duration’ while the secondary metric is ‘Maximum repetitions’.

The nature of a loop (or cycle) is that even if the same activity is repeated within the same case, it is represented by the same activity node in the process map. For example, the secondary metric in the process map in Figure 1 shows that the activity ‘Analyze Request for Quotation’ was performed up to 14 times within a single case. But each of these iterations is represented by the same activity in the map.
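
To make the ‘Maximum repetitions’ idea concrete, here is a minimal Python sketch that computes, for each activity, the highest number of occurrences within any single case. It is only an illustration: the event list is made up, and how Disco computes the metric internally is not described in this article.

```python
from collections import Counter

# Hypothetical event log: (case id, activity) pairs, invented for illustration.
events = [
    ("1212", "Create Request for Quotation"),
    ("1212", "Analyze Request for Quotation"),
    ("1212", "Amend Request for Quotation Requester"),
    ("1212", "Analyze Request for Quotation"),
    ("1213", "Create Request for Quotation"),
    ("1213", "Analyze Request for Quotation"),
]

def max_repetitions(events):
    """Return, per activity, the highest occurrence count in any single case."""
    per_case = Counter(events)  # counts each (case id, activity) pair
    result = {}
    for (case_id, activity), count in per_case.items():
        result[activity] = max(result.get(activity, 0), count)
    return result

reps = max_repetitions(events)
print(reps["Analyze Request for Quotation"])  # 2
```

All of these repetitions still end up on one node per activity, which is why the map alone cannot show how the iterations follow each other.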

In order to understand the impact of these repetitions in more detail, we would like to “unfold” each repetition to take a deeper dive into the repetition patterns.

In this article, we show you how you can achieve this. We will “unfold” each repetition of the activity ‘Analyze Request for Quotation’, ‘Amend Request for Quotation Requester’ and ‘Amend Request for Quotation Requester Manager’ into a separate activity node to analyze the impact of these repetitions in more detail.

Step 1: Transform your data

When you look at case 1212 in Figure 2 below, then you can see that the ‘Analyze Request for Quotation’ activity (highlighted in green) and the ‘Amend Request for Quotation’ activity (highlighted in blue) were repeated multiple times. This means that in the context of the process map from Figure 1 this case moves up and down between the highlighted activity nodes. We would like to unfold the looping activities to get more visibility into the repetition pattern.

Figure 2: Example case 1212 with repeating activity pattern (click on the image to see a larger version).

To make things even more complex, the ‘Amend’ activity can either be performed by the Requester (see light blue highlights for activity ‘Amend Request for Quotation Requester’ in Figure 2) or by the Manager (see dark blue highlight for activity ‘Amend Request for Quotation Requester Manager’ in Figure 2). However, for our specific analysis we do not want to make this distinction. We care about how many amendments were made in total, regardless of whether they were made by the requester or the manager.

To be able to analyze each repetition, we need to add a sequence number to each iteration of these activities within the same case. Similar to the approach of unfolding loops for cases, we will add a counter to each occurrence of the repetition.

Previously, we have shown you how you can do the heavy lifting in Python. In this example we show you how you can do this with an ETL tool. ETL tools have the advantage that you don’t need to be a programmer to do data transformations. We use the ETL tool KNIME but you can use any other ETL tool or programming language of your preference to get the same result.

With special thanks to Eddy van der Geest, who contributed the solution to this specific data transformation question, you can find the KNIME workflow below (see Figure 3). You can also download the data set here and download the KNIME workflow here to follow the example of this article.

Figure 3: KNIME workflow that adds a counter for each repeated occurrence of an ‘Amend’ and ‘Analyze’ activity (click on the image to see a larger version or download the KNIME workflow to follow the steps yourself).

This workflow loads the dataset from the purchasing process and adds the sequence number for each occurrence of an ‘Analyze Request for Quotation’ activity within the same case as a new column to the right (see green highlighted rows in Figure 4). Furthermore, it keeps a joint counter for the repetition of either the ‘Amend Request for Quotation Requester’ or the ‘Amend Request for Quotation Requester Manager’ activities in another new column (see blue highlighted rows in Figure 4).

Figure 4: The result of the data preparation step for the case 1212. You can see that 2 columns are added that include a counter for the ‘Amend’ and ‘Analyze’ activity repetitions.
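
If you prefer a programming language over an ETL tool, the same counters can be sketched in a few lines of Python. This is not the KNIME workflow itself, just a rough equivalent under some assumptions: the rows are already sorted by timestamp within each case, the field names 'case_id' and 'activity' are placeholders, and the column name 'Analyze_SequenceNr' is chosen by analogy with the 'Amend_SequenceNr' column of the data set. Case 1300 is invented to show that the counters restart for every case.

```python
from collections import defaultdict

ANALYZE = "Analyze Request for Quotation"
# Both 'Amend' variants share one joint counter.
AMEND = {
    "Amend Request for Quotation Requester",
    "Amend Request for Quotation Requester Manager",
}

def add_sequence_numbers(rows):
    """Return copies of the rows with two extra counter columns.

    Assumes the rows are sorted by timestamp within each case.
    """
    analyze_count = defaultdict(int)
    amend_count = defaultdict(int)
    out = []
    for row in rows:
        new_row = dict(row, Analyze_SequenceNr="", Amend_SequenceNr="")
        case = row["case_id"]
        if row["activity"] == ANALYZE:
            analyze_count[case] += 1
            new_row["Analyze_SequenceNr"] = analyze_count[case]
        elif row["activity"] in AMEND:
            amend_count[case] += 1
            new_row["Amend_SequenceNr"] = amend_count[case]
        out.append(new_row)
    return out

rows = [
    {"case_id": "1212", "activity": "Create Request for Quotation"},
    {"case_id": "1212", "activity": "Analyze Request for Quotation"},
    {"case_id": "1212", "activity": "Amend Request for Quotation Requester"},
    {"case_id": "1212", "activity": "Analyze Request for Quotation"},
    {"case_id": "1212", "activity": "Amend Request for Quotation Requester Manager"},
    {"case_id": "1300", "activity": "Analyze Request for Quotation"},
]
result = add_sequence_numbers(rows)
for r in result:
    print(r["case_id"], r["activity"], r["Analyze_SequenceNr"], r["Amend_SequenceNr"])
```

Activities that are neither ‘Analyze’ nor ‘Amend’ keep empty counter values, so they are not affected by the unfolding.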

Based on this transformed data set, we can now analyze our loop pattern in more detail.

Step 2: Analyze the activity repetitions

To actually unfold the loop in the process map in a visual way, we include both the ‘Amend’ sequence number column and the ‘Analyze’ sequence number column in the activity name when we import the transformed data set into Disco (see screenshot in Figure 5 below).

Figure 5: The three highlighted columns are all configured as ‘Activity’ (note the little letter symbol in the header) and, therefore, will be concatenated (combined together) into the activity name.

As a result, we have unfolded each activity occurrence in the loop pattern from Figure 1 (see Figure 6 for the same map but with the repetitions unfolded).

For example, rather than one activity with the name ‘Analyze Request for Quotation’ we can now see a separate activity for each iteration. ‘Analyze Request for Quotation-1’ is the first occurrence, ‘Analyze Request for Quotation-2’ the second occurrence, and so on.
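
The effect of combining the columns can be imitated with a small helper. This is a sketch only: the function name is hypothetical, and the handling of empty counter values is an assumption. The article only shows the resulting names with a ‘-1’, ‘-2’ suffix.

```python
def unfolded_name(activity, *sequence_numbers):
    """Join the activity name with every non-empty counter, separated by '-'."""
    suffixes = [str(n) for n in sequence_numbers if n not in ("", None)]
    return "-".join([activity] + suffixes)

# Second 'Analyze' occurrence: Analyze counter is 2, Amend counter is empty.
name = unfolded_name("Analyze Request for Quotation", 2, "")
print(name)  # Analyze Request for Quotation-2
```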

Figure 6: Unfolded loop pattern from Figure 1 (click on the image to see a larger version of the map).

The process map has become much bigger now, but for our purposes it is helpful to see in detail how the repeating activities follow each other and in which combinations.

We can now also answer our initial questions about the amendments. For example, say that we want to know how many cases took three or more amendments (by the requester or the manager combined). To answer this question, we can simply add an Attribute filter in ‘Mandatory’ mode for the ‘Amend_SequenceNr’ field (see Figure 7 below).

Figure 7: Filter for all cases that had three or more repetitions of an ‘Amend’ activity.

After applying the filter, we can see that 14% of the cases had three or more amendments (see Figure 8 below).

Figure 8: As a result, we find that 14% of the cases had at least three ‘Amend’ activities and can analyze this subset of the process in more detail.
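
Outside of Disco, the same question can be approximated by keeping every case that has at least one event whose amend counter is three or higher. The sketch below assumes this is what the ‘Mandatory’ mode selection amounts to here. The sample rows are invented, so the resulting share is not the 14% from the real data set.

```python
def cases_with_min_amendments(rows, minimum=3):
    """Return the ids of all cases whose amend counter reaches `minimum`."""
    selected = set()
    for row in rows:
        counter = row["Amend_SequenceNr"]
        if counter != "" and int(counter) >= minimum:
            selected.add(row["case_id"])
    return selected

# Invented sample: only case A reaches a third amendment.
rows = [
    {"case_id": "A", "Amend_SequenceNr": ""},
    {"case_id": "A", "Amend_SequenceNr": 1},
    {"case_id": "A", "Amend_SequenceNr": 2},
    {"case_id": "A", "Amend_SequenceNr": 3},
    {"case_id": "B", "Amend_SequenceNr": 1},
    {"case_id": "C", "Amend_SequenceNr": ""},
    {"case_id": "D", "Amend_SequenceNr": ""},
]
selected = cases_with_min_amendments(rows)
all_cases = {row["case_id"] for row in rows}
share = len(selected) / len(all_cases)
print(sorted(selected), share)
```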

The throughput time of the cases that had three or more amendments can now be compared with the overall case durations to see whether they take longer.
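
As a rough illustration of such a comparison, the following Python sketch computes the throughput time per case as the time between its first and last event and compares the subset with the overall mean. The timestamps and the set of cases with many amendments are invented. Only the SLA of 21 days is taken from the example above.

```python
from datetime import datetime
from statistics import mean

SLA_DAYS = 21  # SLA for the throughput time in the purchasing example

def throughput_days(events):
    """Map each case id to the days between its first and last event."""
    first, last = {}, {}
    for case_id, timestamp in events:
        t = datetime.fromisoformat(timestamp)
        first[case_id] = min(first.get(case_id, t), t)
        last[case_id] = max(last.get(case_id, t), t)
    return {c: (last[c] - first[c]).total_seconds() / 86400 for c in first}

# Invented events: (case id, ISO timestamp).
events = [
    ("A", "2018-01-01T09:00:00"), ("A", "2018-01-11T09:00:00"),
    ("B", "2018-01-01T09:00:00"), ("B", "2018-01-31T09:00:00"),
]
many_amendments = {"B"}  # hypothetical result of the amendment filter

durations = throughput_days(events)
overall_mean = mean(durations.values())
subset_mean = mean(d for c, d in durations.items() if c in many_amendments)
over_sla = sorted(c for c, d in durations.items() if d > SLA_DAYS)
print(overall_mean, subset_mean, over_sla)
```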

And because the loop pattern has been unfolded, we can see exactly how much time passes, for example, between the fourth amendment and the fifth ‘Analyze’ activity, etc. We can play the animation over the unfolded process map, and so on.

It’s generally useful to have repetitions collapsed into a single activity in the process map to get a more compact overview, but sometimes unfolding these activity repetitions is exactly what you might want to do to get to the bottom of your loop patterns.

PGGM Saves Time With Process Mining

This is a guest article by Frank Nobel from Finext and by Henri Martens from PGGM, a pension fund service provider. See the Dutch version here. If you have a guest article or process mining case study that you would like to share as well, please contact us via anne@fluxicon.com.

PGGM, one of the largest pension providers in the Netherlands, wants to make its processes more efficient and reduce the costs of the accountant. To do this, the company has researched the added value of process mining. And with success: the organization expects time savings of 66% for the first, second and third line checks of the processes which were studied in the experiment.

Process mining is a new method for process improvement. All related actions and turnaround times of a process are mapped out based on data.

Time to ask Henri Martens, Manager Shared Service Center Extra Services at PGGM, a couple of questions.

Image: Henri Martens (PGGM) at Finext Round Table event

1. Why did you start with process mining?

From the discussions with our accountant KPMG, process mining was suggested as a possibility to reduce the costs of the complex accountability processes and, therefore, also the costs of the accountant. With this savings potential in mind, I started the experiment.

I am always open to innovative techniques and have been motivated by the experiences of KPMG. A lot of time is spent at PGGM on accountability reports to show our clients that we are managing our processes accurately. Afterwards, it then takes the accountant substantial time to assess the files (and their creation).

2. Which value does process mining provide PGGM?

Process mining provides insight into the process. It shows all actions (recorded in the systems) and their underlying relationships. For example, with process mining we show that all sent letters have been checked. Furthermore, we also show that a second employee has performed this check.

The audit, which is part of the accountability reports, can be carried out faster with process mining and is more complete than a random check. However, we did not only stay with one application, but we have also used the power of process mining at other departments and processes of PGGM. It soon became clear that the use of process mining offers more than just becoming more efficient in executing audits, but that it also provides valuable opportunities to identify improvements in all processes.

3. What was the most important success factor of the experiment?

The most important success factor was to put together a multidisciplinary team with the right combination of expertise. It is important to have the right competencies in-house when you use process mining. It’s a combination of data mining and process analysis.

Within the team we have several experienced colleagues who know the process execution well, but fewer colleagues with data affinity. I went searching for someone with data affinity who could work full time on process mining and ended up at Finext. The team was then complete for us to start experimenting with the tooling of process mining. Furthermore, I have given the team the time and freedom necessary to develop the experiment themselves and show that process mining has added value for PGGM.

4. What benefits did applying process mining bring you?

We expect at least 66% time savings for the first, second and third line checks of the relevant processes. With the use of process mining we have established that essential processes are carried out in the same way for several clients. This evidence ensures that less accountability documentation is needed.

In addition, the analyses have led to more insight in the actual execution of our processes. We have then also been able to implement various process improvements. For example, we were able to apply Robotic Process Automation (RPA) on one of our analyzed processes, whereas this was not considered useful before.

We have now completed the analyses of several processes at different departments. These analyses have been the starting point of process improvements due to the additional insight based on facts from the systems. After an improvement has been implemented, a second check is made to show the improvement that was actually realized.

5. How will you use process mining in the future?

We will use process mining in the future to continuously implement improvements in many more processes. We are now seeking collaboration with the robotization and data science disciplines. This joint group outlines the frameworks for the future use of process mining throughout all of PGGM.

During the experiment, process mining was applied in different ways, both for ad-hoc analyses and for long-term solutions. In the near future, it is especially important to continue using process mining in the organization: on the one hand for monitoring the processes for continuous improvement, and on the other hand for implementing standard audit reports.

——————————–

Download Interview Case Study: PGGM Saves Time With Process Mining

You can download this interview as a PDF here for easier printing or sharing with others.

Become the Process Miner of the Year 2018!

Two years ago, we introduced the Process Miner of the Year awards to help you showcase your best work and share it with the process mining community. After Veco won the award in 2016, our friends at Telefonica became the Process Miner of the Year 2017 (read the full case and watch the video recording here).

This year, we will continue the tradition and the best submission will receive the Process Miner of the Year award at this year’s Process Mining Camp, on 19 June in Eindhoven.

Have you completed a successful process mining project in the past months that you are really proud of? A project that went so well, or produced such amazing results, that you cannot stop telling anyone around you about it? You know, the one that propelled process mining to a whole new level in your organization? We are pretty sure that a lot of you are thinking of your favorite project right now, and that you can’t wait to share it.

What we are looking for

We want to highlight process mining initiatives that are inspiring, captivating, and interesting. Projects that demonstrate the power of process mining, and the transformative impact it can have on the way organizations go about their work and get things done.

There are a lot of ways in which a process mining project can tell an inspiring story. To name just a few:

  • Process mining has transformed your organization, and the way you work, in an essential way.
  • There has been a huge impact with a big ROI, for example through cost savings or efficiency gains.
  • You found an unexpected way to apply process mining, for example in a domain that nobody approached before you.
  • You were faced with enormous challenges in your project, but you found creative ways to overcome them.
  • You developed a new methodology to make process mining work in your organization, or you successfully integrated process mining into your existing way of working.

Of course, maybe your favorite project is inspiring and amazing in ways that can’t be captured by the above examples. That’s perfectly fine! If you are convinced that you have done some great work, don’t hesitate: Write it up, and submit it, and take your chance to be the Process Miner of the Year 2018!

How to enter the contest

You can either send us an existing write-up of your project, or you can write about your project from scratch. It is probably better to start from scratch, since we are not looking for a white paper, but rather an inspiring story, in your own words.

In any case, you should download this Word document, which contains some more information on how to get started. You can use it either as a guide, or as a template for writing down your story.

When you are finished, send your submission to info@fluxicon.com no later than 30 April 2018.

We can’t wait to read about your process mining projects!

Case Study: Customer Journey Mining

This is a guest article by Yeong Shin Lee from PMIG and Yongil Lee from LOEN Entertainment.

If you have a process mining case study that you would like to share as well, please contact us via anne@fluxicon.com.

Summary

Korean Internet companies are holding voluminous log data that records users’ service usage behavior. If they can effectively utilize it, they can gain a competitive edge for maximizing their earnings. Yet, most of them are still at an early stage in which they identify users’ rough characteristics by performing simple statistical analyses.

LOEN Entertainment runs Melon, which is the largest online music streaming service in South Korea. They adopted process mining with Disco to analyze their mobile app’s log data. LOEN analyzed new users’ journeys during the day when they signed up with a KakaoTalk account. KakaoTalk is a free mobile instant messaging application for smartphones with free text and free call features. KakaoTalk is used by 93% of smartphone owners in South Korea.

They categorized new users into five segments based on their behavioral patterns and clearly identified the reason why each segment signed up. Furthermore, building on the analysis results, the company is planning to conduct a targeted marketing campaign to increase each segment’s CVR (Conversion Rate). The company judges that its process mining analysis using Disco plays a key role in understanding new customers and is likely to contribute to maximizing earnings.

Company & Service

With the spread of smartphones, the Korean digital music market has sharply grown, now reaching about $900 million. Melon’s market share is more than 60% and it has secured more than 34 million users and 4.5 million paying customers. It started as SK Telecom’s music service in 2004, when the digital music market was still in its early stages. Later, SK Telecom transferred the service to its subsidiary, LOEN Entertainment.

Kakao took the subsidiary over in January 2016. In collaboration with Kakao, LOEN is now focusing on securing new users. A user with a KakaoTalk account can use Melon’s service without a separate registration process (see Figure 1).

Figure 1: Melon’s Mobile App (Left) and its Login Screen (Right).

Furthermore, they conducted a campaign through which KakaoTalk’s paid emoticons are given to paying Melon subscribers at no cost.

To understand the behavior of new users who signed up with a KakaoTalk account and to increase their CVR, LOEN Entertainment, without getting external consulting, performed a process mining project after adopting Disco. An in-house data analyst prepared the data for process mining and a marketer set the direction of analysis and conducted the process mining analysis using her domain knowledge.

Process

The process that was analyzed is a new user’s journey within the mobile app on the day they signed up. The reasons for choosing this process are as follows:

  1. The process is closely related to the company’s strategic direction, which focuses on enlarging its customer base in concert with its parent company (i.e., Kakao).
  2. Increasing new users’ CVR contributes to higher profits.
  3. Segmenting new subscribers based on their behavioral patterns and identifying their registration intent helps to maintain long-term relationships with them.

Data

The project team extracted log data from a Hadoop system that records mobile app users’ service usage behavior. Then, the team pre-processed the data and imported it into Disco. ‘User Sequence Number’ and ‘Menu Name’ were configured as case id and activity, respectively.

Due to Disco’s full Unicode support, the team could easily understand the discovered process map with the activity names in Korean. Furthermore, with the help of Disco’s powerful filters a lot of the pre-processing could be done in the process mining tool itself, which reduces the time and effort for the overall process mining analysis.

Results

With a general web log analyzer, the data analysis team can identify a certain page that a user visited, along with its previous and subsequent pages. In contrast, process mining provides an end-to-end process map, repetition patterns, and the duration between pages (menus). Therefore, the team could identify precisely how users use the mobile app service.
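To make the difference more concrete, here is a minimal sketch of the kind of information a process map is built from: how often one page directly follows another within a case, and how long the step took. This is only an illustration and not how Disco works internally. The sample events, page names, and the function name `directly_follows` are all made up for this example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample events: (case ID, activity, timestamp)
events = [
    ("u1", "Login", "2017-03-01 09:00:00"),
    ("u1", "Chart", "2017-03-01 09:01:00"),
    ("u1", "Player", "2017-03-01 09:03:00"),
    ("u2", "Login", "2017-03-01 10:00:00"),
    ("u2", "Chart", "2017-03-01 10:00:30"),
    ("u2", "Chart", "2017-03-01 10:02:30"),
]


def directly_follows(events):
    """Return {(a, b): [seconds, ...]} for consecutive activities within each case."""
    by_case = defaultdict(list)
    for case_id, activity, ts in events:
        parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        by_case[case_id].append((parsed, activity))

    durations = defaultdict(list)
    for trace in by_case.values():
        trace.sort(key=lambda e: e[0])  # order each case by timestamp
        for (t1, a), (t2, b) in zip(trace, trace[1:]):
            durations[(a, b)].append((t2 - t1).total_seconds())
    return dict(durations)


dfg = directly_follows(events)
for (a, b), secs in sorted(dfg.items()):
    print(f"{a} -> {b}: {len(secs)}x, mean {sum(secs) / len(secs):.0f}s")
```

The pair ('Chart', 'Chart') in the result is a direct repetition of the same page, which is the kind of pattern that a page-by-page web log view does not show in the context of the whole journey.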

By employing the process mining capabilities of Disco, the team analyzed the customer journeys of new users and categorized them, based on their usage pattern, into five customer segments.

Segment 1 is the group of customers who paid a fee for the music service. The process map of this segment is shown in Figure 2. The rectangles represent the activities (here, menu names) and the arrows between them show the order in which the pages were visited by the customers. The darker the activities and the thicker the arrows, the more frequently these parts of the process are followed.

Figure 2: Simplified process map of the page flow for the first customer segment (note that the English page names were overlaid for clarity; furthermore some activity names as well as the frequency and performance metrics have been redacted for confidentiality reasons).

Segments 2 to 5 are customer groups who did not pay for the music service. The team discovered their process maps and was able to clearly identify the customers’ registration intent from the maps. Based on these insights from the process mining analysis, strategies to increase the CVR have been developed.

Impact

The team considers the process mining project a full success. It divided new users into five previously unidentified customer segments. For each segment, the team could clearly identify the registration intent and the key pages that were visited.

Now, the team is planning a targeted marketing campaign, customized for each segment, on the key pages that each segment visited frequently. After the campaign, the team will measure how much each segment’s CVR has improved. Where a CVR target is not achieved, the team will perform another process mining analysis of the customer behavior to find the root causes. After this initial project, Melon’s process mining analyses using Disco have now become a daily improvement activity.

——————————–

Download Case Study: Customer Journey Mining

You can download this case study as a PDF here for easier printing or sharing with others.

Process Mining Transformations — Part 1: Unfold Loops for Cases

Ideally, your data is in perfect shape and you can immediately use it for your process mining analysis without any changes. Unfortunately, there are many situations where this is not the case and you need to prepare your data set a little bit to be able to answer your analysis questions.

In this series, we will be looking at typical process mining data transformation tasks. Via step-by-step instructions, we will show you exactly how you can accomplish these data preparation steps for your own data:

Part 1: Unfold Loops for Cases (this article)
Part 2: Unfold Loops for Activity Repetitions
Part 3: To be continued…

Unfold Loops for Cases

If you have a ‘loop’ in your process then this means that a certain process step is repeated more than once. While, strictly speaking, the term loop refers to what is also called a ‘self-loop’ (a direct repetition), the term is typically more loosely used to refer to cycles in general in the context of process maps.

Loops are often interesting for a process mining analyst, because they help to spot rework and inefficiencies in the process (see our article on how to identify rework in process mining here).

But sometimes, loops can also get in the way of answering your process mining questions. For example, imagine a process where a tool such as a heavy-duty power drill can be rented for specialized construction work. To trace the movement of the tools, a barcode has been attached to each drill. The barcode provides a unique identifier for each tool and serves as our process mining case ID.

In addition, the following status changes are tracked with a timestamp for each tool: ‘Pickup’ (a tool is picked up by a customer), ‘Return’ (the tool is returned by the customer), ‘Ready for pickup’ (the tool is back in the store and available for a new rental cycle by a new customer), and ‘Intervention’ (the tool needs to be repaired).

The process map below shows the process that is discovered for this data set by Disco. [1]

As you can see in the process map by following the thick paths, there is a very dominant loop in this process: Each of the 31,592 tools is picked up, returned, and prepared for the next customer several times. The red arrow points to the place where the tool rental cycle is restarted for the next customer.

The problem with this loop is that some questions cannot be answered from this process perspective. For example, what if you want to know:

How many times it took more than two days before a tool was ready for pickup after it was returned by the customer?

Right now, we can only answer this question based on how many tools took more than two days at least once between ‘Return’ and ‘Ready for pickup’, because the tool’s barcode is currently our case ID.

To understand how many times in total a tool took more than two days between ‘Return’ and ‘Ready for pickup’ we need to shift the case ID perspective from the tool ID to a single rental cycle. But to do this, we need a “rental cycle counter” for each tool.

Here is how you can achieve this and break up a loop in your process into multiple case IDs.

Step 1: Sort your dataset

In this first step, you need to make sure that your data is sorted based on your case ID (here the tool’s barcode) and the timestamps. It is not important that the case IDs are in a particular order. But all events that belong to the same case need to be grouped in such a way that they appear after each other in the right sequence (so, you want to have the events in the right order for each case).

There are several ways to do this. For example, you can sort the data in Excel, in your database, or via an ETL tool. But the simplest way of all is to just import your data into Disco and export it as a CSV file again. You will see that the result is a neatly sorted event log.
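If you prefer to do the sorting in a script, the following is a minimal sketch of one way to do it in Python. It assumes that the first column holds the case ID and the second column holds the timestamp, and that the timestamps follow the pattern shown in `ts_format`. The column order, the timestamp pattern, and the function names are assumptions for this example, so adjust them to your own data.

```python
import csv
from datetime import datetime


def sort_event_rows(rows, ts_format="%Y-%m-%d %H:%M:%S"):
    """Sort data rows by case ID, then by timestamp.

    Assumes each row starts with [case ID, timestamp, ...].
    sorted() is stable, so events with identical timestamps
    keep the order they had in the input.
    """
    return sorted(rows, key=lambda r: (r[0], datetime.strptime(r[1], ts_format)))


def sort_csv(in_path, out_path):
    """Read a CSV file with a header row and write a sorted copy."""
    with open(in_path, newline="") as infile:
        reader = csv.reader(infile)
        header = next(reader)
        rows = list(reader)
    with open(out_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(header)
        writer.writerows(sort_event_rows(rows))


# Small in-memory example with hypothetical rows
rows = [
    ["Case 10", "2017-01-05 10:00:00", "Return"],
    ["Case 2", "2017-01-03 08:00:00", "Pickup"],
    ["Case 10", "2017-01-02 09:00:00", "Pickup"],
]
sorted_rows = sort_event_rows(rows)
for row in sorted_rows:
    print(row)
```

Note that the case IDs are compared as text here, so 'Case 10' ends up before 'Case 2'. As explained above, this does not matter: what counts is that the events of each case are grouped together and in the right order.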

Step 2: Transform your data

When you look at the sorted data set (see below), then you can see how a single tool ID (here ‘Case 10’) goes through multiple cycles of ‘Pickup’, ‘Return’, and ‘Ready for pickup’.

To be able to analyze each rental cycle separately, this loop needs to be broken up into multiple case IDs: We want to start a new case each time that the cycle repeats again. So, in addition to knowing that the drill with the barcode ‘Case 10’ was rented out, we also want to know whether it was rented out the first, the second, or the 100th time.

Because we do not have such a rental cycle counter yet, we will add it ourselves in this data transformation step. I have used a Python script to generate the sequence counter. But you can do the same with a Visual Basic script or any other programming language of your preference.

To preserve the flexibility to decide later where exactly the rental cycle restarts (at ‘Pickup’, ‘Return’, or ‘Ready for pickup’?), I have simply added a loop counter for each of these activities.

Here is my Python code snippet:

import csv

previous_caseID = None
Seqnr1 = 0  # 'Pickup' events seen so far for the current case
Seqnr2 = 0  # 'Return' events seen so far for the current case
Seqnr3 = 0  # 'Ready for pickup' events seen so far for the current case

print("Start data transformation")

# The input must be sorted as described in Step 1.
# row[0] is the case ID and row[2] is the activity.
# newline='' lets the csv module handle line endings itself (Python 3).
with open('tool_rentals.csv', 'r', newline='') as infile, \
        open('result.csv', 'w', newline='') as ofile:
    csv_f = csv.reader(infile)
    # QUOTE_NONE writes the fields without quotes; special characters are escaped with a backslash
    writer = csv.writer(ofile, delimiter=',', quoting=csv.QUOTE_NONE, escapechar='\\')

    for row in csv_f:
        current_caseID = row[0]
        current_activity = row[2]

        if str(previous_caseID) != str(current_caseID):
            # a new case starts: reset sequence numbers
            Seqnr1 = 0
            Seqnr2 = 0
            Seqnr3 = 0

        if current_activity == 'Pickup':
            Seqnr1 = Seqnr1 + 1
        elif current_activity == 'Return':
            Seqnr2 = Seqnr2 + 1
        elif current_activity == 'Ready for pickup':
            Seqnr3 = Seqnr3 + 1

        if current_caseID == 'Case ID':
            # header row: add the names of the three new columns
            mylist = [row[0], row[1], 'Repetition_of_pickup', 'Repetition_of_return', 'Repetition_of_ready_for_pickup', row[2]]
        else:
            # data row: add the current values of the three counters
            mylist = [row[0], row[1], str(Seqnr1), str(Seqnr2), str(Seqnr3), current_activity]

        # write the row to the output csv file
        writer.writerow(mylist)

        # remember the case ID for the next row
        previous_caseID = current_caseID

# both files are closed automatically at the end of the 'with' block
print("Transformation completed")

The result of this transformation is a new data set with three additional columns, which count the number of repetitions for the activities ‘Pickup’, ‘Return’ and ‘Ready for pickup’ for each case, respectively (see below).

Step 3: Pick the right perspective and analyze

Let’s say that we want to start a new rental cycle with each ‘Pickup’ activity. This means that, for example, the case with the tool ID ‘Case 10’ should be broken up into multiple cases such as ‘Case 10-0’ (no ‘Pickup’ has occurred yet), ‘Case 10-1’ (the drill has been picked up once), ‘Case 10-2’ (the drill has been picked up a second time), ‘Case 10-3’ (the drill has been picked up a third time), etc.

Each of these cases is much shorter (see the red arrows in the screenshot below) than the previous, very long case ‘Case 10’.
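If you want to check the effect of this perspective outside of Disco, the idea can be sketched in a few lines of Python. The rows below are made-up sample values, and the helper name `cycle_case_id` is hypothetical. The sketch simply joins the tool ID and the pickup counter into one combined case ID and groups the activities by it.

```python
def cycle_case_id(tool_id, pickup_count):
    """Combine the tool ID and the pickup counter into one case ID, e.g. 'Case 10-1'."""
    return f"{tool_id}-{pickup_count}"


# Hypothetical rows: (tool ID, Repetition_of_pickup, activity), already sorted
rows = [
    ("Case 10", 0, "Ready for pickup"),
    ("Case 10", 1, "Pickup"),
    ("Case 10", 1, "Return"),
    ("Case 10", 1, "Ready for pickup"),
    ("Case 10", 2, "Pickup"),
]

# Group the activities by the combined case ID
cycles = {}
for tool_id, pickup_count, activity in rows:
    cycles.setdefault(cycle_case_id(tool_id, pickup_count), []).append(activity)

for case_id, activities in cycles.items():
    print(case_id, activities)
```

In this sample, ‘Case 10-0’ holds the single event that happened before the first pickup, and ‘Case 10-1’ holds one full rental cycle.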

Now that we have added the repetition counter columns, taking this perspective is easy: We can simply configure both the ‘Case ID’ column (this is the tool ID from the barcode) and the new ‘Repetition_of_pickup’ column as a Case ID column in the import step (note the little Case ID symbol in the header row of both columns):

After importing the data into Disco, we remove all tool rental cycle cases that do not start with the ‘Pickup’ activity or that do not reach the ‘Ready for pickup’ activity in their cycle (see our article on ‘how to deal with incomplete cases’ here). This leaves us with 261,594 rental cycles for all tools together (see below).
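The rule behind this clean-up step can be written down as a small check. The following sketch is not Disco's filter implementation; it only illustrates the condition, using made-up sample cycles and a hypothetical function name `is_complete_cycle`.

```python
def is_complete_cycle(activities):
    """True if the cycle starts with 'Pickup' and reaches 'Ready for pickup'."""
    return (
        bool(activities)
        and activities[0] == "Pickup"
        and "Ready for pickup" in activities
    )


# Hypothetical rental cycles: combined case ID -> activities in order
cycles = {
    "Case 10-0": ["Ready for pickup"],  # no pickup yet
    "Case 10-1": ["Pickup", "Return", "Ready for pickup"],
    "Case 10-2": ["Pickup", "Return", "Intervention", "Ready for pickup"],
    "Case 10-3": ["Pickup", "Return"],  # cut off at the end of the data set
}

complete = {cid: acts for cid, acts in cycles.items() if is_complete_cycle(acts)}
print(sorted(complete))
```

Only the second and third cycle remain, because the first one does not start with ‘Pickup’ and the last one never reaches ‘Ready for pickup’.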

Out of these 261,594 cases, we can now answer our original question and determine how many times a tool was not ‘Ready for pickup’ again after the ‘Return’ activity within two days. One way to answer this question is to use the Follower filter (see screenshot below).

After applying this filter, we can see that in 83% of the cases it took more than two days [2] to have the tool ready for pickup again (see below).

So, if having the tool ready for pickup within two days is our ambition, then currently only 17% of the rental cycles meet this goal and we need to find ways to improve our process.
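For readers who want to reproduce this kind of check in a script, here is a rough sketch of the calculation. It is not the Follower filter itself but an approximation of the same question: for each rental cycle, how much time passes between the first ‘Return’ and the first ‘Ready for pickup’ after it? The sample cycles and the function name `return_to_ready` are made up, and the time is measured in calendar time.

```python
from datetime import datetime, timedelta


def return_to_ready(events):
    """Time from the first 'Return' to the first later 'Ready for pickup'.

    events is a list of (activity, timestamp) in timestamp order.
    Returns None if the cycle has no such pair of events.
    """
    returned_at = None
    for activity, ts in events:
        if activity == "Return" and returned_at is None:
            returned_at = ts
        elif activity == "Ready for pickup" and returned_at is not None:
            return ts - returned_at
    return None


# Hypothetical rental cycles: combined case ID -> events in order
cycles = {
    "Case 10-1": [
        ("Pickup", datetime(2017, 1, 2, 9)),
        ("Return", datetime(2017, 1, 5, 10)),
        ("Ready for pickup", datetime(2017, 1, 6, 8)),
    ],
    "Case 10-2": [
        ("Pickup", datetime(2017, 1, 9, 9)),
        ("Return", datetime(2017, 1, 12, 10)),
        ("Ready for pickup", datetime(2017, 1, 16, 8)),
    ],
    "Case 11-1": [
        ("Pickup", datetime(2017, 1, 2, 12)),
        ("Return", datetime(2017, 1, 3, 12)),
        ("Intervention", datetime(2017, 1, 4, 12)),
        ("Ready for pickup", datetime(2017, 1, 7, 12)),
    ],
}

limit = timedelta(days=2)
gaps = {cid: return_to_ready(ev) for cid, ev in cycles.items()}
slow = [cid for cid, gap in gaps.items() if gap is not None and gap > limit]
share = len(slow) / len(cycles)

print(slow)
print(f"{share:.0%} of the rental cycles took more than two days")
```

Because each rental cycle is now its own case, every slow turnaround is counted separately, instead of once per tool.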


  1. Note that this process has multiple start and end points, because the data set was extracted for a certain timeframe. Different tools were in different stages of the rental cycle at the beginning and at the end of the data set.
  2. Note that we are looking at calendar days in this example. If we wanted to analyze this question based on business days, we could do this by removing weekends and holidays using the TimeWarp functionality in Disco as shown here.

Process Mining Camp on 19 & 20 June — Save the Date!

Have you always wanted to meet other process miners in person? Perhaps you followed the MOOC and would like to share your experiences with people who are also just starting out. Or you have already worked with process mining for several years and now you want to learn from other organizations about how they made the next step?

Open your agenda right now and mark the date: Process Mining Camp takes place again on 19 & 20 June in Eindhoven [1] this year!

Process Mining Camp is not your run-of-the-mill, corporate conference but a community meet-up with a unique flair. The campers are really nice people who do not just brag about their successes but also share their pitfalls and failures, from which you can learn even more than from stories that go well. In addition, you will get lots of ideas about new approaches and use cases that you have not considered before.

For the seventh time, process mining enthusiasts from all around the world will come together in the birth place of process mining. Last year, more than 220 people from 24 different countries came to camp to listen to their peers, share their ideas and experiences, and make new friends in the global process mining community.

Like last year, this year’s Process Mining Camp will run for two days:

  • The first day (19 June) will be a day full of inspiring practice talks from different companies, as you have seen from previous camps.
  • On the second day (20 June), we will have a hands-on workshop day. Here, smaller groups of participants will get the chance to dive into various process mining topics in depth, guided by an experienced expert.

Mark these dates in your calendar and sign up for the camp mailing list here to be notified when ticket sales open! Even if you can’t make it this year, you should sign up to receive the presentations and video recordings as soon as they become available.

We can’t wait to see you in Eindhoven on 19 June!


  1. Eindhoven is located in the south of the Netherlands. Besides its local airport, it can also be reached easily from Amsterdam’s Schiphol airport (direct connection from Schiphol every 15 minutes, the journey takes about 1h 20 min).

Privacy, Security and Ethics in Process Mining — Part 4: Establish a Collaborative Culture

This is the 4th and last article in our series on privacy, security and ethics in process mining. You can find an overview of all articles in the series here.

Perhaps the most important ingredient in creating a responsible process mining environment is to establish a collaborative culture within your organization. Process mining can make the flaws in your processes very transparent, much more transparent than some people may be comfortable with. Therefore, you should include change management professionals, for example, Lean practitioners who know how to encourage people to tell each other “the truth”, in your team (see also our article on Success Criteria for Process Mining).

Furthermore, be careful how you communicate the goals of your process mining project and involve relevant stakeholders in a way that ensures their perspective is heard. The goal is to create an atmosphere where people are not blamed for their mistakes (which only leads to them hiding what they do and working against you) but where everyone is on board with the goals of the project and where the analysis and process improvement is a joint effort.


Do: 


  • Make sure that you verify the data quality before going into the data analysis, ideally by involving a domain expert already in the data validation step (see Data Validation Session). This way, you can build trust among the process managers that the data reflects what is actually happening and ensure that you have the right understanding of what the data represents.
  • Work in an iterative way and present your findings as a starting point for discussion in each iteration. Give people the chance to explain why certain things are happening and let them ask additional questions (to be picked up in the next iteration). This will help to improve the quality and relevance of your analysis as well as increase the buy-in of the process stakeholders in the final results of the project. 


Don’t:


  • Jump to conclusions. You can never assume that you know everything about the process. For example, slower teams may be handling the difficult cases, people may deviate from the process for good reasons, and you may not see everything in the data (for example, there might be steps that are performed outside of the system). By consistently using your observations as a starting point for discussion, and by allowing people to join in the interpretation, you can start building trust and the collaborative culture that process mining needs to thrive.
  • Force any conclusions that you expect, or would like to have, by misrepresenting the data (or by stating things that are not actually supported by the data). Instead, keep track of the steps that you have taken in the data preparation and in your process mining analysis. If there are any doubts about the validity or questions about the basis of your analysis, you can always go back and show, for example, which filters have been applied to the data to come to the particular process view that you are presenting.