Process Mining Transformations — Part 2: Unfold Loops for Activity Repetitions

This is the 2nd article in our series on typical process mining data preparation tasks. You can find an overview of all articles in the series here.

In the previous article, we showed how loops can be split up into individual cases. The same principle can also be useful when looking at looping activities.

For example, let’s take a look at the purchasing process in Figure 1. When we analyze the performance of this process, we can see that some cases do not fulfill the SLA of 21 days throughput time. It seems that the two ‘Amend’ activities could be an important factor in these delays, not only because of the long average waiting times but also because some of the cases go through the ‘Amend’ step multiple times: at least one case went through the ‘Amend Request for Quotation Requester’ step 12 times!

Figure 1: Fragment of the process map for the purchasing process. The primary metric that is shown in the map is ‘Mean duration’ while the secondary metric is ‘Maximum repetitions’.

The nature of a loop (or cycle) is that even if the same activity is repeated within the same case, it is represented by the same activity node in the process map. For example, the secondary metric in the process map in Figure 1 shows that the activity ‘Analyze Request for Quotation’ was performed up to 14 times within a single case. But each of these iterations is represented by the same activity in the map.

In order to understand the impact of these repetitions in more detail, we would like to “unfold” each repetition to take a deeper dive into the repetition patterns.

In this article, we show you how you can achieve this. We will “unfold” each repetition of the activities ‘Analyze Request for Quotation’, ‘Amend Request for Quotation Requester’ and ‘Amend Request for Quotation Requester Manager’ into a separate activity node to analyze the impact of these repetitions in more detail.

Step 1: Transform your data

When you look at case 1212 in Figure 2 below, you can see that the ‘Analyze Request for Quotation’ activity (highlighted in green) and the ‘Amend Request for Quotation’ activity (highlighted in blue) were repeated multiple times. This means that in the context of the process map from Figure 1 this case moves back and forth between the highlighted activity nodes. We would like to unfold the looping activities to get more visibility into the repetition pattern.

Figure 2: Example case 1212 with repeating activity pattern.

To make things even more complex, the ‘Amend’ activity can either be performed by the Requester (see light blue highlights for activity ‘Amend Request for Quotation Requester’ in Figure 2) or by the Manager (see dark blue highlight for activity ‘Amend Request for Quotation Requester Manager’ in Figure 2). However, for our specific analysis we do not want to make this distinction. We care about how many amendments were made in total, regardless of whether they were made by the requester or the manager.

To be able to analyze each repetition, we need to add a sequence number to each iteration of these activities within the same case. Similar to the approach of unfolding loops for cases, we will add a counter to each occurrence of the repetition.
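The counting idea can be sketched in a few lines of Python before we turn to the ETL tool. This is only a minimal illustration, not the KNIME workflow described below: the column names `Case ID`, `Activity` and `SequenceNr` are assumptions, the rows are invented toy data, and the events are assumed to be sorted by timestamp within each case.

```python
from collections import defaultdict


def add_sequence_numbers(events, activity, column="SequenceNr"):
    """Number each occurrence of `activity` per case: 1, 2, 3, ...

    `events` must already be sorted by timestamp within each case.
    Rows for other activities get an empty string in the new column.
    """
    counters = defaultdict(int)  # case ID -> occurrences seen so far
    numbered = []
    for event in events:
        row = dict(event)  # copy, so the input rows stay untouched
        if row["Activity"] == activity:
            counters[row["Case ID"]] += 1
            row[column] = str(counters[row["Case ID"]])
        else:
            row[column] = ""
        numbered.append(row)
    return numbered


# Invented toy rows, not taken from the article's data set.
events = [
    {"Case ID": "1212", "Activity": "Analyze Request for Quotation"},
    {"Case ID": "1212", "Activity": "Amend Request for Quotation Requester"},
    {"Case ID": "1212", "Activity": "Analyze Request for Quotation"},
    {"Case ID": "1300", "Activity": "Analyze Request for Quotation"},
]

numbered = add_sequence_numbers(events, "Analyze Request for Quotation")
for row in numbered:
    print(row["Case ID"], row["Activity"], row["SequenceNr"])
```

The counter restarts for every case, so the first ‘Analyze’ step of case 1300 gets the number 1 even though case 1212 has already reached 2.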

Previously, we showed you how to do the heavy lifting in Python. In this example, we show you how to do this with an ETL tool. ETL tools have the advantage that you don’t need to be a programmer to do data transformations. We use the ETL tool KNIME, but you can use any other ETL tool or programming language of your preference to get the same result.

With special thanks to Eddy van der Geest, who contributed the solution to this specific data transformation question, you can find the KNIME workflow below (see Figure 3). You can also download the data set here and download the KNIME workflow here to follow the example of this article.

Figure 3: KNIME workflow that adds a counter for each repeated occurrence of an ‘Amend’ and ‘Analyze’ activity (download the KNIME workflow to follow the steps yourself).

This workflow loads the dataset from the purchasing process and adds the sequence number for each occurrence of an ‘Analyze Request for Quotation’ activity within the same case as a new column to the right (see green highlighted rows in Figure 4). Furthermore, it keeps a joint counter for the repetition of either the ‘Amend Request for Quotation Requester’ or the ‘Amend Request for Quotation Requester Manager’ activities in another new column (see blue highlighted rows in Figure 4).
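If you prefer a programming language over an ETL tool, the same two columns could be produced roughly as follows. This is a sketch under assumptions and not a translation of the KNIME workflow: `Amend_SequenceNr` is the column name used later in this article, while `Analyze_SequenceNr`, `Case ID` and `Activity` are assumed names, and the rows are invented toy data that are already sorted by time.

```python
from collections import defaultdict

ANALYZE = {"Analyze Request for Quotation"}
# Both 'Amend' variants share one counter, because we do not want to
# distinguish between the requester and the manager.
AMEND = {
    "Amend Request for Quotation Requester",
    "Amend Request for Quotation Requester Manager",
}


def add_counter(events, activities, column):
    """Add `column` in place: a per-case counter shared by all `activities`."""
    counters = defaultdict(int)  # case ID -> occurrences seen so far
    for row in events:
        if row["Activity"] in activities:
            counters[row["Case ID"]] += 1
            row[column] = str(counters[row["Case ID"]])
        else:
            row[column] = ""  # no counter for unrelated activities


# Invented toy rows, assumed to be sorted by timestamp.
events = [
    {"Case ID": "1212", "Activity": "Analyze Request for Quotation"},
    {"Case ID": "1212", "Activity": "Amend Request for Quotation Requester"},
    {"Case ID": "1212", "Activity": "Analyze Request for Quotation"},
    {"Case ID": "1212", "Activity": "Amend Request for Quotation Requester Manager"},
    {"Case ID": "1212", "Activity": "Analyze Request for Quotation"},
]

add_counter(events, ANALYZE, "Analyze_SequenceNr")
add_counter(events, AMEND, "Amend_SequenceNr")

for row in events:
    print(row["Activity"], row["Analyze_SequenceNr"], row["Amend_SequenceNr"])
```

Passing a set of activity names rather than a single name is what makes the joint ‘Amend’ counter possible: the amendment by the manager continues the count at 2 instead of starting again at 1.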

Figure 4: The result of the data preparation step for the case 1212. You can see that 2 columns are added that include a counter for the ‘Amend’ and ‘Analyze’ activity repetitions.

Based on this transformed data set, we can now analyze our loop pattern in more detail.

Step 2: Analyze the activity repetitions

To actually unfold the loop in the process map in a visual way, we include both the ‘Amend’ sequence number column and the ‘Analyze’ sequence number column in the activity name when we import the transformed data set into Disco (see screenshot in Figure 5 below).

Figure 5: The three highlighted columns are all configured as ‘Activity’ (note the little letter symbol in the header) and, therefore, will be concatenated (combined together) into the activity name.

As a result, we have unfolded each activity occurrence in the loop pattern from Figure 1 (see Figure 6 for the same map but with the repetitions unfolded).

For example, rather than one activity with the name ‘Analyze Request for Quotation’ we can now see a separate activity for each iteration. ‘Analyze Request for Quotation-1’ is the first occurrence, ‘Analyze Request for Quotation-2’ the second occurrence, and so on.
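To make the naming scheme concrete, here is a small Python sketch of how such combined activity names could be built outside of Disco. It mirrors the names shown above but is not Disco's implementation: the column name `Analyze_SequenceNr` is an assumption, and skipping empty counter values when joining is a choice made for this sketch.

```python
def unfolded_activity_name(
    row, columns=("Activity", "Analyze_SequenceNr", "Amend_SequenceNr")
):
    """Join the non-empty values of `columns` with '-' into one name."""
    return "-".join(row[column] for column in columns if row[column])


# Invented toy rows with the two counter columns already filled in.
rows = [
    {"Activity": "Analyze Request for Quotation",
     "Analyze_SequenceNr": "1", "Amend_SequenceNr": ""},
    {"Activity": "Amend Request for Quotation Requester",
     "Analyze_SequenceNr": "", "Amend_SequenceNr": "1"},
    {"Activity": "Analyze Request for Quotation",
     "Analyze_SequenceNr": "2", "Amend_SequenceNr": ""},
]

names = [unfolded_activity_name(row) for row in rows]
for name in names:
    print(name)
```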

Figure 6: Unfolded loop pattern from Figure 1.

The process map has become much bigger now, but for our purposes it is helpful to see in detail how the repeating activities follow each other and in which combinations.

We can now also answer our initial questions about the amendments. For example, say that we want to know how many cases had three or more amendments (by the requester or the manager combined). To answer this question, we can simply add an Attribute filter in ‘Mandatory’ mode for the ‘Amend_SequenceNr’ field (see Figure 7 below).

Figure 7: Filter for all cases that had three or more repetitions of an ‘Amend’ activity.

After applying the filter, we can see that 14% of the cases had three or more amendments (see Figure 8 below).

Figure 8: As a result, we find that 14% of the cases had at least three ‘Amend’ activities and can analyze this subset of the process in more detail.
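The same question can also be asked directly on the transformed data. The following Python sketch is not how Disco's filter works internally; it simply keeps every case that contains at least one event with an ‘Amend_SequenceNr’ value of 3 or higher. The `Case ID` column name is an assumption, and the toy rows are invented, so the resulting share has nothing to do with the 14% found in the real data set.

```python
def cases_with_min_amendments(events, minimum=3):
    """Return (matching case IDs, share of all cases) for cases that
    reach at least `minimum` in the 'Amend_SequenceNr' column."""
    all_cases = set()
    matching = set()
    for row in events:
        all_cases.add(row["Case ID"])
        counter = row["Amend_SequenceNr"]
        if counter and int(counter) >= minimum:
            matching.add(row["Case ID"])
    return matching, len(matching) / len(all_cases)


# Invented toy rows: only case A reaches a third amendment.
events = [
    {"Case ID": "A", "Amend_SequenceNr": "1"},
    {"Case ID": "A", "Amend_SequenceNr": "2"},
    {"Case ID": "A", "Amend_SequenceNr": "3"},
    {"Case ID": "B", "Amend_SequenceNr": "1"},
    {"Case ID": "C", "Amend_SequenceNr": ""},
    {"Case ID": "D", "Amend_SequenceNr": "1"},
    {"Case ID": "D", "Amend_SequenceNr": "2"},
]

matching, share = cases_with_min_amendments(events)
print(sorted(matching), f"{share:.0%}")
```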

The throughput time of the cases that had three or more amendments can now be compared with the overall case durations to see whether they take longer.
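Such a comparison could look like the following Python sketch. It assumes a `Timestamp` column in ISO format and a `Case ID` column (both assumed names), measures the throughput time of a case as the time between its first and its last event, and uses invented toy rows, so the numbers are purely illustrative.

```python
from datetime import datetime
from statistics import mean


def case_durations_days(events):
    """Map each case ID to the days between its first and last event."""
    first, last = {}, {}
    for row in events:
        case = row["Case ID"]
        ts = datetime.fromisoformat(row["Timestamp"])
        first[case] = min(first.get(case, ts), ts)
        last[case] = max(last.get(case, ts), ts)
    return {c: (last[c] - first[c]).total_seconds() / 86400 for c in first}


def cases_with_min_amendments(events, minimum=3):
    """Case IDs that reach at least `minimum` in 'Amend_SequenceNr'."""
    return {
        row["Case ID"]
        for row in events
        if row["Amend_SequenceNr"] and int(row["Amend_SequenceNr"]) >= minimum
    }


# Invented toy rows: case A has three amendments, case B has one.
events = [
    {"Case ID": "A", "Timestamp": "2024-01-01T09:00:00", "Amend_SequenceNr": ""},
    {"Case ID": "A", "Timestamp": "2024-01-10T09:00:00", "Amend_SequenceNr": "1"},
    {"Case ID": "A", "Timestamp": "2024-01-20T09:00:00", "Amend_SequenceNr": "2"},
    {"Case ID": "A", "Timestamp": "2024-01-31T09:00:00", "Amend_SequenceNr": "3"},
    {"Case ID": "B", "Timestamp": "2024-01-01T09:00:00", "Amend_SequenceNr": ""},
    {"Case ID": "B", "Timestamp": "2024-01-11T09:00:00", "Amend_SequenceNr": "1"},
]

durations = case_durations_days(events)
subset = cases_with_min_amendments(events)

overall_mean = mean(durations.values())
subset_mean = mean(durations[case] for case in subset)
print(f"all cases: {overall_mean:.1f} days, 3+ amendments: {subset_mean:.1f} days")
```

On real data, the mean alone can hide a lot, so it is worth also looking at the median or at the share of cases above the 21-day SLA.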

And because the loop pattern has been unfolded, we can see exactly how much time passes, for example, between the fourth amendment and the fifth ‘Analyze’ activity. We can also play the animation over the unfolded process map.

It’s generally useful to have repetitions collapsed into a single activity in the process map to get a more compact overview, but sometimes unfolding these activity repetitions is exactly what you might want to do to get to the bottom of your loop patterns.