New Knowledge About Old Systems

Last week’s Process Mining Café was all about legacy systems. Two veterans of the trace, Steve Kilner and Derek Russell, joined us for a discussion on the role process mining can play in better understanding, and improving, legacy systems. You can now watch the recording here.

Our session started with a primer on what legacy systems are in the first place: old systems that are often poorly understood and critical at the same time. Process mining can help to understand both how these systems are actually used and how the processes that run on them can be improved.

We also discussed the different approaches to get process mining data from these old systems: Some may have existing logs that can be used. Other setups need to be instrumented or otherwise observed.

Here are the links that we mentioned during the session:

Thanks again, Derek and Steve, for joining us!

Process Mining Café 4: Mining Legacy Systems

Process Mining Café 4

Our Process Mining Café sessions are already starting to feel like a tradition. Join us again next Wednesday, 24 February, at 16:00 CET! (Check your own timezone here.)

There is no registration required. Simply point your browser to when it is time. You can watch the café, and share your thoughts and questions while we are on the air, right there on the site.

This time, we are all about process mining in legacy systems.

Legacy systems are old, often mission-critical systems that can cause quite a few headaches for their owners. Replacing these old systems is not easy, precisely because so much knowledge has been poured into them, and because the developers who built them are often long gone.

Process mining can help to understand how these systems are used. We have invited Derek Russell, who wrote about legacy system mining on our blog last week, and Steve Kilner, who dove into the topic many years ago. Derek and Steve know all about legacy systems, and we will be talking about the different approaches to legacy system mining.

Tune in live for the Process Mining Café next week! Add the time to your calendar to make sure you don’t miss it. Or sign up for the café mailing list here if you want to be reminded one hour before the session starts.

Disco 2.11

Software Update

We are happy to announce that we have just released Disco 2.11.

We recommend that you update at your earliest convenience. Like every release of Disco, this update fixes a number of bugs and improves the general performance and stability.

This release marks a big step forward for the Airlift integration in Disco, with better performance, improved reliability, and a smoother user experience all around. If you pull your log data into Disco via Airlift, this will be a solid upgrade – and if you don’t, this may be a great time to start thinking about it.

A long-lost, dearly beloved, and oft-requested crowd favorite makes a triumphant return: The process map will now remain centered around the mouse pointer when you zoom via the mouse wheel. Just a little goodie that got lost in our transition to multi-touch gestures way back when, and we’re all happy to have it back.

Even if you’re not airlifting, and barely zooming, this update, as always, brings increased performance, the demise of many bugs, and lots of small improvements and fixes all over the place.

Thank you for using Disco! We love hearing about what you do with it, and what you like and don’t like about it, so keep your feedback coming!

How to update

Disco will automatically download and install this update the next time you run it, if you are connected to the internet.

If you prefer to install this update of Disco manually, you can download and run the updated installer packages from


  • Airlift:
    • UI fixes.
    • Improved import performance.
    • Smoother and more resilient client experience.
    • Keep bookmarks of recent connections (optional).
    • More consistent experience for connections with self-signed certificates.
  • Process Map: Keep the mouse pointer centered when zooming via mouse wheel.
  • CSV Import: Improved performance and stability.
  • Workspace: Ensure safe recovery from a corrupted workspace.
  • Octane: Fixed an issue where some case IDs could be truncated.
  • UI: Refined graph transitions.
  • Platform: Java update (Requires manual install).
Analyzing Legacy Systems with Process Mining

IBM 360

This is a guest article by Derek Russell from Objektum Modernization Ltd. You can find an extended version of this article here. If you have a process mining case study that you would like to share as well, please get in touch with us at

Legacy systems are old systems that often support particularly important processes in an organization. At the same time, precisely because they are so old, the inner workings of these systems are typically poorly understood. This makes them hard to adapt or replace altogether.

There have been previous examples where process mining was used to understand the behavior of a legacy system. However, in those examples there was existing log data that could be analyzed. What do you do if your legacy system does not provide any suitable event log data at all?

This is where the following approach can help: We can create a new logging capability in the legacy system by combining model generation and instrumentation of software code. Here is how it works.

Example: Hotel management system

Let us look at the example of a hotel management system. The system is used by the hotel reception to create new reservations, check in and check out guests, and to keep records of the food and beverages for billing. Figure 1 shows a screenshot of the current desktop application.

Figure 1: Screenshot of the hotel management application

The hotel management wants to extend or replace the system with the goal of letting guests make online reservations in the future. When we set out to modernize a system, we first need to fully understand how the existing system is used, to make sure that all the important functionalities are covered in our redesign. Unfortunately, there is limited knowledge and documentation available for the hotel management system.

Therefore, we want to use process mining to understand the different scenarios of the current reservation and billing processes. However, the system creates no usage logs at the moment. All we have is the C# source code and the data model in the SQL database.

Step 1: Generate the static model

To create the logging that is required for process mining, we start with the SQL database that stores all the records in a so-called data model. The data model describes the tables, relations, fields, and field types. This description can be extracted from the database in terms of a so-called SQL schema. This schema is translated into objects with attributes and relationships. For example, a customer has a first name, a last name, and a reservation from entry day to departure day (see Figure 2 below).
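As a minimal sketch of this schema-to-object translation (the class and attribute names here are hypothetical, loosely following the customer/reservation example, and not taken from the actual system):

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical objects extracted from the SQL schema: tables become
# classes, columns become attributes, and the foreign key between the
# customer and reservation tables becomes a relationship.

@dataclass
class Reservation:
    entry_day: date
    departure_day: date

@dataclass
class Customer:
    first_name: str
    last_name: str
    reservation: Reservation

customer = Customer(
    first_name="Ada",
    last_name="Lovelace",
    reservation=Reservation(date(2021, 2, 22), date(2021, 2, 24)),
)
nights = (customer.reservation.departure_day - customer.reservation.entry_day).days
print(nights)  # 2
```

The relationships captured this way form the backbone of the static model that the next step enriches with information from the source code.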

Figure 2: Generate static model from data model and source code

This model is then extended by parsing the source code (this can be done for virtually any programming language) to include the classes, attributes, and methods. The result is the so-called ‘static model’, which gives an overview of all the components in the system.

Step 2: Generate the dynamic model

The static model shows the information that is processed, but not the order in which this is done. Software code is composed of classes that represent objects and their properties, and of the methods that provide the behavior of the system. However, the static model does not describe the order in which the methods take place.

Figure 3: Generate dynamic model by simulating source code

To gain an understanding of the dependencies between the methods, it is necessary to record and analyze the dynamic execution of the software.

To achieve this, we instrument the source code to enable the logging of program flow during normal usage of the application. This results in a log from which UML sequence diagrams are generated. These sequence diagrams now describe the flow of the methods that are invoked at each object. This ‘dynamic model’ is not a business process but the sequence of methods related to one use case.
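A toy sketch of this kind of instrumentation (the class and method names are invented for illustration, not taken from the actual hotel system): a wrapper records each method invocation in call order, while the methods themselves run unchanged.

```python
from functools import wraps

flow_log = []  # recorded program flow: (class, method) in call order

def traced(method):
    """Record each invocation of the wrapped method, then run it unchanged."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        flow_log.append((type(self).__name__, method.__name__))
        return method(self, *args, **kwargs)
    return wrapper

class ReservationService:  # hypothetical class from the static model
    @traced
    def create_reservation(self):
        self.confirm()

    @traced
    def confirm(self):
        pass

ReservationService().create_reservation()
print(flow_log)
# [('ReservationService', 'create_reservation'), ('ReservationService', 'confirm')]
```

From a trace like `flow_log`, the caller/callee order between methods can be recovered, which is exactly what a sequence diagram expresses.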

Step 3: Extend the dynamic model

For the process mining data, we need to know what the case ID and the activity names are. In the dynamic model, we can define the activities by selecting which methods mark the start or end of an activity. The model is extended by tagging the methods in the sequence diagram to define when to log what. Note that no code is changed; only properties in the model are set.
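In the spirit of this tagging step (all method, activity, and attribute names below are hypothetical), the model properties can be thought of as a mapping from methods to activity names plus a case-ID attribute, kept entirely outside the application code:

```python
# Hypothetical tagging configuration: which methods mark an activity,
# and which attribute provides the case ID. No application code is
# changed; these are properties set on the model.
activity_tags = {
    "create_reservation": "Create reservation",
    "check_in": "Check in",
    "check_out": "Check out",
}
case_id_attribute = "reservation_number"

def to_event(method_name, attributes):
    """Translate a tagged method invocation into an event log entry, or None."""
    activity = activity_tags.get(method_name)
    if activity is None:
        return None  # untagged methods produce no events
    return {"case": attributes[case_id_attribute], "activity": activity}

event = to_event("check_in", {"reservation_number": "R-1042"})
print(event)  # {'case': 'R-1042', 'activity': 'Check in'}
```

Because the tags live outside the code, adding or removing logged activities later only means editing this configuration and re-instrumenting.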

Figure 4: Extend the dynamic model with logging by tagging methods

At this point, the case ID and further attributes from the static model can also be selected to be included as part of the logging. For example, the reservation number or customer number can be added to represent the case ID. One of the advantages is that you can start small, with minimal impact on the application, and add more information by repeating steps 3, 4, and 5.

Step 4: Instrumentation, build, deploy and run

In the next step, we automatically re-generate the application by combining the original source code with the code that introduces the logging behavior described by the tags in the sequence diagrams. This is referred to as instrumentation. It is important that the original source code itself is not touched, because we don’t want to change anything else in the system’s behavior.

Figure 5: Instrument code, build and deploy new version of the system

The instrumented code can be built into a new version of the hotel management application. This instrumented application behaves identically to the original one, with the additional capability of event logging. The logging starts at the moment that the instrumented version is deployed. So, from that moment on it is possible to analyze new reservations and the execution of any other system use case.

Step 5: Analyze logging

The output of step 4 is the ‘runtime logging’ event log that we can now analyze with process mining. We have to wait until enough events have been collected to perform a representative process mining analysis. For each invocation of a method that was tagged in the sequence diagram, an event is added to the log. A snippet of the resulting log is shown in Figure 6 below.

Figure 6: The event log captured by the instrumented version of the hotel management system
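At its core, such an event log is a table of case IDs, activities, and timestamps. As a hypothetical sketch of what the runtime logging could produce in CSV form (column names and values invented, not the actual log format of the hotel system):

```python
import csv
import io

# Hypothetical runtime log: one row per tagged method invocation,
# with the reservation number serving as the case ID.
rows = [
    ("R-1042", "Create reservation", "2021-02-01 09:12:00"),
    ("R-1042", "Check in",           "2021-02-05 14:03:00"),
    ("R-1042", "Check out",          "2021-02-07 10:41:00"),
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["case_id", "activity", "timestamp"])  # header row for import
writer.writerows(rows)

log = buffer.getvalue()
print(log.splitlines()[0])  # case_id,activity,timestamp
```

A file in this shape, with one column each for case ID, activity, and timestamp, is the typical input for a process mining tool.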

When you import this event log into Disco, the process map shown in Figure 7 below is discovered. In the process mining tool, we can further analyze the system behavior based on the actual usage of the instrumented system.

As soon as we understand the current behavior in detail, we can start working on the new system that supports online reservations for future customers without losing track of all the other scenarios from the current system that still need to be supported.

Figure 7: The discovered process map

This is a small and simple example, but imagine a large legacy system that has many different functionalities. Without process mining we would have to manually look at the source code to understand how the system works. For a large system, going through the entire source code can be a very time-consuming and daunting task.

Furthermore, looking at the source code does not give us any indication about how the system is actually used. So, we might end up transferring pieces of functionality to a replacement system that are no longer necessary, thereby making the new system more complicated than it needs to be.

Process mining is a great way to understand processes of any kind. Leveraging process mining to understand the inner workings of legacy systems is an application area where this insight is especially valuable.

Customer Journey Analysis with Daisy Wain from GOV.UK

The last Process Mining Café was all about customer journeys. In a customer journey analysis you look at the process from the viewpoint of the customer. We invited Senior Performance Analyst Daisy Wain from GOV.UK and also asked Rudi to join the session. You can now watch the recording here.

Customer Journey Mining at GOV.UK

At first, Daisy gave us a quick overview about her recent analysis of the user journey on the “Start a Business” part of the GOV.UK website.

A regular Google Analytics analysis only lets you look at two steps in a user journey (which page visitors came from to arrive at the current page). In contrast, process mining allowed Daisy’s team to get an end-to-end view of the full user journey.
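The difference can be made concrete with a tiny sketch (the visitor IDs and page names are invented): instead of only counting transitions between pairs of pages, process mining groups all page views per visitor into one end-to-end journey.

```python
from collections import defaultdict

# Hypothetical click-stream: (visitor, page), already in time order.
clicks = [
    ("v1", "Start a Business"),
    ("v2", "Start a Business"),
    ("v1", "Register your company"),
    ("v1", "Pay fee"),
    ("v2", "Register your company"),
]

# Group the page views per visitor to obtain one journey per visitor.
journeys = defaultdict(list)
for visitor, page in clicks:
    journeys[visitor].append(page)

print(journeys["v1"])  # ['Start a Business', 'Register your company', 'Pay fee']
```

Each visitor's list is one case in process mining terms, and discovering a process map over these cases yields the end-to-end view.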

Challenge No. 1: What Do You See as the “Customer Journey”?

Then, Rudi took us through the three main challenges that you will encounter when you analyze customer journeys with process mining.

The first challenge is that you need to determine the scope and the level of detail of what you want to see as the customer journey process. This seems simple at first. However, there are typically multiple options and there is not one “correct” answer.

Challenge No. 2: Dealing with Complexity

The second challenge is that you need to deal with complexity in different dimensions. First of all, many customer journey processes run across multiple channels. This means that data from different sources (e.g., click-streams, CRM system, etc.) needs to be combined to get the full view.
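As a minimal sketch of combining such sources (the channel names, activities, and field layout are hypothetical): events from the different channels can be merged into one log by concatenating them and sorting per case by timestamp.

```python
# Hypothetical events from two channels, each as (case, activity, timestamp).
web_events = [
    ("c1", "Visit FAQ", "2021-02-01T10:00"),
    ("c1", "Submit form", "2021-02-01T10:05"),
]
crm_events = [
    ("c1", "Agent callback", "2021-02-02T09:30"),
]

# ISO-8601 timestamps sort correctly as plain strings, so sorting by
# (case, timestamp) interleaves the channels into one combined log.
combined = sorted(web_events + crm_events, key=lambda e: (e[0], e[2]))
print([activity for _, activity, _ in combined])
# ['Visit FAQ', 'Submit form', 'Agent callback']
```

The practical work, of course, lies in finding a shared case ID (such as a customer number) that links the records across systems before they can be merged like this.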

Another dimension is that the volume of data grows quickly, especially when you analyze data from high-traffic websites like GOV.UK. You need to select the data for your analysis in a smart way.

And finally, once you import your data into the process mining tool, you can expect a very complex process map as well. Rudi showed us in a live demo how this complexity can be reduced in Disco.

Challenge No. 3: Taking Action

The last challenge is that taking action to actually improve the process can be a little bit more complicated than in a regular improvement project.

Traditional processes typically have a process owner who is responsible for changing the process. For many companies, the responsibilities around customer journeys are not that clear. The topic is still quite new, and the responsibility is often not assigned to one person. At the same time, the place where the change should be made only becomes clear after the analysis, and it can affect different departments.

Finally, the customer is central to the customer journey process but cannot directly be controlled. User research needs to be carried out in addition to the actual process mining analysis for deeper root cause analyses.

Here are the links that we mentioned during the session:

A big thanks again to Daisy and Rudi and to all of you for tuning in!

If you have questions or additional comments about process mining for customer journeys, send them our way via We are always happy to hear from you.