Today, we are excited to announce one additional speaker: Prof. Wil van der Aalst will be closing this year’s camp program with a keynote on responsible data science!
Wil van der Aalst — Eindhoven University of Technology, The Netherlands
Events (often hidden in Big Data) are often described as “the new oil”. Techniques like process mining aim to transform these events into new forms of “energy”: Insights, diagnostics, models, predictions, and automated decisions. However, the process of transforming “new oil” (event data) into “new energy” (analytics) can negatively impact citizens, patients, customers, and employees.
Systematic discrimination based on data, invasion of privacy, non-transparent life-changing decisions, and inaccurate conclusions illustrate that data science techniques may lead to new forms of “pollution”. We use the term “Green Data Science” for technological solutions that enable individuals, organizations, and society to reap the benefits from the widespread availability of data while ensuring fairness, confidentiality, accuracy, and transparency.
The sixth speaker at Process Mining Camp 2015 was Edmar Kok, who worked for a project team at DUO, the study financing arm of the Dutch Ministry of Education. The team was responsible for setting up a new event-driven process environment. Unlike typical workflow or BPM systems, event-driven architectures are set up as loosely-coupled process steps. Each step can be either a human task or an automated step. All tasks are then combined in a flexible way. The new system was introduced with the goal to improve the speed of DUO’s student finance request handling processes and to save 25% of the costs.
At camp, Edmar walked us through the specific challenges that emerged from analyzing log data from that event-driven environment and the kind of choices that they had to make. He also discussed the key metrics DUO wanted to monitor from a business side.
Do you want to learn how process mining can be used to very quickly uncover technical errors and KPIs in the pilot phase of a new system? Watch Edmar’s talk now!
To get us all into the proper camp spirit, we have started to release the videos from last year’s camp. If you have missed them before, check out the videos of Léonard Studer from the City of Lausanne, Willy van de Schoot from Atos International, Joris Keizers from Veco, and Mieke Jans from Hasselt University.
The fifth speaker at Process Mining Camp 2015 was Bart van Acker from Radboudumc. There has been a lot of discussion about the challenges that our healthcare systems are facing, because of the aging population and increasing costs. Process improvement (while maintaining or improving quality of care) is therefore very important to keep pace with these developments.
At camp, Bart shared the challenges that he faces in process improvement projects at the hospital. He showed us how process mining can help to bridge the gap between process improvement professionals and the medical staff based on the example of the Intensive care unit and the Head and Neck Care chain at Radboudumc.
Do you want to know which benefits process mining brings to the improvement of healthcare processes? Watch Bart’s talk now!
To get us all into the proper camp spirit, we have started to release the videos from last year’s camp. If you have missed them before, check out the videos of Léonard Studer from the City of Lausanne, Willy van de Schoot from Atos International, and Joris Keizers from Veco.
The fourth speaker at Process Mining Camp 2015 was Mieke Jans from Hasselt University. Mieke worked for Deloitte Belgium for many years before she became an Assistant Professor at Hasselt University. As a process mining consultant, she learned first-hand about the challenges of extracting good process mining data out of ERP systems like SAP and Oracle. Getting some data out of an ERP system is relatively easy. But how do you make sure you extract the right data in the right way?
In her research, Mieke still works on the topic of extracting process mining data out of relational databases. At camp, she walked us through a step-by-step approach for creating a good event log. The starting point is to define the questions that you want to answer using process mining, because they have a direct impact on the way that the data needs to be extracted. There are several decisions that need to be made, and every decision has implications on the way that the data can be analyzed and should be interpreted.
Do you want to know which steps to follow when extracting process mining data out of your ERP or legacy system? Watch Mieke’s talk now!
To get us all into the proper camp spirit, we have started to release the videos from last year’s camp. If you have missed them before, check out the videos of Léonard Studer from the City of Lausanne and Willy van de Schoot from Atos International.
The third speaker at Process Mining Camp 2015 was Joris Keizers from Veco. Joris presented their experience with process mining in a production process environment. With more than 15 years of experience in supply chain management, Joris is the operations manager and Six Sigma expert at Veco. He has used Minitab to statistically analyze the processes and drive improvements. When he discovered process mining, he found that process mining can leverage the human process knowledge in a powerful way that classical Six Sigma analyses can’t.
At camp, Joris showed us a side-by-side comparison based on a concrete example of a Six Sigma and a Process Mining analysis and explained the differences, benefits, and synergies.
Do you want to know how your Six Sigma projects can be enhanced by process mining? Watch Joris’ talk now!
The second speaker at Process Mining Camp 2015 was Willy van de Schoot from Atos International. Willy did not present a particular case study but focused on practical challenges like how to stay on top of your different analysis views. And once you need to present your results to an audience unfamiliar with process mining, how do you communicate your findings to keep everyone on board? In a hands-on segment, she showed the different perspectives she has taken, as well as some tricks of how to prepare the data in such a way that it provides optimal flexibility.
The heart and soul of Process Mining Camp are our practice talks, where process mining professionals share their knowledge and experiences with you, no holds barred. Every talk is followed by ten minutes of discussion with the audience, so that you can get the answers you need.
Today, we are excited to announce the practice talk speakers at this year’s Process Mining Camp. Look forward to a fantastic program packed with interesting use cases and hands-on advice! You will go home with lots of new insights and ideas for your own work.
Carmen Lasa Gómez — Telefónica, Spain
Telefónica is a large telephone operator and mobile network provider. The R&D team in the Digital Operations group in Madrid analyzes and improves processes across global digital services provided by the company.
Carmen is a data analyst in the Digital Operations team. At camp, she will tell us how process mining reshaped the digital platform operations at Telefónica. She identified sources of delays, inefficient communication patterns, and bad practices such as work orders performed out of the scheduled window. As a result, improvements could be made with measurable effects on both the operation costs and the quality of the services.
Marc Gittler & Patrick Greifzu — Deutsche Post DHL Group, Germany
Deutsche Post DHL Group is the world’s leading logistics and mail communications company. The mail division delivers approximately 70 million letters in Germany, six days a week, and provides services across the entire mail value chain.
Marc and Patrick are a Senior Audit Manager and Audit Manager in the Corporate Internal Audit team. They have integrated process mining into DHL’s audit process to improve both the time spent for the analysis and the depth of the information audited. They found that process mining helps to reduce the audit time by 25% in comparison to classical data analytics. In addition, they are now able to identify unknown risks in processes, which helps to add more value to the audits.
Jan Vermeulen — Dimension Data, South Africa
Dimension Data has over 30,000 employees in nine operating regions spread over all continents. They provide services from infrastructure sales to IT outsourcing for multinationals.
Historically, each region was responsible for running their own operations with very little enforced standards from a group perspective. The changing business landscape made it necessary for Dimension Data to standardize all their processes across all continents. But how exactly do you do that? As the Global Process Owner, Jan is responsible for the standardization of these processes. At camp, he will share their journey of establishing process mining as a methodology to analyze and improve operations, assist in winning new contracts, and assess compliance.
Lucy Brand-Wesselink — ALFAM Consumer Credit, The Netherlands
ALFAM is a subsidiary of ABN AMRO specializing in consumer credits. In the sales process, customer applications need to be assessed effectively and efficiently. For example, it is not worth putting a lot of time into an application when it is clear early on that the application cannot be granted.
Lucy is a process manager in the Business Operating Office. She has used process mining to analyze ALFAM’s processes from many different angles. She has analyzed variation, re-processing, waiting times, and service levels. By visualizing the processes and the process problems, improvement opportunities could be crystallized in a powerful way. At camp, Lucy will share the results and the concrete steps that she has taken to get there.
Giancarlo Lepore — Zimmer Biomet, Switzerland
Zimmer Biomet designs and creates personalized joint replacements. Their orthopedic products, like knee replacements or hip replacements, are manufactured in more than 25 countries around the world.
Giancarlo is a senior business analyst in the Operational Intelligence team in Winterthur, Switzerland. He analyzes the production processes for process managers to help them improve their operations. At camp, Giancarlo will share the results from several process mining use cases. He will also compare the traditional method of manual value stream mapping with a process mining-based analysis of the manufacturing flow.
Paul Kooij — Zig Websoftware, The Netherlands
Zig Websoftware creates web applications for housing associations in the Netherlands. Their workflow solution is used by the housing associations to, for instance, manage the process of finding and on-boarding a new tenant once the old tenant has moved out of an apartment.
Paul is one of the 60 specialists working at Zig Websoftware. At camp, he will tell us how process mining has helped their customer WoonFriesland improve the housing allocation process. Every day that a rental property is vacant costs the housing association money. After Paul’s process mining analysis, the total vacancy time could be reduced by 4,000 days within just the first six months.
Abs Amiri — SPARQ Solutions, Australia
SPARQ Solutions provides Information and Communications Technology services to the government-owned electricity suppliers Energex and Ergon Energy in Queensland, Australia. Due to government pressure, there has been an increased need to cut costs and become more efficient.
Abs is a senior analyst programmer and data science lead in the organisation. In this role he develops new and innovative ways to help Energex and Ergon Energy improve their operations. He found process mining to be an incredibly powerful tool to quickly discover the actual problems and involve the relevant people in the root cause analysis. At camp, Abs will present how he analyzed the overall dispatching process as well as the maintenance process for a single machine. He will also share his insight about how to position your process mining initiative in the organization to get the buy-in you need.
Process Mining Camp is the only conference worldwide where practitioners come together to discuss their process mining experiences. Don’t miss this unique opportunity to meet up with your peers and learn new tricks, and get your ticket now!
Are you getting ready for this year’s Process Mining Camp? If you haven’t registered yet, make sure to secure your ticket for 10 June. The early bird tickets were gone within less than five days, so be quick!
To get us all into the proper camp spirit, we will be releasing the videos from last year’s camp over the next weeks. The first speaker at Process Mining Camp 2015 was Léonard Studer from the City of Lausanne. As a process analyst, Léonard helps people at the municipality to better organize their work.
At camp, Léonard told us about a project in which they analyzed a complex construction permit process. Construction permit processes are notoriously complicated, because there are so many parties and rules involved. For example, the City of Lausanne is regulated by 27 different laws from Swiss federal law, cantonal law, and communal regulation.
In his presentation, Léonard not only told us all about the project and what came out of it, but also did a deep dive into the overall approach, the enormous data challenges they were facing, and the tools that he used to resolve them. He gave an honest talk with lots of practical details. In his introduction, he put it best:
Process mining itself is not a problem anymore. When I do process mining live in front of people they believe that they can do it themselves. What is difficult now is to get all these little things around your process mining project arranged correctly. This is a talk I will give you without any shame, I will not be blowing any smoke, I will not be bragging. I just want to tell you what I really did. I will also give you some tricks around process mining that may be useful to you.
In the previous article on wrong timestamp configurations we have seen how timestamp problems can influence the process flows and the process variants. One reason why timestamps can cause problems is that they are not sufficiently different. For example, if you only have a date (and no time) then it may easily happen that two activities within the same case happen on the same day. As a result, you don’t know in which order they actually happened!
Take a look at the following example: We can see a simple document signing process with four activities and three cases.
The order of the rows in each case is arbitrary. When importing this data set, the sequence of events is determined based on the timestamps. For example, the sequence of the steps ‘Created’ and ‘Sent to Customer’ for case 1 is reversed (compared to the original file), because the dates reflect that the two steps have happened in the opposite order (see screenshot below).
However, if two activities happen at the same time (on the same day in this example), then Disco does not know in which order they actually occurred. So, it keeps the order in which they appear in the original file. Because the order of the activities in the example file is random, this creates some additional variation in the process map (and in the variants) that should not be there.
For example, the three cases in the above example come from a purely sequential process. However, because sometimes multiple steps happen on the same day, and the order between them is arbitrary, you can see some additional interleavings in the process map. They reflect the different orderings of the same timestamp activities in the file (see screenshot below).
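To get a feeling for how much of your data is affected, you can check for events that share a timestamp within the same case before importing the file. The following is a minimal sketch in Python. The event list is a made-up illustration (not the data from the screenshots), and the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical event log rows: (case id, activity, date).
# This is illustrative data, not the data set shown in the screenshots.
events = [
    ("1", "Sent to Customer", "2016-02-03"),
    ("1", "Created", "2016-02-01"),
    ("1", "Response Received", "2016-02-03"),
    ("2", "Created", "2016-02-01"),
    ("2", "Sent to Customer", "2016-02-02"),
]


def same_timestamp_groups(events):
    """Return {(case, timestamp): [activities]} for timestamps that are
    shared by two or more events within the same case."""
    groups = defaultdict(list)
    for case, activity, timestamp in events:
        groups[(case, timestamp)].append(activity)
    return {key: acts for key, acts in groups.items() if len(acts) > 1}


ties = same_timestamp_groups(events)
for (case, timestamp), activities in ties.items():
    print(f"Case {case}, {timestamp}: order unknown for {activities}")
```

Every entry in the result is a place where the order of the activities cannot be derived from the timestamps alone.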
So, if your timestamps are not fine-grained enough to determine the order of all activities, or if many steps in your process occur at exactly the same time, the process map often shows more complexity than is really there. What can you do to distinguish the real process complexity from the one just caused by the same timestamp problem?
How to fix: You can either leave out events that have the same timestamps by choosing a “representative” event (see strategy 1 below), or you can try pre-sorting the data (see strategies 2-4 below) to reduce the variation that is caused by the same timestamp activities.
Strategy 1: “Representative” (Leaving out events)
The reason for ‘Same Timestamp’ activities is not always an insufficient level of granularity in the timestamp pattern. Sometimes, it is simply a fact that many events are logged at the same time.
Imagine, for example, a workflow system in a municipality, where the service employee types in the new address of a citizen who moved to a new apartment. After the street, street number, postal code, city, etc., fields in the screen have been filled, they press ‘Next’ to finalize the registration change and print the receipt.
In the history log of the workflow system, you will most likely see individual records of the changes to each of these fields (for example, a record of the ‘Old value’ and the ‘New value’ of the ‘Street’ attribute). However, all of them may have the same timestamp, which is the time when the employee pressed the ‘Next’ button and the data field changes were all finalized (at once).
Below, you can see another example of a highly automated process. Many steps happen at the same time.
However, you may not need all of these detailed events and can choose one of them to represent the whole subsequence. For example, in the case below the first of the four highlighted events could stand for the sequence of four. You can deselect the other steps via the ‘Keep selected’ option in the Attribute filter.
In general, focusing on just a few of the most relevant milestone activities is one of the most effective methods to trim down the data set to more meaningful variants if you have too many. See also Strategy No. 9 in this article about How to Manage Complexity in Process Mining.
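If you prefer to reduce the data before importing it, the same idea can be sketched in a few lines of Python. The history log below is invented to mirror the address change example, and the choice of milestone activities is an assumption that you would make together with a domain expert.

```python
# Hypothetical history log of one address change: the field changes all
# carry the timestamp of the moment the employee pressed 'Next'.
events = [
    {"case": "42", "activity": "Change Street", "timestamp": "2016-03-01 10:15:00"},
    {"case": "42", "activity": "Change Postal Code", "timestamp": "2016-03-01 10:15:00"},
    {"case": "42", "activity": "Change City", "timestamp": "2016-03-01 10:15:00"},
    {"case": "42", "activity": "Print Receipt", "timestamp": "2016-03-01 10:16:30"},
]

# Assumption: 'Change Street' stands in for the whole block of field changes.
MILESTONES = {"Change Street", "Print Receipt"}


def keep_milestones(events, milestones):
    """Keep only the events whose activity is one of the chosen milestones."""
    return [event for event in events if event["activity"] in milestones]


reduced = keep_milestones(events, MILESTONES)
print([event["activity"] for event in reduced])
```

The reduced log contains one representative event for the block of simultaneous field changes, so there is no longer any order to guess.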
Strategy 2: Sorting based on sequence number
Sometimes you actually have information about the order in which the activities occurred, in the form of some kind of sequence number attribute. This is great, because you can now sort your data set based on the sequence number (see below) and avoid the whole Same Timestamp Activities problem altogether.
Because Disco uses the order of the activities in your original file for the events that have the same timestamp, this pre-sorting step will influence the order in which the variants and the process flows are formed and, therefore, fix the random order of the Same Timestamp Activities.
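As a rough sketch of why this works, the Python snippet below pre-sorts a small, made-up data set by a hypothetical 'seq' column. The function order_as_imported is only a simplified model of the behavior described above (order by timestamp, keep the file order for ties), not Disco's actual implementation.

```python
# Hypothetical rows with a sequence number attribute ('seq').
rows = [
    {"case": "1", "activity": "Sent to Customer", "date": "2016-02-01", "seq": 2},
    {"case": "1", "activity": "Created", "date": "2016-02-01", "seq": 1},
    {"case": "1", "activity": "Response Received", "date": "2016-02-03", "seq": 3},
]


def presort_by_sequence(rows):
    """Sort the file by case and sequence number before importing it."""
    return sorted(rows, key=lambda row: (row["case"], row["seq"]))


def order_as_imported(rows):
    """Simplified model of the import: order events by timestamp and keep
    the file order for equal timestamps (Python's sort is stable)."""
    return sorted(rows, key=lambda row: (row["case"], row["date"]))


before = [row["activity"] for row in order_as_imported(rows)]
after = [row["activity"] for row in order_as_imported(presort_by_sequence(rows))]
print(before)
print(after)
```

Without pre-sorting, the two same-day events keep their arbitrary file order. With pre-sorting, the sequence number decides the order.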
Strategy 3: Sorting based on activity name
Of course you don’t always have a sequence number that you can rely on for sorting the data. So what else can you do?
Another approach that often helps is to pre-sort the data simply based on the activity name. The idea is that at least the activities that have the same timestamp (and are sometimes in this and sometimes in that order) are now always in the same order, even if the order itself does not make much sense.
This is easy to do: Simply sort the data based on your activity column before importing it. However, sometimes this strategy can also backfire, because you may – accidentally – introduce wrong orders in same timestamp activities that by coincidence were fine before.
For example, consider the outcome of sorting the data based on activity name for the document signing process above:
It has helped to reduce the variation in the beginning of the process, but at the same time it has introduced a reverse order for the activities ‘Document Signed’ and ‘Response Received’ for case 3 (which have the same timestamps but were in the right order by coincidence in the original file).
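Both effects can be reproduced with a small Python sketch. The two cases below are invented for illustration (they are not the rows from the screenshots), and trace is a simplified model of the import that orders events by date and keeps the file order for ties.

```python
# Hypothetical rows: (case id, activity, date), in arbitrary file order.
rows = [
    ("A", "Sent to Customer", "2016-02-01"),   # same day as 'Created'
    ("A", "Created", "2016-02-01"),
    ("A", "Response Received", "2016-02-03"),
    ("A", "Document Signed", "2016-02-04"),
    ("B", "Created", "2016-02-01"),
    ("B", "Sent to Customer", "2016-02-02"),
    ("B", "Response Received", "2016-02-05"),  # same day as 'Document Signed',
    ("B", "Document Signed", "2016-02-05"),    # correct order by coincidence
]


def trace(rows, case):
    """Activity sequence of one case: ordered by date, with the file order
    kept for equal dates (Python's sort is stable)."""
    events = [row for row in rows if row[0] == case]
    return [activity for _, activity, _ in sorted(events, key=lambda row: row[2])]


# Pre-sort the whole file alphabetically by activity name.
by_name = sorted(rows, key=lambda row: row[1])

print(trace(rows, "A"), "->", trace(by_name, "A"))  # order repaired
print(trace(rows, "B"), "->", trace(by_name, "B"))  # order broken
```

Case A is repaired because 'Created' happens to sort before 'Sent to Customer'. Case B is broken because 'Document Signed' sorts before 'Response Received'.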
Strategy 4: Sorting based on ideal sequence
To influence the order of the Same Timestamp Activities in the “right” way, you can analyze those process sequences in your data that are formed by actual differences in the timestamp. You can also talk to a domain expert to help you understand what the ideal sequence of the process would be.
For example, if you look at case 2 in the document signing process, then you can see that the sequence is fully determined by different timestamps (see screenshot below).
We are now going to use this ideal sequence to influence the sorting of the original data. One simple way to do it is to prefix the activity names with a sequence number reflecting their place in the ideal sequence (i.e., ‘1 – Created’, ‘2 – Sent to Customer’, ‘3 – Response Received’, and ‘4 – Document Signed’) by using Find and Replace.
After adding the sequence numbers, you can simply sort the original data by the activity column (see below).
This will bring the activities in the ideal sequence. When you now import the data in Disco, you should only see deviations from the ideal sequence if the timestamps actually reflect that.
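If you would rather script this step than use Find and Replace, a minimal Python sketch could look as follows. The rows are made up for illustration, the names IDEAL_SEQUENCE, RANK, and prefix_and_sort are hypothetical, and trace is again only a simplified model of the import (order by date, keep the file order for ties).

```python
# Ideal sequence, taken from a case whose order is fully determined by
# different timestamps (or from a domain expert).
IDEAL_SEQUENCE = ["Created", "Sent to Customer", "Response Received", "Document Signed"]
RANK = {activity: position for position, activity in enumerate(IDEAL_SEQUENCE, start=1)}

# Hypothetical rows: (case id, activity, date), in arbitrary file order.
rows = [
    ("A", "Sent to Customer", "2016-02-01"),
    ("A", "Created", "2016-02-01"),
    ("B", "Created", "2016-02-01"),
    ("B", "Sent to Customer", "2016-02-02"),
    ("B", "Document Signed", "2016-02-05"),
    ("B", "Response Received", "2016-02-05"),
]


def prefix_and_sort(rows):
    """Prefix each activity with its rank in the ideal sequence, then sort
    the file by the prefixed activity name."""
    prefixed = [(case, f"{RANK[activity]} - {activity}", date)
                for case, activity, date in rows]
    return sorted(prefixed, key=lambda row: row[1])


def trace(rows, case):
    """Simplified import model: order by date, keep file order for ties."""
    events = [row for row in rows if row[0] == case]
    return [activity for _, activity, _ in sorted(events, key=lambda row: row[2])]


presorted = prefix_and_sort(rows)
print(trace(presorted, "A"))
print(trace(presorted, "B"))
```

One caveat of sorting on the prefixed text: with ten or more activities, ‘10’ sorts before ‘2’ as a string, so you would need zero-padded prefixes such as ‘01’ and ‘02’.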
In less than two months we all come together for this year’s Process Mining Camp in Eindhoven. Right now we are busy working with a number of speakers to bring you the most interesting and inspiring talks at this year’s camp.
Like our camp audience, our speakers come to camp from all over the world. They come from the biggest companies out there, and from smaller organizations, and they apply process mining in a wide spectrum of use cases and roles. What they have in common is, they have a great story to tell. This year’s program is almost finalized, and we can’t wait to share it with you very soon.
In the article on Zero timestamps we have seen how timestamp problems can lead to faulty case durations. But faulty timestamps do not only influence the case durations. They also impact the variants and the process maps themselves, because the order of the activities is derived based on the timestamps.
For example, take a look at the following data set with just one faulty timestamp. There is one case with a 1970 timestamp (see screenshot below – click on the image to see a larger version). As a result, the ‘Create case’ activity is positioned before the ‘Import forms’ activity.
If we look at the process map, then we see that in all other 456 cases the process flows the other way. Clearly, the reverse sequence is caused by the 1970 timestamp.
And if we look at the average waiting times in the process map, then this one faulty timestamp creates further problems and shows a huge delay of 43 years.
As you can see, data quality problems due to timestamp issues can distort your process mining analysis in many different places. Therefore, it is important to carefully assess the process map and the variants, if possible together with a domain expert, to spot any suspicious orderings of activities.
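A simple plausibility check on the raw data can help to spot such outliers before they show up in the process map. The sketch below is in Python. The events, including the 2013 dates, are invented for illustration, and the cut-off date is an assumption that you need to adapt to your own data.

```python
from datetime import datetime

# Hypothetical events: (case id, activity, timestamp).
events = [
    ("17", "Import forms", datetime(2013, 5, 2, 9, 0)),
    ("17", "Create case", datetime(1970, 1, 1, 0, 0)),   # faulty timestamp
    ("18", "Import forms", datetime(2013, 5, 2, 9, 30)),
    ("18", "Create case", datetime(2013, 5, 2, 9, 45)),
]


def suspicious_events(events, earliest_plausible):
    """Return all events that lie before the earliest plausible date."""
    return [event for event in events if event[2] < earliest_plausible]


# Assumption: nothing in this data set can have happened before 2000.
flagged = suspicious_events(events, datetime(2000, 1, 1))
print(flagged)

# The faulty timestamp also explains the huge waiting time in the map.
gap_years = (events[0][2] - events[1][2]).days / 365.25
print(f"Apparent delay in case 17: {gap_years:.1f} years")
```

A check like this only finds timestamps that are obviously out of range, so it complements the review of the process map with a domain expert rather than replacing it.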
If you have found a problem with the timestamps, then there can be different reasons why this is happening. Zero timestamps are just one possible reason. Here is the next one: Wrong timestamp configuration during import.
Wrong Timestamp Pattern Configuration
When you import a CSV or Excel file into Disco, the timestamp pattern is normally detected automatically. You don’t have to do anything. If it is not automatically detected, Disco lets you specify how the timestamp pattern should be interpreted rather than forcing you to convert your source data into a fixed timestamp format. And you can even work with different timestamp patterns in your data set.
However, if you have found that activities show up in the wrong order, or if you find that your process map looks weird and does not really show the expected process, then it is worth verifying that the timestamps are correctly configured during import.
You can do that by going back to the import screen: Either click on the ‘Reload’ button from the project view or import your data again. Then, select the timestamp column and press the ‘Pattern…’ button in the top-right corner. You will see a few original timestamps as they are in your file (on the left side) and a preview of how Disco interprets them (in green, on the right side).
Check in the green column whether the timestamps are interpreted correctly. Pay attention to the lower and upper case of the letters in the pattern, because it makes a difference. For example, the lower case ‘m’ stands for minutes while the upper case ‘M’ stands for months.
How to fix: If you find that the preview does not pick up the timestamps correctly, configure the correct pattern for your timestamp column in the import screen. You can empty the ‘Pattern’ field and start typing the pattern that matches the timestamps in your data set (use the legend on the right, and for more advanced patterns see the Java date pattern reference for the precise notation and further examples). The green preview will be updated while you type, so that you can check whether the timestamps are now interpreted correctly. Then, press the ‘Use Pattern’ button.
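To see how much a wrong pattern can change the data, here is a small illustration in Python. Note that Python's strptime uses its own notation, which differs from the Java-style patterns used in the import screen: in Python, %m is the month and %M is the minute, while in the Java notation ‘M’ is the month and ‘m’ is the minute. The timestamp string is a made-up example.

```python
from datetime import datetime

# Hypothetical raw timestamp as it might appear in a CSV file.
raw = "03/04/2015 10:05"

# Day first (roughly the Java-style pattern dd/MM/yyyy HH:mm).
day_first = datetime.strptime(raw, "%d/%m/%Y %H:%M")

# Month first (roughly the Java-style pattern MM/dd/yyyy HH:mm).
month_first = datetime.strptime(raw, "%m/%d/%Y %H:%M")

print(day_first)    # 3 April 2015
print(month_first)  # 4 March 2015
print(abs((day_first - month_first).days), "days apart")
```

Both patterns parse without an error, which is why a wrong configuration is easy to miss unless you check the preview against the original values.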
Wrong Timestamp Column Configuration
Another timestamp problem that can result from mistakes during the import step is that you may have accidentally configured a column as a timestamp that is not actually a timestamp in the process mining sense (but, for example, indicates the birthday of the customer).
In the customer service refund example below, the purchase date in the data has the form of a timestamp. However, this is a date that does not change over time and should actually be treated as an attribute. You can see that both the ‘Complete Timestamp’ as well as the ‘Purchase Date’ column have the little clock symbol in the header, which indicates that currently both are configured as a timestamp.
If columns are wrongly configured as a timestamp, Disco will use them to calculate the duration of the activity. As a consequence, activities can show up in parallel although they are in reality not happening at the same time.
How to fix: Make sure that only the right columns are configured as a timestamp: For each column, the current configuration is shown in the header. Look through all your columns and make sure only your actual timestamp columns are showing the little clock symbol that indicates the timestamp configuration. Then, press the ‘Start import’ button again.
For example, in the customer service data set, we would change the configuration of the ‘Purchase Date’ column to a normal attribute as shown below.
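If you have many date columns and are not sure which ones are real event timestamps, one rough heuristic is to check whether a column ever changes within a case. The Python sketch below uses invented rows modeled on the customer service example, and the function name is hypothetical. The heuristic is an assumption, not a rule: a column that is constant within every case is likely an attribute, but cases with only one event will always look constant.

```python
from collections import defaultdict

# Hypothetical rows modeled on the customer service refund example.
rows = [
    {"case": "1", "Complete Timestamp": "2016-03-01 10:00", "Purchase Date": "2016-01-10"},
    {"case": "1", "Complete Timestamp": "2016-03-01 10:20", "Purchase Date": "2016-01-10"},
    {"case": "2", "Complete Timestamp": "2016-03-02 09:00", "Purchase Date": "2016-02-14"},
    {"case": "2", "Complete Timestamp": "2016-03-04 16:30", "Purchase Date": "2016-02-14"},
]


def constant_per_case(rows, column):
    """Return True if the column has exactly one distinct value per case."""
    values = defaultdict(set)
    for row in rows:
        values[row["case"]].add(row[column])
    return all(len(distinct) == 1 for distinct in values.values())


for column in ("Complete Timestamp", "Purchase Date"):
    verdict = "probably an attribute" if constant_per_case(rows, column) else "changes per event"
    print(f"{column}: {verdict}")
```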