Data Suitability Checklist for Process Mining


Once you start looking for process mining data within your organization, you will be faced with data sets for which you need to determine whether they are suitable for process mining or not.

Perhaps you have found an existing report and want to see if that data extract is usable for your process mining project. Or you have requested a data set from your IT department and now need to judge whether it fulfills the requirements for a process mining analysis.

What exactly do you need to look for? Here is a checklist with the questions that you can go through to assess the suitability of your data. You can also download this PDF version to print it out and check off each point.

Checklist Data Suitability

  1. Structured data? Do you have data with columns and rows?

  2. Case ID, Activity, and Timestamp columns available? Do you have at least one column that can be your case ID, your activity name, and your timestamp? See when a timestamp is not needed here.

  3. Same case ID in multiple rows? Does the same case ID show up in more than one row at least sometimes? If each row has a unique case ID, your data is either not usable or you may need to reformat it.

  4. Different activities in the same case? Does the activity name change at least sometimes within the same case? If the activity field does not change over time, it does not contain the history and you need to look for another activity column.

  5. Different timestamps in the same case? Does the timestamp change at least sometimes within the same case? If the timestamp field does not change over time, it does not contain the history and cannot be used as your timestamp column. You can import your data without timestamps if it is already sorted.

  6. Date and time in one column? Are the date and the time portion of your timestamp placed in the same column? Because you can have multiple timestamps, each timestamp needs to be in one column.

  7. Data in one file? If your data was distributed across multiple files (for example, because it comes from different IT systems), have you combined it into one file?

  8. Different timestamp patterns in separate columns? If you have timestamps with different timestamp patterns, are they placed in different columns?

  9. Activity names human-readable? Are your activity names understandable (not just a numeric value like an action code, or a transaction number)?

  10. Activity names generalized enough? Does the same activity in another case have the same activity label (not just a free-text field that is filled differently every time)?

Can you answer ‘Yes’ to all of the points above? Then you can import your data into Disco and continue by checking the quality of your data before starting the actual process mining analysis.
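Points 2 to 5 of the checklist can also be checked automatically before you import anything. The following is a minimal sketch in Python, assuming your data is a CSV file with a header row. The function name check_suitability, the column names (case_id, activity, timestamp), and the sample rows are made up for illustration and are not part of Disco or of any particular data set.

```python
import csv
import io
from collections import defaultdict


def check_suitability(rows, case_col, activity_col, timestamp_col):
    """Return yes/no answers for checklist points 2 to 5 (hypothetical helper)."""
    rows = list(rows)
    # Point 2: are the three required columns present in the data?
    has_columns = bool(rows) and all(
        col in rows[0] for col in (case_col, activity_col, timestamp_col)
    )
    if not has_columns:
        return {"required_columns": False}

    row_counts = defaultdict(int)
    activities = defaultdict(set)
    timestamps = defaultdict(set)
    for row in rows:
        case = row[case_col]
        row_counts[case] += 1
        activities[case].add(row[activity_col])
        timestamps[case].add(row[timestamp_col])

    return {
        "required_columns": True,
        # Point 3: at least one case ID appears in more than one row.
        "case_id_repeats": any(n > 1 for n in row_counts.values()),
        # Point 4: the activity changes within at least one case.
        "activity_changes": any(len(a) > 1 for a in activities.values()),
        # Point 5: the timestamp changes within at least one case.
        "timestamp_changes": any(len(t) > 1 for t in timestamps.values()),
    }


# Made-up sample data; replace it with your own extract.
SAMPLE = """case_id,activity,timestamp
1,Create order,2014-06-18 09:00:00
1,Approve order,2014-06-18 11:30:00
2,Create order,2014-06-19 08:15:00
"""

result = check_suitability(
    csv.DictReader(io.StringIO(SAMPLE)), "case_id", "activity", "timestamp"
)
print(result)
```

A False value points you to the checklist item that needs attention. The remaining points (for example, whether activity names are human-readable) still need a manual look at the data.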

Time Capsule — Process Mining Perspectives Panel at Process Mining Camp 2014

To celebrate our brand-new home for camp talks we are releasing the talks from Process Mining Camp 2013 and 2014 for the first time. Grab a snack, sit back, and enjoy the journey through time, back to the early stages of our process mining community from six years ago!

Panel: Process Mining Perspectives

The closing panel discussion brought together different process mining perspectives. Wil van der Aalst (representing academia) was joined by Frank van Geffen (representing industry), Nicholas Hartman (representing consultancies), Christian Günther from Fluxicon, and two industry analysts, Marc Kerremans from Gartner and Neil Ward-Dutton from MWD Advisors in the UK.

The discussion was very lively and followed up on a number of themes that were brought up earlier throughout the day.

One of the themes was adoption. While Marc Kerremans said that process mining was quickly climbing the Gartner hype cycle, Neil Ward-Dutton disagreed and insisted that we are much earlier in the adoption curve than that.

This was also discussed in the context of maturity: 90% of business people still use pen, paper, PowerPoint, and Visio for process improvement. Process mining can provide huge benefits here, even without combining it with other tools like data mining.

Wil van der Aalst suggested that today’s business analysts need to become nerdier to use new technologies. Furthermore, Frank van Geffen made the point that organizations need to create space for innovation. They need to allow people to experiment with new techniques like process mining so that they can eventually roll them out broadly and productively, as was done at the Rabobank.

Finally, the panel discussed that anyone using data-driven analysis techniques like process mining needs to ensure that data and analysis results are used responsibly and in accordance with the rules and ethics of the society and the company.

Watch the panel now!


Read this update and stay tuned for this year’s Process Mining Camp!

Time Capsule — Wil van der Aalst at Process Mining Camp 2014

Wil van der Aalst at TU Eindhoven: Towards a Process Scientist (Netherlands)

There is an increasing focus on data analysis in the business world, which is reflected in the rising interest in big data, and a high demand for data scientists. Recently, the TU Eindhoven started the Data Science Center Eindhoven (DSC/e) to advance research and education in this new field.

However, the focus should not just be on data, but also on processes and organizations: we need process scientists! This is where process mining can play a key role.

Watch Wil’s talk now!



Time Capsule — Frank van Geffen at Process Mining Camp 2014

Frank van Geffen: Adopting Process Mining at the Rabobank (Netherlands)

Frank van Geffen is a Process Innovator at the Rabobank. He realized that it took a lot of different disciplines and skills working together to achieve what they have achieved. It’s not only about knowing what process mining is and how to operate the process mining tool. Instead, a lot of emphasis needs to be placed on the management of stakeholders and on presenting insights in a meaningful way for them.

The results speak for themselves: In their IT service desk improvement project, they could already save 50,000 steps by reducing rework and preventing incidents from being raised. In another project, the turnaround time for business expense claims was reduced from 11 days to 1.2 days. They could also analyze their cross-channel mortgage customer journey process.

Watch Frank’s talk now!



Time Capsule — Erik Davelaar at Process Mining Camp 2014

Erik Davelaar at KPMG: The Benefits of Process Mining in Auditing (Netherlands)

Erik Davelaar is an IT auditor at KPMG. He sees one main benefit of process mining for auditors: a higher degree of assurance during the audit. Instead of examining just a sample of 25 cases, all cases of the year can be analyzed. After a number of pilots in the leasing sector and at a mortgage provider, he could convince his audit colleagues at their financial sector clients of the added value.

In one project, the processes were different for every country, and process mining helped to understand and audit these differences. In another project, the access rights were not enforced on the system level, but with process mining the segregation-of-duties violations could still be assessed objectively. And in a third project, deviations from the expected process were found.

Watch Erik’s talk now!

