This is Flux Capacitor, the company weblog of Fluxicon.
You can find more articles here.

You should follow us on Twitter here.

Data Quality Problems in Process Mining and What To Do About Them — Part 11: Data Validation Session with Domain Expert

Expert interview

This is the eleventh article in our series on data quality problems for process mining. You can find an overview of all articles in the series here.

A common and unfortunate process mining scenario goes like this: You present a process problem that you have found in your process mining analysis to a group of process managers. They look at your process map and point out that this can’t be true. You dig into the data and find out that, actually, a data quality problem was the cause for the process pattern that you discovered.

The problem with this scenario is that, even if you then go and fix the data quality problem, the trust that you have lost on the business side can often not be won back. They won’t trust your future results either, because “the data is all wrong”. That’s a pity, because there could have been great opportunities in analyzing and improving this process!

To avoid this, we recommend to plan a dedicated data validation session with a process or domain expert before you start the actual analysis phase in your project. To manage expectations, communicate that the purpose of the session is explicitly not yet to analyze the process, but to ensure that the data quality is good before you proceed with the analysis itself.

You can ask both a domain expert and a data expert to participate in the session, but especially the input of the domain expert is needed here, because you want to spot problems in the data from the perspective of the process owner for whom you are performing the analysis (you can book a separate meeting with a data expert to walk through your data questions later). Ideally, your domain expert has access to the operational system during the session, so that you can look up individual cases together if needed.

To organize the data validation session with the domain expert, you can do the following:

You may find that the domain expert brings up questions about the process that are relevant for the analysis itself. This is great and you should write them down, but do not get side-tracked by the analysis and steer the session back to your data quality questions to make sure you achieve the goal of this meeting: To validate the data quality and uncover any issues with the data that might need to be cleaned up.

After the validation session, follow-up on all of the discovered data problems and investigate them. Also, keep track which of your original process questions may be affected by the data quality issues that you found. Document the actions that you have taken, or intend to take, to fix them.


Leave a reply