This is Flux Capacitor, the company weblog of Fluxicon.

Process Mining in Healthcare – Case Study No. 2

Previously, I wrote about the challenges of applying process mining in the healthcare domain, and we looked at a case study where process mining was applied in a Dutch hospital. Here is another great example.

The hospital of São Sebastião in Santa Maria da Feira, Portugal, has 300 beds and an in-house IT system used across different departments. The researchers Álvaro Rebuge and Diogo Ferreira from our academic partner university IST – TU Lisbon applied process mining to the data collected by this IT system.

They analyzed the careflows of emergency patients, which involve triage, treatments, diagnosis, medical exams, and the forwarding of patients. In this post, I summarize the main results from their interesting study. You can read the full paper here[1] (limited access).

Goal of the analysis

For the people in the hospital it is crucial to have a good understanding of the clinical and administrative processes (called ‘careflows’). However, there were only 11 people in the IT department, and these 11 people were responsible for the maintenance and development of the in-house IT system as well as for the process analysis. Clearly, there was no room to perform a classical, manual process analysis via interviews because it would have been too time-consuming. There was also no money to hire external process consultants to do this.

So, the hospital teamed up with the process mining experts at IST to extend their IT system with process mining capabilities. For the case study, the emergency careflow—an administrative process—was chosen because:

The main goals of the analysis were to determine the regular behavior of the process, to gain insight into its variants and exceptions, to assess the performance of the process, and to detect potential deviations from medical guidelines.

The event log

The data recorded by the IT system in the hospital was contained in a database with more than 400 tables. For the case study, only event data from the emergency careflows was extracted into a special database (see picture below). The Episode table lists the emergency patients, which are the case identifiers in this process. The remaining tables represent possible activities performed on each patient.
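To make the structure concrete, here is a minimal sketch of how such a database could be flattened into an event log. It assumes one Episode table that provides the case identifiers and one table per activity; the table names, column names, and rows are invented for illustration and are not taken from the hospital's system.

```python
from datetime import datetime

# Hypothetical miniature of the extracted database: one Episode table
# plus one table per kind of activity (all names and rows are made up).
episodes = [{"episode_id": 1}, {"episode_id": 2}]
activity_tables = {
    "Triage": [
        {"episode_id": 1, "timestamp": "2009-01-05 10:02"},
        {"episode_id": 2, "timestamp": "2009-01-05 10:15"},
    ],
    "Exam Request": [
        {"episode_id": 1, "timestamp": "2009-01-05 10:20"},
    ],
    "Diagnosis": [
        {"episode_id": 1, "timestamp": "2009-01-05 11:05"},
        {"episode_id": 2, "timestamp": "2009-01-05 10:40"},
    ],
}


def build_event_log(episodes, activity_tables):
    """Return {episode_id: [(timestamp, activity), ...]} sorted by time."""
    log = {ep["episode_id"]: [] for ep in episodes}
    for activity, rows in activity_tables.items():
        for row in rows:
            ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M")
            log[row["episode_id"]].append((ts, activity))
    for events in log.values():
        events.sort()  # order each case's events chronologically
    return log


event_log = build_event_log(episodes, activity_tables)
# A trace is the ordered list of activity names for one case.
traces = {case: [name for _, name in events] for case, events in event_log.items()}
```

Each episode becomes one case, and the name of the table a row came from becomes the activity name of the event.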

The new database contained all activities performed within the emergency careflows from January 2009 to July 2009. In total there were:

The researchers then focused on analyzing the radiology workflow of emergency patients.

Process mining results

The most important questions that should be answered through the process mining analysis were:

  1. What is the regular behavior of the radiology workflow?
  2. What are the variants and infrequent behavior?
  3. How is the performance?
  4. Are there deviations from medical guidelines?

The first two questions relate to the control-flow perspective of the process. Because of the complexity of healthcare processes, it is usually necessary to simplify or break up the process in some form. Otherwise you get this[2]:

The above process model represents the complete radiology workflow for emergency patients and was created using the Heuristics miner and converted into a Petri net.

In an earlier case study the researchers used trace clustering to obtain more usable process models. In this case study, another clustering technique (called ‘sequence clustering’) was used to separate regular and infrequent behavior. Each cluster then represents just a subset of similar cases in the event log, so you no longer have to look at all the (potentially very different) process instances at once. This clustering step can be performed multiple times to simplify complex models.

1. Regular behavior

The most dominant cluster revealed the regular behavior of the radiology workflow (covering almost 50% of the cases), which is shown below. It follows 4 simple steps: (1) The exam is requested; (2) the exam is scheduled; (3) the exam is performed; and (4) the exam is validated without report.

2. Variants and infrequent behavior

Several variants were found in other clusters. One variant (covering ca. 18% of the cases) is shown below, with the differences with respect to the regular process highlighted in red. The main differences in this variant are:

  1. After the exam was requested, for 7.3% of the cases it was canceled (see probability of 0.073 at the arc).
  2. For 8% of the requested exams (see probability of 0.08) the process ended directly. In fact, these were the cases where employees were not using the IT system correctly because all exams should always be registered.
  3. For those cases where an exam was performed, 8.5% of them (see probability of 0.085) were validated with a report.
  4. In 18.7% of the cases where an exam was performed (see probability of 0.187), the exam was reported by the Institute of Telemedicine (ITM). This means that the hospital outsources the reporting of some exams, because the ITM is an external entity that delivers radiology services.
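The probabilities at the arcs can be read as relative frequencies: of all the times an activity occurred, how often was it directly followed by a given next activity? The sketch below computes such numbers for a handful of made-up traces. The activity names and the artificial START and END markers are assumptions for illustration; this is not the ProM plug-in the researchers used.

```python
from collections import Counter, defaultdict

# Invented example traces, one list of activity names per case.
traces = [
    ["Request", "Schedule", "Perform", "Validate without report"],
    ["Request", "Schedule", "Perform", "Validate with report"],
    ["Request", "Cancel"],
    ["Request"],  # the process ends directly after the request
]


def transition_probabilities(traces):
    """Estimate P(next activity | current activity) from direct successions."""
    counts = defaultdict(Counter)
    for trace in traces:
        path = ["START"] + trace + ["END"]
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(successors.values()) for dst, n in successors.items()}
        for src, successors in counts.items()
    }


probs = transition_probabilities(traces)
for src, successors in probs.items():
    for dst, p in successors.items():
        print(f"{src} -> {dst}: {p:.2f}")
```

In this toy log, a quarter of the requests are canceled and another quarter end directly, which is the same kind of statement as the 7.3% and 8% in the list above.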

An infrequent yet very interesting pattern was found in another cluster (see picture below): Instead of first requesting the exam, there are situations where physicians schedule the exam, perform the exam, and only afterwards request the exam. This is not supposed to happen, and further inspection showed that this pattern occurred 131 times in the event log.
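Once a pattern like this is known, counting its occurrences is a simple check on the traces. Here is a minimal sketch, assuming each case is available as an ordered list of activity names; the names "Request" and "Perform" and the example cases are made up.

```python
def requested_after_performed(trace, request="Request", perform="Perform"):
    """True if the exam was first performed before it was first requested."""
    if request not in trace or perform not in trace:
        return False
    return trace.index(perform) < trace.index(request)


# Invented example cases.
traces = {
    "case-1": ["Request", "Schedule", "Perform", "Validate"],
    "case-2": ["Schedule", "Perform", "Request", "Validate"],
    "case-3": ["Request", "Cancel"],
}

suspicious = [case for case, trace in traces.items() if requested_after_performed(trace)]
print(len(suspicious), "case(s) with the request after the exam:", suspicious)
```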

3. Performance analysis

The third analysis goal related to the performance of the process. For this, the data from the different clusters was exported and analyzed with the ‘Performance analysis with Petri net’ plug-in in ProM.

In the screenshot below, the performance results for the regular process are shown. On average, the overall flow time from the exam request to the validation of the exam for patients in the emergency radiology was 68 minutes. It took on average 38 minutes from the exam request until the exam was scheduled (see bottlenecks highlighted by red circles below), and about 25 minutes from the exam being performed until the validation of the exam.

In comparison, the overall flow time for cases where the exam was reported by the external entity ITM was three times as long (on average three hours rather than one). It took ca. one hour after the exam was performed until the exam was sent to the ITM, and then it took another hour for the exam to be reported.
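The averages in this section are easy to reproduce in principle: take the timestamps of each case, measure the time between consecutive activities, and average over all cases. The following sketch does this for two invented cases. The activity names and times are assumptions and have nothing to do with the hospital's real data or with the ProM plug-in.

```python
from datetime import datetime
from statistics import mean

FMT = "%H:%M"

# Invented cases: ordered (activity, time of day) pairs.
cases = {
    "a": [("Request", "10:00"), ("Schedule", "10:40"),
          ("Perform", "10:45"), ("Validate", "11:10")],
    "b": [("Request", "12:00"), ("Schedule", "12:30"),
          ("Perform", "12:40"), ("Validate", "13:00")],
}


def minutes_between(start, end):
    """Minutes elapsed between two same-day times given as HH:MM strings."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


def mean_step_minutes(cases):
    """Average waiting time for each pair of directly consecutive activities."""
    durations = {}
    for events in cases.values():
        for (a, t1), (b, t2) in zip(events, events[1:]):
            durations.setdefault((a, b), []).append(minutes_between(t1, t2))
    return {step: mean(values) for step, values in durations.items()}


def mean_flow_minutes(cases):
    """Average time from the first to the last event of a case."""
    return mean(minutes_between(ev[0][1], ev[-1][1]) for ev in cases.values())


step_means = mean_step_minutes(cases)
flow_mean = mean_flow_minutes(cases)
```

The step with the largest average is the candidate bottleneck; in the toy data that is the wait between request and scheduling.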

4. Deviations from medical guideline

The last analysis goal was related to one specific medical guideline in the emergency careflow. The rule says that when a patient is assigned to a physician, then this physician is responsible for the diagnosis, treatment, exam requests, and the forwarding of the patient: She must not hand over her work to another physician during the process.

The researchers checked this rule based on the data collected in the IT system and visualized all violations in a social network-like view (see picture below).

In this picture, every number represents one physician in the hospital. Each arc represents the “handover of work” from one activity to the next one for the same patient. If the arc goes back to the same physician (self-loop), then no transfer of the patient to another physician has happened. The cases where a transfer did occur show up as arcs between different physicians in the middle of the picture.
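The idea behind such a handover-of-work view can be sketched in a few lines: for every case, look at who performed each pair of consecutive activities and count the resulting arcs. The physician numbers and cases below are invented, and this is only an illustration of the principle, not the researchers' implementation.

```python
from collections import Counter

# Invented cases: ordered (activity, physician number) pairs per patient.
cases = {
    1: [("Triage", 12), ("Diagnosis", 12), ("Exam Request", 12)],
    2: [("Triage", 7), ("Diagnosis", 31), ("Forwarding", 31)],
}


def handover_arcs(cases):
    """Count arcs (from_physician, to_physician) between consecutive activities."""
    arcs = Counter()
    for events in cases.values():
        for (_, src), (_, dst) in zip(events, events[1:]):
            arcs[(src, dst)] += 1
    return arcs


arcs = handover_arcs(cases)
# Self-loops mean the same physician continued; all other arcs are transfers.
violations = {arc: n for arc, n in arcs.items() if arc[0] != arc[1]}
print(violations)
```

Here physician 7 handed a patient over to physician 31 once, which would be flagged as a deviation from the rule.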

As Álvaro and Diogo point out in their case study, these deviations are not necessarily a problem for the hospital. Perhaps there were good reasons to initiate the transfer of these patients to a colleague. Nevertheless, this analysis shows that it is possible to automatically detect deviations from medical guidelines based on actual data.

More than an exercise

I really like this case study because it clearly shows how process mining can provide useful and relevant insights even into complex processes. Furthermore, the researchers implemented their approach on top of the hospital’s IT system, which means that the hospital benefits from this work beyond the study itself.

What do you think, will process mining capabilities be a standard component of all hospital IT systems in the future?


  1. The citation is Álvaro Rebuge, Diogo R. Ferreira, Business process analysis in healthcare environments: A methodology based on process mining, Information Systems, 2011 (to appear)  
  2. Although I am sure that in another process notation the model would have looked half as complex. Petri nets are just not suitable to describe as heterogeneous and complex a process as this one.  

Comments (5)

Very nice post !

Thanks for the feedback, Vojtech!

Very interesting and helpful. Thanks so much.

I am glad to hear that you like the article! Thanks, Cedrico.

Very interesting how variation analysis uncovered two areas where quality improvement interventions could be carried out. They could look for root causes in lack of use of their EHR (2.2), and where physicians were documenting post facto.

This could lead to changes in training, policies, etc, that would address the causes, and the results could be monitored to see if they had worked.

Would be interesting to find out if that is what they did.

