BPI Challenge 2015

The BPI Challenge is an annual process mining competition, which takes place for the fifth time this year. The goal of the challenge is to give both practitioners and researchers the opportunity to do a process mining analysis on real-life data.[1]

In this competition, anonymized but real data is provided and can be analyzed by anyone using any tools. Submissions can be handed in until June 28, 2015 and the winner will receive a very special BPI Challenge trophy. Read more about the additional student prizes at the BPI Challenge website.

As always, we make our process mining software Disco available to anyone for the purpose of this challenge. Read on to see what this year’s challenge is about and how you can get started.

The Process

This year’s data is provided by five Dutch municipalities. The data contains all building permit applications over a period of approximately four years. There are many different activities present, denoted by both codes (attribute concept:name) and labels, both in Dutch (attribute taskNameNL) and in English (attribute taskNameEN).

The cases in the log contain information on the main application as well as objection procedures in various stages. Furthermore, information is available about the resource that carried out the task and on the cost of the application (attribute SUMleges).

The processes in the five municipalities should be identical, but may differ slightly. In particular, when procedures, rules, or regulations change, the time at which these changes are rolled out may differ between municipalities. The underlying processes have also changed over the four-year period.

The municipalities have a number of questions, outlined below:

  1. What are the roles of the people involved in the various stages of the process and how do these roles differ across municipalities?
  2. What are the possible points for improvement on the organizational structure for each of the municipalities?
  3. The employees of two of the five municipalities have physically moved into the same location recently. Did this lead to a change in the processes and if so, what is different?
  4. Some of the procedures will be outsourced from 2018, i.e. they will be removed from the process and the applicant needs to have these activities performed by an external party before submitting the application. What will be the effect of this on the organizational structures in the five municipalities?
  5. Where are the differences in throughput times between the municipalities and how can these be explained?
  6. What are the differences in control flow between the municipalities?
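
For question 5, a common starting point is the throughput time per case, taken here as the time between the first and the last event of a case. The following is only a minimal sketch under assumptions: the input layout (pairs of case ID and ISO-formatted timestamp) and the function name `throughput_days` are hypothetical and not part of the challenge data or of Disco.

```python
from datetime import datetime


def throughput_days(events):
    """Return {case_id: days between first and last event}.

    `events` is an iterable of (case_id, iso_timestamp) pairs.
    This layout is an assumption made for the sketch.
    """
    first, last = {}, {}
    for case_id, stamp in events:
        moment = datetime.fromisoformat(stamp)
        if case_id not in first or moment < first[case_id]:
            first[case_id] = moment
        if case_id not in last or moment > last[case_id]:
            last[case_id] = moment
    # Convert each duration to fractional days.
    return {c: (last[c] - first[c]).total_seconds() / 86400 for c in first}


# Made-up example events, not taken from the challenge logs.
sample = [
    ("A", "2014-01-01T00:00:00"),
    ("A", "2014-01-03T12:00:00"),
    ("B", "2014-02-01T00:00:00"),
]
durations = throughput_days(sample)
```

Computing these durations per log file and comparing their distributions (for example the medians) gives a first indication of where the municipalities differ.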

The Data Set

There are five different log files available. Events are labeled with both a code and a Dutch and an English label. Each activity code consists of three parts: two digits, a variable number of characters, and then three digits. The first two digits and the characters together indicate the subprocess the activity belongs to. For instance, ‘01_HOOFD_xxx’ indicates the main process and ‘01_BB_xxx’ indicates the ‘objections and complaints’ (‘Beroep en Bezwaar’ in Dutch) subprocess. The last three digits hint at the order in which activities are executed, where the first digit often indicates a phase within a process.
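
The code structure described above can be split into its parts with a regular expression. This is a sketch, assuming the three parts are separated by underscores as in the ‘01_HOOFD_xxx’ examples. The names `parse_activity_code` and `ActivityCode` are hypothetical, and the sample codes are made-up instances of the pattern. Codes that do not follow the pattern are returned as `None` rather than guessed at.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: two digits, "_", subprocess label, "_", three digits.
ACTIVITY_CODE = re.compile(r"^(\d{2})_(.+)_(\d{3})$")


class ActivityCode(NamedTuple):
    prefix: str      # first two digits
    subprocess: str  # variable-length label, e.g. HOOFD or BB
    sequence: str    # last three digits, hinting at execution order
    phase: int       # first of the three digits, often a phase


def parse_activity_code(code: str) -> Optional[ActivityCode]:
    """Split an activity code into its parts, or return None if it does not match."""
    match = ACTIVITY_CODE.match(code)
    if match is None:
        return None
    prefix, subprocess, sequence = match.groups()
    return ActivityCode(prefix, subprocess, sequence, int(sequence[0]))


# Made-up codes that follow the described pattern.
main_step = parse_activity_code("01_HOOFD_010")
objection_step = parse_activity_code("01_BB_540")
```

Grouping events by `prefix` and `subprocess` then separates the main process from subprocesses such as objections and complaints.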

Each trace and each event contains several data attributes that can be used for various checks and predictions. Furthermore, some employees may have performed tasks for different municipalities: if the employee number is the same in two logs, it is safe to assume that it identifies the same person.
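
Because the same employee number identifies the same person across logs, it is straightforward to list the employees who appear in more than one municipality. The sketch below assumes the employee numbers have already been extracted per log. The function name `shared_resources`, the log names, and the numbers are all made up for illustration.

```python
from collections import defaultdict


def shared_resources(resources_by_log):
    """Return {employee_number: sorted log names} for employees seen in 2+ logs.

    `resources_by_log` maps a log name to an iterable of employee numbers
    (an assumed layout for this sketch).
    """
    seen_in = defaultdict(set)
    for log_name, resources in resources_by_log.items():
        for resource in resources:
            seen_in[resource].add(log_name)
    return {r: sorted(logs) for r, logs in seen_in.items() if len(logs) > 1}


# Made-up employee numbers, not taken from the challenge logs.
example = {
    "municipality_1": ["1001", "1002", "1003"],
    "municipality_2": ["1003", "2001"],
    "municipality_3": ["1001", "1003"],
}
overlap = shared_resources(example)
```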

Further information about the challenge and how to submit can be found at http://www.win.tue.nl/bpi/2015/challenge.

We have imported these five files for you in a Disco project file that you can simply open with the freely available demo version of Disco. The only difference you will find in this project file compared to directly importing the XES files is that we used the English activity names and sorted same-timestamp events based on the action code attribute.
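
If you import the raw files yourself, you can apply a similar tie-break for events that share a timestamp. The following is a rough sketch of the idea and not Disco's actual implementation. The field names `timestamp` and `action_code` are assumptions about how you have loaded the data, and the action code is compared as a string here.

```python
from datetime import datetime


def order_events(events):
    """Sort events by timestamp, breaking ties by the action code.

    Each event is a dict with an ISO-formatted "timestamp" and an
    "action_code" string (hypothetical field names for this sketch).
    """
    return sorted(
        events,
        key=lambda e: (datetime.fromisoformat(e["timestamp"]), e["action_code"]),
    )


# Made-up events of a single case, not taken from the challenge logs.
case_events = [
    {"timestamp": "2014-03-01T09:00:00", "action_code": "01_HOOFD_020"},
    {"timestamp": "2014-03-01T09:00:00", "action_code": "01_HOOFD_010"},
    {"timestamp": "2014-02-28T16:30:00", "action_code": "01_HOOFD_030"},
]
ordered_codes = [e["action_code"] for e in order_events(case_events)]
```

The sort should be applied per case, so that events from different cases are not interleaved.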

You can download both the Disco project file and the raw data files here:

Download the Disco project file that can be opened with the freely available demo version of Disco


Download the raw data files in a Zip file (CSV files, created from the XES files provided in the challenge)


Submissions can be made through the EasyChair system.

A submission should contain a PDF report of at most 30 pages, including figures, using the LNCS/LNBIP format specified by Springer (available as both a Word and a LaTeX template). Appendices may be included, but should only support the main text.

Submission deadline: June 28, 2015

Announcement of winners: at the 11th Workshop on Business Process Intelligence (BPI 15), Innsbruck, Austria, 31st August 2015

  1. If you are looking for even more data sets, take a look at the challenges from 2011, 2012, 2013, and 2014, too.