Case Study: Analyzing the Complaints Process at Granada City Council

Granada

This is a guest article by Arturo Martínez Escobar from the Ayuntamiento de Granada-Agencia Municipal Tributaria, Nemury Silega Martínez from the Universidad de las Ciencias Informáticas in Cuba, and Manuel Noguera from the Universidad de Granada. If you have a process mining case study that you would like to share as well, please contact us via anne@fluxicon.com.

The city council of Granada started to look into its dossier handling processes because citizens had complained about delays. The administrative services group could identify time gaps where dossiers did not seem to advance, but they could not explain why these delays happened. Therefore, a process mining analysis was performed within the tax collection department of the Granada council. Because certain tasks were not registered in the IT system, the process mining analysis was combined with interviews of employees.

The results of this project changed the point of view of the managers in the department, who initially thought that negligence of the employees was the main cause of the delays. In reality, other factors influenced the delays, such as a lack of staff rotation to cover incidents and uncovered absences of the people with signing responsibilities. Due to this project, the organization has gained traceability and control over deadlines, which results in benefits for citizens, public employees, and politicians.

Organization

Granada is a city with a population of 250,000 people in the South of Spain. The analysis was performed on one of the processes performed by the Municipal or Local Tax Agency, which acts as an agency with its own competencies for tax management and collection in the city of Granada.

Town Hall

Process

The target of the analysis was the appeals and complaints process for the collection of taxes and other public revenue. For example, when a citizen does not agree with a tax collection statement, they can register a claim, which then starts the appeals and complaints process. They receive a response to their claim at the end of this process. Because citizens had to wait a long time for their response, they started to complain to the city council about these delays.

In fact, delays in the appeals and complaints process are not just a performance problem, but they are also a compliance problem: There are legal deadlines, and it is a legal obligation for the council to expedite dossiers ex officio in order to meet these deadlines.

After receiving complaints about these delays, manual inspections of individual dossiers confirmed that there were indeed requests that were not advanced in a timely manner. However, officials were not able to tell whether this was a common problem and they could not explain why these delays occurred.

The appeals and complaints process starts when a claim is registered in one of three places: (1) Electronically via the General Registry office in the City, (2) in person at one of the General Registry offices, or (3) at the registry of the Local Tax Agency.

Complaints Process

Once the application is registered at the General Registry office (see middle lane in the above picture), it must be received by the registry of the Local Tax Agency (see upper lane in the above picture) and subsequently be delivered to the Appeals and Complaints department (see lower lane in the picture above), which is responsible for the resolution of these appeals. If the application is recorded directly at the Local Tax Agency, a note is automatically created in the overall register as a side effect, to maintain the traceability of the record. After the physical arrival of all documents regarding the application, the applications are delivered to the Appeals and Complaints department.

Based on the manual inspections, we suspected that most of the delays were in starting the application-to-resolution process (see area highlighted in red above).

Data

To find data for the process mining analysis, we checked repositories such as the dossier database for records related to the activities in the process. We found that each application was written into a dossier row with information in a similar format. We also saw that the electronic registrations and the Registry Office IT system share the same database, and that each electronic registration description appeared with the suffix “-e” (which made it possible to distinguish electronic registrations from in-person registrations).

While the dossier database was shared, different tables were used for registering information at the different departments. This means that data from these different tables had to be combined to create a data set that could be used to analyze the full end-to-end process. The data contained a dossier ID, which could serve as the case ID. Furthermore, information about what was done (activity), who performed the process step (employee), and timestamps for the start of each step were available.
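
To illustrate this combination step, here is a minimal sketch in Python with pandas. All file and column names are made up for illustration; the actual schema of the municipal database differs.

    import pandas as pd

    # Hypothetical extracts of the per-department tables, assumed to share
    # a similar format; all names are placeholders for the real schema.
    registry = pd.read_csv("general_registry.csv")    # General Registry events
    agency = pd.read_csv("local_tax_agency.csv")      # Local Tax Agency events
    claims = pd.read_csv("appeals_complaints.csv")    # Appeals and Complaints events

    # Combine the department tables into one common event format.
    events = pd.concat([registry, agency, claims], ignore_index=True)

    # Electronic registrations carry the "-e" suffix in their description,
    # which lets us derive the registration channel for each event.
    events["channel"] = events["description"].str.endswith("-e").map(
        {True: "electronic", False: "in person"}
    )

    events = events.sort_values(["dossier_id", "timestamp"])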

The main challenge was the re-creation of the dossier history from its beginning. The IT software applications that support the processes are all integrated within the municipal information system. However, they often use different field names regarding the same record across different tables. Furthermore, we had to clarify the actual meaning of each field in the database to identify exactly which records correspond to the citizen’s original appeal and which records correspond to new records created afterwards (indicating steps in the process).

To complicate things even more, although there was a dossier ID that could be used as a case ID, each application creates a new dossier ID to manage the new claim. This new dossier ID is different from the ID of the management dossier that motivated the citizen’s complaint (i.e., the original claim). Therefore, for each dossier ID there should exist a reference dossier ID or a dossier proceedings ID, which must be traced and correlated to create the full end-to-end data set.

Data

We solved this challenge by extracting the dossier ID relationships (see figure above) between the three tables from the Local Agency event log, the main registry event log, and the claims records event log. The dossier ID from the claims records event log was then used as the overall case ID for the data set.

We had to pay extra attention due to the different ways that the process can be started: Some dossiers had to be traced through two registries (if started at the General Registry office), while others went through only one (if launched directly at the Local Tax Agency). Furthermore, the claim dossier ID is only created when the claim reaches the resources processing unit, although its history starts earlier (passing through a database record or two before). Therefore, the claim dossier ID had to be filled in retroactively for these earlier events.
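
To illustrate the tracing and backfilling, here is a minimal sketch, again with hypothetical file and column names, assuming that each log row carries its own dossier_id and a ref_dossier_id pointing back to the dossier it was created from:

    import pandas as pd

    general_log = pd.read_csv("main_registry_log.csv")   # General Registry office
    agency_log = pd.read_csv("local_agency_log.csv")     # Local Tax Agency registry
    claims_log = pd.read_csv("claims_records_log.csv")   # Appeals and Complaints

    # The claim dossier ID serves as the overall case ID.
    claims_log["case_id"] = claims_log["dossier_id"]

    # One hop back: the agency dossier that each claim record references.
    claim_of = claims_log.drop_duplicates("ref_dossier_id") \
        .set_index("ref_dossier_id")["case_id"]
    agency_log["case_id"] = agency_log["dossier_id"].map(claim_of)

    # Two hops back: dossiers launched at the General Registry office pass
    # through the agency registry before reaching the claims unit.
    agency_of = agency_log.drop_duplicates("ref_dossier_id") \
        .set_index("ref_dossier_id")["dossier_id"]
    general_log["case_id"] = general_log["dossier_id"].map(agency_of).map(claim_of)

    # All three logs can now be combined, with the claim dossier ID filled
    # in retroactively for the events that happened before it was created.
    events = pd.concat([general_log, agency_log, claims_log], ignore_index=True)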

Activity Names

Finally, an additional challenge emerged from the fact that the activity names were not in a readable form. A combination of three fields of numerical values and attributes had to be mapped to a human-readable label that indicated, in a meaningful way, the activity that was performed (see the table above for an excerpt).
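
As a sketch of what this mapping can look like, a lookup table keyed on the three fields does the job. The codes and labels below are invented for illustration; the real values differ:

    # Invented codes and labels, purely for illustration.
    ACTIVITY_LABELS = {
        # (record type, status code, department) -> human-readable activity
        (1, 10, "REG"): "Claim registered",
        (1, 20, "REG"): "Claim received at Local Tax Agency",
        (2, 10, "APL"): "Resolution process started",
        (2, 30, "APL"): "Resolution signed",
    }

    # 'events' is the combined event table from the earlier sketches.
    events["activity"] = [
        ACTIVITY_LABELS.get(key, "Unknown activity")
        for key in zip(events["record_type"], events["status_code"],
                       events["department"])
    ]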

After all these data preparations, we had created a CSV file with a total of 7,582 events and 2,511 cases for our process mining analysis.

Results

We imported the data set into the process mining software Disco to discover the process that was performed for these 2,511 cases.

In the discovered process (see the process map below, which is based on the Total duration visualization), we could clearly see one major bottleneck, which had the biggest impact on the delays. This confirmed our hypothesis from the manual inspections: There is a big delay before the application-to-resolution process is started in the Appeals and Complaints department. We also saw that this was a general problem and not limited to a few exceptional claims.

Full Process

We then filtered the data set down to just the claim creation events and the start of the process in the Appeals and Complaints department (see process map below). Clearly, averages of 23.6 weeks, 18.6 weeks, and 24.1 weeks to get through the registration phase do not look normal.

Partial Process
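
This kind of waiting-time check can also be reproduced outside of Disco, which is useful when repeating the analysis on new data. Below is a minimal sketch, reusing the combined event table and the hypothetical activity names from the earlier sketches (in reality, the averages would be computed per registration path):

    import pandas as pd

    events["timestamp"] = pd.to_datetime(events["timestamp"])

    # First registration and first processing start per case.
    created = events[events["activity"] == "Claim registered"] \
        .groupby("case_id")["timestamp"].min()
    started = events[events["activity"] == "Resolution process started"] \
        .groupby("case_id")["timestamp"].min()

    # Average waiting time between registration and the start of processing.
    wait = (started - created).dropna()
    print("Average wait:", round(wait.mean().days / 7, 1), "weeks")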

We then further analyzed the data set using the Dotted Chart plugin in ProM (see visualization below). The dotted chart plots each case as a horizontal line of dots. Each dot represents one process step that was performed. The x-axis represents the time frame of the data set, from March 2014 until September 2015.

From this visualization we can observe two things:

  1. There are areas in the timeline where much less activity is shown than in other time periods. What happened there? Was the process stopped? Why?
  2. The vertical patterns of dots indicate that many activities were performed for different cases nearly at the same time. This indicates a batch processing pattern in the process.

Partial Process
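
A simple version of such a dotted chart can also be drawn directly from the event table, for example with matplotlib. This is a minimal sketch assuming the case_id and timestamp columns from the earlier sketches:

    import matplotlib.pyplot as plt

    # Order cases by their first event so that early cases appear first.
    order = events.groupby("case_id")["timestamp"].min().sort_values()
    row = {case: i for i, case in enumerate(order.index)}

    # One horizontal line of dots per case; the x-axis is time.
    plt.figure(figsize=(10, 6))
    plt.scatter(events["timestamp"], events["case_id"].map(row), s=2)
    plt.xlabel("Time")
    plt.ylabel("Cases (ordered by first event)")
    plt.title("Dotted chart")
    plt.show()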

As a next step, we had to find out what happened in those periods of inactivity in the process. Were these claims suspended or pending for some reason? Because there was no record in the data about what happened at that time, we performed interviews with employees working in the process to identify the root causes.

During these interviews we learned that:

  • The claims are handled manually and need to be physically classified once they arrive from registration at the Appeals and Complaints department.

  • The classification task is in turn broken down into several manual activities. This classification step is very important for the performance of the process, because once the claim is cataloged it is further processed in batch. The classification is done by a knowledge worker who judges and sorts the claims based on the decision to be made (rejecting or approving the claim).


Because these classification tasks are manual activities, no data about how long it takes to organize and classify the records, nor about the role that the resource plays in this part of the process, is recorded.

  • Finally, we also learned that information that was already recorded in the computer system (for example, the citizen’s address) had to be re-registered manually. In addition to the extra work involved, this can generate redundant information and create a potential for errors or inconsistencies.

We realized that — because they were not visible in the system — there was not enough awareness about the importance of these classification tasks. The feeling at the municipality was that the delays could be the fault of the employees. The employees were not recognized for their work on this highly specialized and important task.

Furthermore, it was now clear that the speed of the process could be improved by adding employees who would perform the manual re-registration tasks in the system until the data transfer was properly automated. This way, the knowledge workers would not have to spend time on these basic data entry tasks anymore.

In addition, we learned that there were also improvement possibilities in the resource organization: Particular people were responsible for signing the dossiers. When they were not available, for example due to illness or vacation, delays could occur. By ensuring capacity to cover these cases during absences, delays in the dossier handling progress could be avoided.

Impact

Based on our analysis, we made the following recommendations:

  • Avoid the manual re-entry of data by automating the data transfer from the source system. In the meantime, assign additional resources to help with the manual data entry to reduce the workload in the manual classification task.

  • Automatically transfer claims to the Appeals and Complaints department once they are ready, and create two new, manual activities in the process where the resources can keep track of the manual work (see illustration of the new process with the new steps highlighted in green below). This will provide greater transparency and better accountability for the manual — and currently invisible — steps in the process.

  • Create a system in which authorized officers are appointed to sign the resolution of cases in situations of absence, so that the processing of the department is not halted by this absence.

New Process

As a result of these changes, we have now achieved the following improvements:

  • The workloads and planning can now be measured. This was not possible before. In addition, we can now provide more realistic responses to citizens about the progress (and expected completion) of their dossiers.

  • We could show that the problems in the process were not due to official neglect. Instead, there was a lack of traceability in the computer systems, because of a lack of alignment with the reality of the business process. The work environment has improved and the efforts of the employees handling the classification are now much more valued than before.

  • Given that the data preparation steps are now in place and can be easily repeated on new data, we can now continue to analyze and quantify this process to continuously improve it. We can even use these measurements to simulate what potential process improvements we can expect by adding more employees to certain tasks. This way, we can continue to resolve bottlenecks and reduce the average time of initiating and handling the complaints after being classified.

  • Finally, the improved records management is also very important from a compliance perspective, because we can ensure and prove that our process is aligned with the rules.

Process mining was totally unknown to the management of the organization as well as to their technical staff. They were impressed with the graphical power of representing the process flows and annotating the model with performance metrics. Frequent activities could be made visible and the results were easy to interpret.

Previously, the logs from the IT systems were only used for occasional checks triggered by inquiries from citizens. The transaction records were never used to visualize, analyze, or audit the process in a systematic way. We have found that process mining can significantly help to improve administrative processes in our government agency, and we believe that this method is extensible beyond the department of resources and claims collection. For example, areas where we will explore process mining in the future are the social services processes, subsidies, tax collection and management, sports concession fees and licenses, etc.

We have learned that the exploratory analysis of the business process through process mining can reveal previously unknown issues of concern, and that it can impact the performance of the organization and of its employees. We have also seen that not all process activities are always captured in the system records. Therefore, it is good to sit down with the responsible users for an interview, in addition to the process mining analysis, to complete the picture. This can also help to uncover additional factors that influence the process, which are often not visible in the data.


Article

You can download this case study as a PDF here for easier printing or sharing with others.

Do I Need to Remove Outliers for My Process Mining Analysis?

Outliers in Process Mining

A data point that is significantly different from other data points in a data set is considered an outlier. If you find an outlier in your event log, should you remove it before you continue with your process mining analysis?

In process mining terms, an outlier can mean many different things:

  • A case that has a much longer duration than others
  • An event with a timestamp that lies in the future or way in the past
  • A case that has many more events than other cases
  • A variant that exhibits unique behavior
  • An attribute value that occurs only very few times, or much more often, than others
  • Activities that occur in a different order than what you normally see
  • The process starts or ends in a strange place

In machine learning, outliers are sometimes removed from the data sample during a cleaning step to improve the model. So, what about process mining: Should you remove such outliers when you find them to better represent the mainstream behavior of your process?

It depends.

First, you need to check whether the outlier is a data quality problem or whether it really happened in the process. As a rule of thumb, you should remove outliers that are there due to data quality issues and keep the ones that truly happened.

For example, one reason that a case has a much longer duration than others could be that it contains an event with a zero timestamp (such as 1900, 1970, or 2999). Zero timestamps can be errors or indicate that an activity has not happened yet. Either way, they do not reflect the actual time of the activity and, therefore, are misleading.
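
A quick way to find such zero timestamps is to scan the event log for the suspicious placeholder years. Here is a minimal sketch, assuming an event table with case_id and timestamp columns (adjust the placeholder years to the conventions of your source system):

    import pandas as pd

    events["timestamp"] = pd.to_datetime(events["timestamp"], errors="coerce")

    # Flag events whose year is a typical placeholder value.
    suspect = events["timestamp"].dt.year.isin([1900, 1970, 2999])
    print(suspect.sum(), "placeholder-timestamp events in",
          events.loc[suspect, "case_id"].nunique(), "cases")

    # Depending on your investigation, drop just these events ...
    cleaned = events[~suspect]
    # ... or drop the affected cases entirely:
    bad_cases = events.loc[suspect, "case_id"]
    cleaned = events[~events["case_id"].isin(bad_cases)]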

Another reason could be that the one case that took 20 times as long as you would expect (for example, 20 months instead of 4 weeks) really belongs to a crazy customer case that took multiple rounds, lots of ping pong between different departments, and simply an unusually long time to resolve. This is part of the process reality.

When you should remove outliers

You should clean up your outliers in the following situations:

  • Zero timestamps need to be investigated first; then you decide whether to remove just the event with the zero timestamp or the whole case, based on the situation.
  • If you have a very long case that is due to missing case IDs, you need to remove this case.
  • If you have activities that occur in a different order, first investigate the root cause. For example, if the different order is caused by same-timestamp activities, re-sort the data set and import it again (no removal of events is needed). If the different order is due to different timestamp granularities, import the data again at the most coarse-grained level. If the different order is due to different clocks, the differences need to be resolved before merging the data sets.
  • Cases that have an unusual start or end point are most likely not errors but simply incomplete cases. Nevertheless, if you want to analyze the end-to-end process, then you should remove incomplete cases to prepare your data set for the analysis (see the sketch after this list).
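
As an illustration of two of these cleanup steps, here is a minimal sketch, assuming an event table with case_id, activity, and timestamp columns; the end activity name is hypothetical:

    import pandas as pd

    # Re-sort events with a stable sort, so that same-timestamp activities
    # keep their original (and thus reproducible) relative order.
    events = events.sort_values(["case_id", "timestamp"], kind="stable")

    # Remove incomplete cases for an end-to-end analysis: keep only cases
    # whose last recorded event is a known end activity.
    END_ACTIVITIES = {"Response sent"}  # hypothetical end point
    last = events.groupby("case_id")["activity"].last()
    complete = last[last.isin(END_ACTIVITIES)].index
    events = events[events["case_id"].isin(complete)]

    # Sanity check: how much data did the cleaning remove?
    print(len(complete), "of", last.size, "cases kept")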

Be mindful of how much data you remove in the cleaning process. If too much is removed then the remaining data set may not be representative anymore.

And keep in mind that not all data quality problems are outliers! For example, the recorded timestamps may not reflect the actual time of activities but look entirely normal.

When you should keep outliers

The idea behind keeping outliers if they reflect what really happened is that you want to see the whole picture of the process. Sometimes, exceptions in the process are the most interesting result of your analysis, especially when they imply compliance issues or security risks in the process (say, a violation of the segregation of duties rule).

For example, you should keep outliers in the following situations:

  • Cases with an unusually long duration that really took that long.
  • Variants that exhibit unusual behavior, if it really happened. In fact, auditors often deliberately filter their data set in such a way that they only see the low-frequency variants, because they are interested in the exceptional cases (a filter like the sketch after this list can do this).
  • Activities that actually occurred in a different order.
  • Even if activities occur in a different order due to a data quality problem such as missing timestamps for activity repetitions, you would not remove these cases but interpret the results with the knowledge of the underlying data issue.
  • There are analyses for which incomplete cases should not be removed.
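
As an illustration of such a low-frequency variant filter, here is a minimal sketch, assuming a chronologically sorted event table with case_id and activity columns (the threshold of two occurrences is arbitrary):

    # Each case's sequence of activities is its variant.
    variants = events.groupby("case_id")["activity"].agg(tuple)
    counts = variants.value_counts()

    # Keep only cases whose variant occurs at most twice in the data set.
    rare = counts[counts <= 2].index
    rare_cases = variants[variants.isin(rare)].index
    exceptions = events[events["case_id"].isin(rare_cases)]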

At the same time, there can be reasons to specifically address, and sometimes even remove, outliers even though they are “real”.

So, if outliers really happened in the process, then you generally want to keep them, because you want to see everything that is really there (just like you don’t need a minimum number of data points to perform a process mining analysis). But you want to be aware of them in the analysis.

Disco 2.8

Software Update

We are happy to announce that we have just released Disco 2.8.

This release addresses an issue that affected the layout of process maps, especially on Windows, and another issue that could prevent startup of Disco on macOS 10.15 and later.

We recommend that you update at your earliest convenience. Like every release of Disco, this update fixes a number of bugs and improves the general performance and stability.

Thanks for using Disco, and thank you all for your feedback. With your help we will continue making Disco even better!

How To Update

Disco will automatically download and install this update the next time you run it, if you are connected to the internet.

If you have experienced problems with your current version of Disco, we recommend that you install this update manually: Please download and run the updated installer packages from fluxicon.com/disco/download

Changes

  • Process Map: Addressed an issue with map layout on some installations.
  • Filter: Improved restoring recipes.
  • macOS: Fixed an issue with startup on macOS 10.15 and later.
  • Control Center: Extended debug information.

Disco 2.7

Software Update

We are happy to announce that we have just released Disco 2.7.

This release improves the performance and fidelity of Disco’s process maps, especially for data sets with large numbers of activities. We have also added the option to export and load holiday presets to TimeWarp, allowing you to more efficiently re-use your favorite set of holidays.

We recommend that you update at your earliest convenience. Like every release of Disco, this update fixes a number of bugs and improves the general performance and stability.

Keep your feedback coming — Your bug reports and suggestions help us make Disco faster, more stable, more polished, and more useful with every update!

How To Update

You need to install this update of Disco manually: Please download and run the updated installer packages from fluxicon.com/disco

Changes

  • Process Map:
    • Increased performance and stability of graph layout.
    • Improved the estimation of total and mean durations for aggregated paths.
    • Made simplifying process maps with huge numbers of activities more consistent.
    • Addressed an issue with restoring process maps with large numbers of activities.
  • Statistics:
    • High-resolution chart rendering on retina and HiDPI screens.
    • Optimized scaling of timeline charts.
  • TimeWarp:
    • Save and load presets for easier re-use.
    • Improved bank holidays navigation UI.
    • Added bank holidays calendar for Saudi Arabia.
  • Export:
    • Fixed an issue where graph export could include negative durations.
  • Workspace:
    • Improved data integrity safeguards.
  • Control Center:
    • Improved hardware detection.
    • Enabled proxy server configuration via the system panel.
    • Changing the memory limit is now more reliable on Windows.
    • Extended system information.
  • UI:
    • Color management more robust.
    • Improved signup experience on Windows.
    • Improved shutdown flow.
    • Fixed an issue that could prevent proper startup on some setups.
    • Improved layout.
  • Connection:
    • Increased security and reliability.
    • Improved stability when connecting through a proxy.
  • Update:
    • More reliable auto-updates on Windows.
  • Sandbox:
    • Sandbox project is now also available offline.
    • Refreshed sandbox project.
  • Windows:
    • Improved general graphics fidelity and performance.
    • Improved installation experience.
  • Platform:
    • Improved experience for use with assistive devices.
    • Java update.

Process Mining Training Online

Process Mining Camp 2020

You have taken your first steps, dipped your toes in and played around, but now you want to get serious about process mining? Join one of our small-group trainings online!

Disco makes it very easy to get started with process mining: You import some data and Disco produces a process map. That is a great start, but your journey has only just begun — There are a lot of important topics around process mining that you really need to know, so that you can apply it productively: How do you prepare the data? How can you ensure data quality? How can you interpret your results? And also, what kinds of analyses can you even do in the first place?

We have heard all the questions that process mining newcomers ask, and we know the things they often miss when they start out. In this course, we have put all our experience together to give you the essentials that you need to know to be ready for using process mining in practice. Skip the learning by trial and error, and put the documentation and theory books aside for a minute — This training will give you the fundamental knowledge and skills to hit the ground running and use the full potential of process mining in your work.

Our training takes place over a series of interactive web training sessions. This means that you can jump in and ask questions at any point in time, just like you would do in a classroom setting. There are four sessions that run from 15:00 to 17:00 CEST each day (see your own timezone here). Between the sessions, you have a few days to process the materials and practice with additional exercises, on your own time. This online training covers all the practical process mining topics of our popular two-day on-site training.

Dates

The registration for the upcoming three trainings is now open.

See further details and reserve your spot for the training now!