
Unlocking Efficiency Gains Through Process Mining with Wil van der Aalst and Cong Yu, Chief Scientist and VP Engineering at Celonis

Wil, Cong, and Richie explore process mining and its development, popular use cases of process mining, how to scale process mining systems, prospects within the field and much more.
Dec 2023

Wil van der Aalst

Wil van der Aalst is a full professor at RWTH Aachen University, leading the Process and Data Science (PADS) group. He is also the Chief Scientist at Celonis, part-time affiliated with the Fraunhofer FIT, and a member of the Board of Governors of Tilburg University.

His research interests include process mining, Petri nets, business process management, workflow management, process modeling, and process analysis. Wil van der Aalst has published over 275 journal papers, 35 books (as author or editor), 630 refereed conference/workshop publications, and 85 book chapters.

Cong Yu

Cong Yu leads the CeloAI group at Celonis focusing on bringing advanced AI technologies to EMS products, building up capabilities for their knowledge platform, and ultimately helping enterprises in reducing process inefficiencies and achieving operational excellence.

Previously, Cong was Principal (Research) Scientist / Research Director at Google Research NYC from September 2010 to July 2022, leading the NYSD/Beacon Research Group, and also taught at NYU Courant Institute of Mathematical Sciences.

Richie Cotton

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.

Key Quotes

The most interesting thing to me is that process mining is really connected to operations, the physical world of the company. And there's a lot of process going on inside and outside the data system.

I started working on this in the late 90s as a research project, right? And now it's a complete industry. So I think there are many developments that have happened that I find super exciting.

I think many organizations are struggling with applying machine learning in a business setting. And I think process mining is the technology to help lower the threshold for organizations to start using machine learning in a meaningful way. I expect that many organizations will get disappointed, that they have unrealistic ideas of everything that they can do. And I think process mining will prove to be something that is very down to earth, that will help you to generate the machine learning problems where, let's say, you get really reliable results. So I think these are, for me, the two things that I will be enjoying working on in the next couple of years.

Key Takeaways


Process mining is a unique field, different from traditional data mining and mainstream machine learning, offering specialized insights by combining process and data analysis.


Ensure successful implementation of process mining by involving various stakeholders, including executive leadership, data scientists, and business line owners, and aligning with IT departments for technological integration.


Start with manageable processes and gradually scale up to more complex ones. Continuous monitoring and improvement are key, as process mining should be a sustained effort rather than a one-time initiative.



Richie Cotton: Welcome to DataFramed. This is Richie. Today we're going to be talking about the best kept secret in the world of data. It's a technique that could be used to save your company millions of dollars, but very few organizations seem to have adopted it yet. That technique is process mining, and the idea is very simple.

Every organization typically has lots of processes that have grown organically over time, and no one has ever gotten around to thinking about how they might be optimized. And they have more processes where the steps are written down, but often they can't be followed, and employees' behavior doesn't reflect the official process.

In both cases, by making use of data and process mining, you can make your workflows more efficient, saving you money and making your colleagues less frustrated. Today we have two guests from process mining company Celonis. Wil van der Aalst is their Chief Scientist and he's also a professor at RWTH Aachen University.

In fact, Wil's research into process mining and business process management has led him to be one of the most cited computer scientists in the world. Joining Wil is Cong Yu, Celonis VP of Engineering for AI and Knowledge. Cong has been driving efforts to incorporate generative AI into the Celonis process mining platform.

He was previously a Research Director at Google and an Adjunct Professor at New York University. In short, the two of them know a thing or two about process mining. Let's hear what they have to say.

Wil and Cong, thanks for joining me on the show today.

Cong: Thanks, Richie.

Wil: It's great to be here.

Richie Cotton: I think just to begin with, let's talk about what exactly process mining is. So, maybe Wil, you can start. How do you think about it?

Wil: So process mining is a technology that has been developed over the last, let's say, 25 years. It starts from event data, and the goal is to use this event data, which is available in any organization, to first show what is really happening, which is often very surprising for people. People often have an idea of what their processes are like, but then they see them based on the evidence, and that is typically very different.

And then you use, let's say, these process mining results to improve the process. So you can also identify where the bottlenecks are that you did not expect or that you would like to remove. Or where are the compliance problems? If you have enough data, you can even, let's say, predict that something bad is going to happen.

That's Cong's area of expertise. And in all cases, you would like to take action to actually improve the process. So it's a very generic technology. And I'm pretty sure that in the future, let's say, most organizations will use it for many of their processes.
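Editor's sketch: the bottleneck idea Wil mentions can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions, not the method used by any particular tool. The event log, its activity names, and the helper `average_waiting_hours` are all hypothetical; the only assumption about real data is that each event carries a case id, an activity name, and a timestamp.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, timestamp).
events = [
    ("order-1", "Create order", "2023-01-02 09:00"),
    ("order-1", "Approve order", "2023-01-02 10:00"),
    ("order-1", "Pay invoice", "2023-01-05 10:00"),
    ("order-2", "Create order", "2023-01-03 09:00"),
    ("order-2", "Approve order", "2023-01-03 11:00"),
    ("order-2", "Pay invoice", "2023-01-08 11:00"),
]


def average_waiting_hours(events):
    """Average hours between consecutive activities, per transition."""
    by_case = defaultdict(list)
    for case_id, activity, ts in events:
        by_case[case_id].append((datetime.strptime(ts, "%Y-%m-%d %H:%M"), activity))

    waits = defaultdict(list)
    for trace in by_case.values():
        trace.sort()  # order the events of one case by time
        for (t1, a1), (t2, a2) in zip(trace, trace[1:]):
            waits[(a1, a2)].append((t2 - t1).total_seconds() / 3600)

    return {pair: sum(hours) / len(hours) for pair, hours in waits.items()}


waiting = average_waiting_hours(events)
# The transition with the longest average wait is the bottleneck candidate.
bottleneck = max(waiting, key=waiting.get)
print(bottleneck, waiting[bottleneck])
```

In this toy log the slowest hand-over is between approval and payment, which is the kind of unexpected delay the discussion refers to.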

Richie Cotton: I think a lot of people will understand the idea of data mining. So, are process mining and data mining related to each other?

Wil: The difficulty is that people talk about AI, machine learning, data mining, and it's all very blurry. So I think that the best way to explain it is that process mining is a unique technology. It is not the same as data mining. It is not the same as mainstream machine learning, for example, neural networks.

The starting point for process mining, doing discovery and conformance checking, is something very different. And it is different because it is this mix of both processes and data, right? And that combination makes many of the algorithms unique. I understand it's very confusing for people, so you could say process mining is a kind of machine learning, and I would say yes, that is correct. But at the same time, it's a very different technology than, for example, deep learning or neural networks.

Richie Cotton: Okay, so it's really a specialized set of techniques to help you understand your processes and help you make changes to them. Well, since you mentioned AI, perhaps, Cong, can you tell us how AI fits into process mining?

Cong: Yeah, I think with generative AI there's a lot of potential. I think, as Wil already mentioned, there are a lot of predictive process mining capabilities that we can build in. And then there are also conversational interfaces that you can put over process mining technologies to have much better interaction between the process data and process insights and a human. Process mining, as Wil mentioned, is a somewhat complicated technology that's super important for the company. Therefore, having a convenient, conversational interface is going to be very helpful for practitioners.

Richie Cotton: Okay, so, it sounds like it's maybe early stages for incorporating AI into process mining, but there's a lot of

Cong: It's very exciting. Yes.

Richie Cotton: Alright, I'd love to talk about that more later on, but perhaps it's worth talking about how people actually make use of process mining. So, what are some of the most popular use cases of process mining at the moment?

Wil: I think many organizations start with standard financial processes. Any organization has something called purchase-to-pay and order-to-cash, the standard processes that an organization has to buy stuff and to sell stuff, right? Any organization of some size has that.

And that is where most organizations start, for the simple reason that it is completely known what kind of data is available. It is pretty standard. So you can basically plug and play and immediately do process mining. That's why many organizations start with that. We also have a lot of experience in knowing what the typical problems are in these types of processes, because we are analyzing, let's say, the same process for thousands of organizations. And, of course, you learn after a while: okay, these are the typical problems that you encounter within many companies. So that's the easy entry point.

At the same time, I would argue that for a car manufacturer or for an airline or for a hospital, these processes are not the core processes. So for an airline, the core process is to move people from A to B. If you look at the car manufacturer, the core process is to build cars that are beautiful and are very reliable, et cetera.

So what you see is that most organizations start with these standards, let's say financial administrative processes. And after having positive experiences with that, they move more to the processes that are really unique and core to the organization itself. But of course, you should make more effort then, because then your process is probably unique.

Richie Cotton: That's interesting that you suggest starting with something that's standard, like the general administration, financial stuff, and then moving on to something that's core to your business. So you go for, like, the easy stuff first, and then move on to maybe the high value but maybe a bit riskier stuff later on. Do you have any examples of companies that are heavily using process mining at the moment?

Wil: There are, of course, many organizations already using process mining. I think many people do not realize that half of the Fortune 500 companies are already using process mining. But what you also see is that, let's say, the adoption in Europe is more advanced than the adoption in the U.S.

It depends where the audience is, but in countries like the Netherlands and Germany, it is much more widely used. But if you think of, let's say, examples of companies, one could, for example, look at the car manufacturers. And I think it's safe to say that most of the car manufacturers are already using process mining.

So, for example, if you look at a company like BMW, which I know quite well myself, they are analyzing over 50 processes using process mining. And this ranges from, let's say, painting a car and these types of things, to things related to, let's say, servicing a car, to the financial things that I was talking about.

So examples are omnipresent. If you look at, for example, medical companies, they are using it; a company like Uber is using it. The examples are everywhere. So perhaps Cong would like to add some examples.

Cong: Yeah, I think there are a number of companies; for example, HP and Dell are working with us as well. And HP saved a lot of money in terms of cash flow operations. And I think on our website we probably have a list of all the companies that we can highlight as well.

Richie Cotton: Okay, that's interesting. So it sounds like this is predominantly being used by perhaps larger organizations. Is there some sort of requirement to have a large scale in order to be able to make use of process mining?

Wil: Right. The threshold to get started with process mining is, of course, lower for larger organizations. For example, we often use the 80/20 rule: if you do process mining, you find out that 80% of the things that you're handling, you are handling more or less correctly.

And for 20% there is a problem, right? There is a compliance problem, there is a performance problem, or something like that. So if you are an organization and you only have 100 cases, that means that you find 20 cases where there is a problem. However, if you are a large company, like for example Siemens, where you have millions of transactions each year, then it is, let's say, 20% of a couple of million, and you can see that the return on investment is much quicker.

And so what I expect to see is that it starts with the larger organizations. But of course, our goal is that process mining will become a kind of commodity that also, let's say, smaller enterprises will use, but that's not the logical starting point. We also have many examples of, let's say, smaller organizations applying it, but you need to compute what the return on investment of doing that is. And for a larger organization, that is much easier.
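Editor's sketch: the 80/20 arithmetic above is easy to make concrete. The helper `problem_cases` is hypothetical, and "a couple of million" is taken to mean 2 million purely for illustration.

```python
def problem_cases(total_cases, problem_share_percent=20):
    """Number of cases expected to show a compliance or performance problem."""
    return total_cases * problem_share_percent // 100


small_org = problem_cases(100)
# Assumption: "a couple of million" transactions is read as 2 million.
large_org = problem_cases(2_000_000)
print(small_org, large_org)
```

The same 20% share yields 20 problem cases for the small organization and 400,000 for the large one, which is why the return on investment arrives faster at scale.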

Cong: This is also where I think the recent trend of large language models will probably help, because process mining is a piece of enterprise software, and as with any enterprise software, there's a learning curve, and it sometimes requires setup and all these kinds of things. So in a lot of ways, large enterprises will have the capacity to do this kind of setup and onboarding.

While for smaller ones, hopefully with generative AI and large language models, these conversational interfaces will make it much easier for people to get on board. Onboarding, yeah.

Richie Cotton: Okay, that's how you get started with process mining. I'd like to talk a little bit about maybe the more technical side of process mining. So, can you give me an idea of what sort of techniques process mining encompasses?

Wil: So if you look at the different phases, if you roughly look at the different steps that you apply when you do process mining, then the first step is extracting the data from your information systems. If you're using Salesforce, Oracle, SAP, or any of these enterprise software systems, they are loaded with event data.

And the first step is always scoping what you're interested in, extracting it, and making it ready to do process mining. So that's step one. Then step two is what we call process discovery. So based on the event data, you uncover what is really happening. So you create transparency, and as I mentioned earlier, this is often shockingly different than what people expect.

And that in itself already has a value, right? Because people become aware of many problems. In the back of their minds, people had a notion that there was a problem, but it then becomes very visible and people will be eager to address it. That is step two. Step three is that you do conformance checking.

And that basically means that you have an idea of what you want to happen, and you look at what is really happening, and for the most important pain points you start setting triggers. So if a supplier, for example, often changes the price, leading to all kinds of chaos in your organization, you can detect when that is often happening for a certain supplier. That's what we call conformance checking.
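Editor's sketch: step two, process discovery, can be illustrated with a directly-follows graph, one of the simplest discovery representations in the process mining literature. This is a minimal sketch and not a description of how Celonis or any other product works. The event log, its activity names, and the function `discover_dfg` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical event log rows: (case id, activity, timestamp).
event_log = [
    ("PO-1", "Create purchase order", 1),
    ("PO-1", "Receive goods", 2),
    ("PO-1", "Pay invoice", 3),
    ("PO-2", "Create purchase order", 1),
    ("PO-2", "Change price", 2),
    ("PO-2", "Receive goods", 3),
    ("PO-2", "Pay invoice", 4),
]


def discover_dfg(event_log):
    """Count how often one activity is directly followed by another."""
    traces = defaultdict(list)
    # Group events per case, ordered by timestamp within each case.
    for case_id, activity, _ts in sorted(event_log, key=lambda e: (e[0], e[2])):
        traces[case_id].append(activity)

    dfg = Counter()
    for activities in traces.values():
        dfg.update(zip(activities, activities[1:]))
    return dfg


dfg = discover_dfg(event_log)
for (source, target), count in dfg.most_common():
    print(f"{source} -> {target}: {count}")
```

Even this toy log surfaces an unplanned "Change price" step in one case, which is the kind of transparency the discovery step is meant to create.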

Richie Cotton: So this sounds like you might have two problems then. Either your process is stupid to begin with, or you have a sensible process, but people aren't actually complying with that process. Is that what I should understand by conformance checking, those two different cases?

Wil: When we talk about processes, indeed, there is the process that people expect to happen or want to happen, and there is reality. And conformance checking tries to see exactly where the biggest differences are. And there are many deviations that are pretty harmless, right? You don't have to worry about them.

But there may also be, let's say, deviations that are causing incredible delays or are very risky from a compliance point of view. And if you have a lot of data, then you can even go one step further and predict that there is going to be a problem. That is a use case that I think at this point in time is only interesting for companies that have relatively stable processes and are able to precisely describe what they want to happen in all cases.

The final step is that you actually change the process because it's very important that you should not stop at diagnostics and detecting that there is a problem or predicting that there is a problem. It's very important to take the action. And that is the reason why many of our customers are earning a lot of money because they turn these insights into actions.

And that is something that is crucial. So these are roughly the steps. I'm not sure whether Cong would like to add something to that.
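Editor's sketch: conformance checking compares what should happen with what did happen. The sketch below uses a deliberately simple stand-in, a set of allowed directly-follows steps, rather than the token replay or alignment techniques found in the research literature. The reference model `ALLOWED`, the traces, and the function `deviations` are all hypothetical.

```python
# Hypothetical reference model: the directly-follows steps we want to see.
ALLOWED = {
    ("Create purchase order", "Receive goods"),
    ("Receive goods", "Pay invoice"),
}

# Hypothetical observed traces, one list of activities per case.
traces = {
    "PO-1": ["Create purchase order", "Receive goods", "Pay invoice"],
    "PO-2": ["Create purchase order", "Change price", "Receive goods", "Pay invoice"],
}


def deviations(traces, allowed):
    """Return, per case, the observed steps that the reference model does not allow."""
    found = {}
    for case_id, activities in traces.items():
        bad_steps = [
            step for step in zip(activities, activities[1:]) if step not in allowed
        ]
        if bad_steps:
            found[case_id] = bad_steps
    return found


result = deviations(traces, ALLOWED)
for case_id, bad_steps in result.items():
    print(case_id, bad_steps)
```

Whether a flagged deviation is harmless or costly is a business judgment; the check only makes the difference visible so that someone can act on it.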

Cong: I want to say that processes are not necessarily stupid. Most people design the process toward certain desired outcomes, and it's just that reality is much more complicated. You may have an approval process in place, but suddenly a customer comes in over the July 4th weekend and something has to be done, otherwise you're losing the customer.

You override some of the stuff in order to make things happen. So, you know, a lot of people are trying to do the right thing, and the complicated reality is that you have a lot more process going on, and then things do deviate from the original design. And the beauty of process mining is the ability to capture these and really understand whether some of the deviations should probably become part of the process from now on, as you understand them.

Some of the deviations are not necessarily correctly done, therefore you need to fix them. So I think from that perspective it's actually very helpful. Even if you have very good process modeling capabilities, process mining is still going to be very useful as time goes on. And following that, I think after conformance checking, we have this notion of so-called process observability.

The important thing is to basically put some of these things that you are very interested in looking at and monitoring into dashboards through our EMS system, so people can observe them on a regular basis. And then, as Wil said, as you observe them, you design actions, you design certain things to improve your process, save money, and be more efficient as you run through the process.

Richie Cotton: Perhaps you can tell me a bit more about the predictions. Maybe Cong, do you want to take this? With a standard machine learning problem, usually the output you get from it is, oh, I've made a prediction of something. You mentioned that sometimes you want to make predictions about processes, but in general, what is the sort of output from a process mining analysis?

Cong: I can give some very concrete examples. We use the prediction capability pretty broadly. And it's usually trying to predict that something bad is going to happen and alert you ahead of time so you can do something correspondingly. Some of them are actually supervised machine learning.

Some of them are actually unsupervised learning. For example, we have this capability called the Duplicate Invoice Checker. What it's trying to do is basically look through the invoices you're paying and understand which two invoices are actually duplicates of each other. If they're identical, it's very easy, but most of the time they're not identical.

There's a different name, a slight typo in the supplier name, a slightly different amount, a slightly different date, all these kinds of things, right? So internally, we have built a duplicate invoice checker using so-called clustering and machine learning based matching functions in order to detect those duplicates.

Now, if you imagine, it's a lot easier to prevent the duplicate invoice from being paid than to chase the money back after you've paid it. So this is where the predictability comes in. As the invoice comes through, you basically predict, okay, this looks like a duplicate, therefore you want to put a block on it, or you slow its payment.

Then you go ahead and examine them before you really pay them. And that's one example. Here we don't really use so-called supervised machine learning. It's more unsupervised matching functions and similarity functions. Such a model can already help a lot over there. But there are other, more predictive cases.

For example, you want to predict whether this invoice is going to be paid late and you're going to miss the deadline. Or your supplier is going to pay you late and they're going to miss the deadline. So all these kinds of things are basically more predictive. You have to look at the past behavior of the supplier and the behavior of similar invoices with similar amounts, and there you have more sophisticated supervised learning capability.

But when we do supervised learning, we don't mix the data from different clients. We basically protect the proprietary data and the privacy of our clients, making sure that no leakage is happening. So we actually have to tailor the model specifically to the client's information.
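Editor's sketch: the near-duplicate matching Cong describes can be approximated with simple similarity functions. This is not the Celonis Duplicate Invoice Checker; it is a minimal illustration using Python's standard `difflib.SequenceMatcher`. The invoices, the function `looks_like_duplicate`, and every threshold are hypothetical assumptions.

```python
from datetime import date
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical invoices; INV-2 has a typo in the supplier name,
# a slightly different amount, and a date one day later.
invoices = [
    {"id": "INV-1", "supplier": "Acme Industrial GmbH", "amount": 1200.00, "date": date(2023, 3, 1)},
    {"id": "INV-2", "supplier": "Acme Industrail GmbH", "amount": 1200.50, "date": date(2023, 3, 2)},
    {"id": "INV-3", "supplier": "Northwind Traders", "amount": 310.00, "date": date(2023, 3, 1)},
]


def looks_like_duplicate(a, b, name_threshold=0.9, amount_tolerance=0.01, max_days=7):
    """Flag two invoices as likely duplicates (all thresholds are assumptions)."""
    name_similarity = SequenceMatcher(
        None, a["supplier"].lower(), b["supplier"].lower()
    ).ratio()
    amount_gap = abs(a["amount"] - b["amount"]) / max(a["amount"], b["amount"])
    day_gap = abs((a["date"] - b["date"]).days)
    return (
        name_similarity >= name_threshold
        and amount_gap <= amount_tolerance
        and day_gap <= max_days
    )


# Compare every pair; flagged pairs would be blocked for review before payment.
candidates = [
    (a["id"], b["id"])
    for a, b in combinations(invoices, 2)
    if looks_like_duplicate(a, b)
]
print(candidates)
```

Pairwise comparison grows quadratically with the number of invoices, so a production system would need some form of grouping or clustering first, which is consistent with the clustering step mentioned above.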

Richie Cotton: So it seems like even fairly simple but common tasks, like checking whether an invoice is a duplicate or not, once you do them at that scale, give you a chance to really improve the efficiency of your financial setup and things like that.

Cong: Millions of dollars, easily.

Richie Cotton: That's a great way of saving millions of dollars. Excellent. In terms of tooling, you're both strongly affiliated with Celonis, but can you maybe talk about what tools people use for process mining?

Wil: Yeah, so if you look at the market, it's very interesting that Gartner released the so-called Magic Quadrant earlier this year, identifying process mining as a separate category of tools, right? And I've been working in the field for many years, and I'm very happy that it is now really recognized as a separate, let's say, category of tools.

If you look at the market as a whole, I think there are about 40 companies commercially offering process mining capabilities. So there are, let's say, quite some companies offering this. However, if you look at the market share, let's say Celonis is by far the largest, right?

It depends a bit on how you count, but I think everybody agrees that more than half of the process mining customers are using Celonis. I think it's also important to differentiate when you talk about process mining. There are, let's say, process mining tools that focus more on the specialist, the data scientist who is using it as a tool.

And there are, let's say, what we call enterprise-level process mining tools that are broader, where the goal is also that many people in the organization are consuming the process mining results. Earlier, when I was talking about conformance checking, I was basically talking about making visible the differences between what you want to happen and what really happens.

And it is very important to make these things visible at a large scale. So we have several customers where thousands of users inside the organization are looking at these results. And this is key because you can only, let's say, change behavior and you can only change processes if people are aware of the problems.

So I think, to answer your question, there are many offerings, but they are not necessarily comparable to each other.

Richie Cotton: So we'll maybe take the enterprise and the individual data scientist cases separately. I presume your event data is going to live in a data warehouse or something like that? So is Celonis going to hook up to that? I'm trying to work out how it fits in with the broader software ecosystem.

So, if I've got my data in a data warehouse, what happens?

Wil: You can think of process mining as a layer on top of existing applications. There is data in all the different source systems. An organization of some size has 200 different information systems: Salesforce, SAP, Oracle, et cetera, et cetera. And most of the process problems are not inside a single system.

Most of the process problems are, let's say, at the boundaries of all of these different systems. So that's why for us it is very important to extract these event data from the source systems and bring them into a shape in which you can actually use them in a meaningful way. And this is often, let's say, the part where most of the effort goes.

After that, process mining does its job automatically. But bringing the data into the right shape is something that is very important. If you think of classical data warehousing systems, they mostly work with numbers, if I can explain it like that. So they are counting how many items did I sell in this month, et cetera, et cetera.

If you look at the event data, it is slightly different, right? Because these are also the steps in the production process or, I don't know, the treatment of COVID patients, where we record what kind of medicine a patient gets every day, et cetera, et cetera. And this is different data than the data typically found in the very idealized data warehousing environment.

Cong: I think earlier we talked about the difference between process mining and data mining. When I joined the company a year ago, the most interesting thing to me was that process mining is really connected to operations, the physical world of the company. There's a lot of process going on inside and outside the data system.

It's super interesting, and that's why I think we are in the first innings of process mining. Because once you start bringing together both the data sitting inside the ERP and the data warehouse and the data from the factory, from all these sensors, all of this combined, it's amazing how efficient you can make corporations and enterprises. To me, it's like the creation of the assembly line, this kind of thing.

Richie Cotton: Yeah, certainly improving efficiency is something that's top of mind for a lot of businesses right now. So it does seem very timely to be talking about this. One last question on tooling before we move on. We've been talking about Celonis and the enterprise use case; what about individual data scientists?

I know we have a lot of Python users and a lot of R users in the audience. So, how might they go about getting started with process mining?

Cong: Yeah, I'm actually a very new user. Wil has a lot more knowledge about this. I think Python is certainly helpful. If you want to do machine learning on top of the process data and all those things, I think it's Python. In Celonis, we offer a so-called Machine Learning Workbench as well, to help people run Python scripts, right?

We also internally have this SQL-like language called Process Query Language (PQL), which provides a lot of capabilities and functionality for you to interact with the process data in a much easier way, comparable to SQL. I think we have a whole website section dedicated to educating people about PQL, which can definitely be a good starting point to look into.

I know in Europe, Wil has classes teaching people about process mining as well as these languages, and yeah.

Wil: So perhaps to add to that, if you look at the scientific world, let's say ProM is one of the, let's say, standard tools that people use. It is also a tool with a UI. I would say it is not intended to be industrial strength, but more a playground for all kinds of algorithms, et cetera, et cetera.

Of course, what we have seen in the last five years or so is that many young people are using Python. So we also developed something called PM4Py, and that's a Python library where you can apply the basic algorithms. It is also possible, let's say, from Celonis to call these types of routines, basically like any Python library.

However, I think it's very important to distinguish between, let's say, the enterprise situation and the educational situation, because they are very different. So if you are dealing with, I don't know, billions of events, I'd say most of the open source stuff does not work anymore. And I think even if you look at well-known cases like, let's say, purchase-to-pay and order-to-cash and all of these standard processes, people typically underestimate the complexity of the data models behind them.

This is not a simple flat file. You're dealing with suppliers, with customers, with orders, with items, with invoices, et cetera, et cetera. So this is typically, let's say, quite complicated.

As such, there are these open source and Python alternatives to get in touch with the technology and to really understand it, but if you go to the enterprise use case, you typically need to start using commercial software.

Richie Cotton: The message I'm getting is that it actually requires quite a lot of data engineering to get started with this, because your data for any given process is going to live in several different places and you've got to find a way to bring it all together to start doing the analysis.

Wil: That's also why I indicated that many organizations start with the standard processes, where we can really help people to get started super quickly, because then we know what tables to look at, etc. Just imagine that you are ordering something from Amazon or something like that. You have dragged and dropped some articles and you say, now I buy.

You should imagine that the moment you push the button, okay, now I pay, many different tables in the information system are being updated just by that single action. The table with customer information is being updated. But you also have an order. The order consists of multiple items. Of the items that you have ordered, some may be in stock, some may not be in stock, some are sold directly by Amazon, et cetera, et cetera.

So behind the scenes, lots of things are happening. And for the known use cases, this is pretty standard, but if you go to a new application domain, this is more involved.

Cong: Also, I guess in a typical enterprise setup there are usually multiple personas as well. As Wil was saying, with the commercial software, you can have the analysts and data scientists setting things up, and then the business users can just interrogate the data in a much more intuitive and straightforward way.

And for Celonis, for example, we have this product called Business Miner, where you don't really need to be a data scientist. You just go in, you answer a few questions, and boom, you get a dashboard that is very easy to monitor.
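Editor's sketch: the point about a single purchase touching several tables can be illustrated by flattening those tables into one event log, which is the data preparation step that typically precedes any analysis. The tables, column names, activity labels, and the function `build_event_log` are all hypothetical; real ERP schemas are far more involved.

```python
# Hypothetical source tables, each updated by the same "buy now" click.
orders = [{"order_id": "O-1", "customer": "C-9", "created_at": "2023-05-01T10:00"}]
order_items = [
    {"order_id": "O-1", "item": "book", "reserved_at": "2023-05-01T10:01"},
    {"order_id": "O-1", "item": "lamp", "reserved_at": "2023-05-01T10:02"},
]
payments = [{"order_id": "O-1", "paid_at": "2023-05-01T10:05"}]


def build_event_log(orders, order_items, payments):
    """Turn rows from several tables into (case id, activity, timestamp) events."""
    events = []
    for row in orders:
        events.append((row["order_id"], "Create order", row["created_at"]))
    for row in order_items:
        events.append((row["order_id"], f"Reserve item ({row['item']})", row["reserved_at"]))
    for row in payments:
        events.append((row["order_id"], "Receive payment", row["paid_at"]))
    # ISO 8601 timestamps sort correctly as plain strings.
    return sorted(events, key=lambda e: (e[0], e[2]))


event_log = build_event_log(orders, order_items, payments)
for event in event_log:
    print(event)
```

Here the order id is used as the case id. Choosing a different case notion, such as the item or the customer, would produce a different view of the same process, which is part of why the data model is easy to underestimate.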

Richie Cotton: So perhaps for any organizations wanting to get started, you mentioned that like the financial cases are easy. I guess like all the data is going to be in Salesforce or some equivalent. So maybe it's just like one or two sources to deal with rather than the process being spread over lots of different cases. so, who needs to be involved in adopting process mining? Like what are the kind of roles or teams that need to be involved in like your first Process mining project. Kong, do you want to go first this time?

Cong: Sure, I can give it a try and jump in here. I think the first thing is you need to have commitment from the company's executive leadership to improving processes and looking for efficiency. Without that, it's probably difficult. Once you have that support, you need someone like a data scientist or analyst with a technical mindset who is willing to work in that persona, in order to start with some use cases and drive them through. Then, after the initial setup, I think you probably want to have stakeholder buy-in from different lines of business.

For example, we can monitor the AP, or accounts payable, process. Then whoever is the line-of-business owner, the CFOs or financial directors, needs to be monitoring it and be bought into the whole thing. So, Wil, do you have anything to add?

Wil: I think if you look at research, what you see is that the organizations that are most effective in applying process mining build this kind of center of excellence. You need people who have the technical skills to work on the data pipeline, et cetera, et cetera.

You also need people who have domain knowledge, et cetera, et cetera. So typically it is best to have a center of expertise in a larger organization, because that also helps you to scale. One of the things that I think is super important if you do process mining is that you don't want to do process mining for a single department or a single process.

That is the dumbest thing that you can do. I sometimes compare it to making the weather forecast. If I want to invest in making a weather forecast, I'm not going to make it for a single day in a single city. If I want to make weather forecasts and it has to pay off, I need to do it for all the cities, every day.

And the same applies to process mining. So you need to have this center of expertise. It does not need to be huge, but it needs to have the capabilities such that you will not apply it in one process; you will apply it in 50 processes continuously, every day. I think this is very important.

And then, getting back to the personas that Cong mentioned before, suppose that you have, I don't know, a center of expertise of, let's say, 5 to 10 people. They can have an incredible impact on the whole organization, because after they have done their work, there are thousands of people that may be consuming these process mining results every day, right?

And sometimes these improvements are even made automatically, without people really knowing that behind the scenes process mining is, let's say, removing certain inefficiencies that are known.

Cong: Another thing I think I forgot to mention is that because process mining software typically interacts with the ERP system and various other systems, it is also important to get buy-in from the CIO and the IT department. Sometimes the center of excellence we keep talking about is part of the CIO's IT department, and then things can go smoothly.

If it is not, then we do need to align very closely with the IT department as well.

Richie Cotton: Ah, so the assumption is that this would live within an IT department, maybe, rather than under the Chief Data Officer.

Cong: I think that depends on the enterprise. For some companies the center of excellence lives under a chief data officer or even directly under the CEO's purview. Some others sit under the CIO. I actually don't know the percentages; Wil will probably have a better idea.

Wil: I also feel that organizations have been relabeling things a lot in recent years, so I think it's very difficult to put a number on it. In general, process mining is of course on the interface of IT, data, and business, right? And that makes it very difficult to pinpoint one specific location.

It's also why top level management support is super important to get this going.

Cong: And also, I want to say, like Wil said, you can start with one process, one department. The idea is that you try it out on a few small or standard processes and see the benefit of it, and then you can grow. It's not like every time you come in, you have to come in with 50 processes.

I think you can always start small to see the benefit.

Richie Cotton: Yeah, so I was actually wanting to pick up on that point. I can see how there's definitely a benefit to dealing with lots of processes at once, because if you change one process and optimize it, then maybe it's just going to cause a bottleneck somewhere else. But how do you go about scaling then?

Have you got to go from one to 50 processes pretty quickly, or can you start with one, see how that works, and gradually build it up?

Wil: I think it's very important to first understand the technology, and for that it's best to start with a fairly straightforward process. But at the same time, I think it's very important to have the ambition to grow, because the benefits really come with scale. The speed at which you would like to do that, I think, very much depends on the organization. Of course, most organizations also first want to see evidence that it really works before they, let's say, invest more money. I think that's also perfectly normal.

Cong: I think Richie was asking about how you go from one process to another. This is actually a very nice segue into object-centric process mining. Before the introduction of object-centric process mining, it was indeed somewhat challenging to go from one process to another, because all the processes were handled independently.

With object-centric process mining, you don't deal with the process. You deal with business objects like an invoice. An invoice can go through many processes (order to cash, accounts payable, accounts receivable), and these objects actually connect all these different processes together. Now, with object-centric process mining, you do a lot of object modeling up front.

Once you have done the first one or two, extending to the rest of the processes is actually much easier than before. I think Wil has been talking about object-centric process mining for a long time, so I'll let him elaborate on the benefit of



Wil: So, to clarify it for the people who are listening: the traditional setting of process mining is that you focus on a particular process that you want to improve, and a process is typically defined by something that you're handling, which we call a case. So you can track one production order, you can track one shipment, you can track one COVID patient, whatever setting you have.

That's the classical setting: all the events that you collect focus on a single case notion. That leads to the situation that if you're doing process mining in the classical sense, for example in the BMW setting where you have 50 processes, you start extracting data from your source systems 50 times.

And each time you extract it from the viewpoint of one particular process that you would like to improve and understand, to see the problems there. If you do object-centric process mining, you take a more holistic approach. You don't extract the data with the idea that you want to analyze this particular process

or track this particular case; instead, you extract the data looking at basically all the events and all the objects that are in scope. For example, we are now in this meeting, so there are four persons involved. We could think of many more objects involved in this.

In the classical setting, we would follow either the process of Cong, the process of Wil, or the process of Richie. But now we look at all of this in a holistic way, capturing the events and the objects as they really are. After we have extracted the data, we start picking: okay, we would now like to analyze this process from this particular viewpoint.

And this is, let's say, a super powerful mind shift that is now ongoing. I've been doing research in this area for a very long time, and Celonis is the first commercial vendor really embracing this notion, which I think will fundamentally change the way that we talk about processes and do process mining.
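
To make the contrast between the classical and the object-centric setting a bit more concrete, here is a minimal Python sketch. It is only an illustration: the event data, the field names, and the `flatten` helper are hypothetical and are not taken from Celonis or any other tool. Each event refers to several objects at once, and a case-centric view is only chosen afterwards by projecting the same log onto one object type.

```python
from collections import defaultdict

# Hypothetical object-centric event log: every event can refer to several
# objects of different types (all names and values are illustrative).
events = [
    {"id": "e1", "activity": "place order", "time": 1,
     "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
    {"id": "e2", "activity": "pick item", "time": 2,
     "objects": {"item": ["i1"]}},
    {"id": "e3", "activity": "pick item", "time": 3,
     "objects": {"item": ["i2"]}},
    {"id": "e4", "activity": "ship package", "time": 4,
     "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
]


def flatten(events, object_type):
    """Project the log onto one object type: one trace per object."""
    traces = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        for obj in event["objects"].get(object_type, []):
            traces[obj].append(event["activity"])
    return dict(traces)


# The same extracted data, viewed from two different viewpoints.
order_view = flatten(events, "order")
item_view = flatten(events, "item")
print(order_view)
print(item_view)
```

The data is extracted once, and the choice of viewpoint (orders or items here) is made after extraction rather than before it.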

Richie Cotton: So, this sounds like, it's really gonna help you understand the interactions between processes rather than just looking at individual


Wil: Think of a customer that places an order for something that needs to be assembled to order. So the customer places an order and now things need to be assembled. Of the things that are being assembled, some may be in stock and others are not, and then you need to procure them. So one problem in procurement may in the end influence, let's say, something at the other side of the organization.

Similarly, procurement problems may lead to higher shipment costs, because you start shipping things in smaller batches, et cetera, et cetera. So there are all of these interdependencies that you can only see if you have this more holistic view that I was just describing.

Richie Cotton: Okay. I'd like to talk a bit about the rewards from doing process mining. Do you have any sense of how you go about measuring the benefits from examining and updating your processes?

Wil: If you look at our website, you will see many examples. I'm always a bit hesitant, in the sense that it's much better that our customers themselves say how much they save. But we have many examples where customers are really saving millions of euros or dollars each year simply by applying process mining.

And as I said, these large numbers often come with large scale. These are larger organizations and, as I mentioned before, if 20 percent of your cases have certain problems and you can remove them, that is incredibly powerful.

Cong: Yeah, I think there are always two kinds of savings that you can think of. One kind is very quantifiable, meaning you know how much money you're saving, like duplicate invoice detection, right? For each duplicate invoice that is detected, you put on a payment block and you get the money back.

With these values combined, you know very clearly how much return on investment you're already getting just by doing that. There are also other improvements that are more at the KPI level, not necessarily a dollar amount. For example, using process mining, you can improve your customer satisfaction

if you look at the customer support flow, or you may be able to improve your working capital by 20 percent or something like that, right? These don't necessarily translate into immediate bottom-line savings directly, but in most enterprises there's a conversion rate that the CFOs and CEOs use: okay, if I improve working capital by 20 percent, how much does that actually translate into bottom-line savings?

And if I improve customer satisfaction by 10 percent, how much will that translate into top-line growth in the future? As Wil was saying, for the second kind we typically let the customers tell us and decide, instead of tooting our own horn. But for the first kind, we know exactly how much money we are saving the customers.
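
The duplicate invoice example lends itself to a small sketch. The Python below is a simplified illustration under stated assumptions: the invoice records, the field names, and the matching rule (same vendor, reference, and amount) are all hypothetical, and real detection logic is typically fuzzier than an exact match.

```python
from collections import defaultdict

# Hypothetical invoice records; field names are assumptions for illustration.
invoices = [
    {"invoice_id": "A-100", "vendor": "ACME", "reference": "INV-7781", "amount": 1250.00},
    {"invoice_id": "A-101", "vendor": "ACME", "reference": "inv-7781 ", "amount": 1250.00},
    {"invoice_id": "A-102", "vendor": "Globex", "reference": "G-0042", "amount": 310.50},
]


def find_duplicates(invoices):
    """Group invoices that share vendor, normalized reference, and amount."""
    groups = defaultdict(list)
    for inv in invoices:
        key = (inv["vendor"], inv["reference"].strip().upper(), round(inv["amount"], 2))
        groups[key].append(inv["invoice_id"])
    # Only groups with more than one invoice are suspected duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


duplicates = find_duplicates(invoices)
# Every invoice after the first in a group is a candidate for a payment block.
to_block = [inv_id for ids in duplicates for inv_id in ids[1:]]
print(duplicates, to_block)
```

Summing the amounts of the blocked invoices would give the directly quantifiable kind of saving described above.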

Wil: If I may add two things to that, because I think it's an important point. The first point is that the benefits are clearly quantifiable. At the same time, we enjoy working with organizations where people feel a responsibility that the processes are running correctly. If you think about it, it's astonishing how many organizations are paying invoices twice.

It's really astonishing. If you are a customer and you go to a restaurant, you would never pay twice, right? And if somebody said, yeah, you're paying twice, you would not ask what the business case is. No, you don't want to pay twice. In the organizational setting, I think if you have your processes under control, you don't need a business case to avoid paying things multiple times; it's very obvious. I can also mention one other example that everybody can imagine, which shows a benefit that is the result of having better operations and that immediately impacts customers as well. For example, Lufthansa is using process mining for their standard processes, but they also use it to, let's say, minimize delays.

If you look at Frankfurt airport and Munich airport, the large hubs of Lufthansa, process mining is being used there to analyze in real time what the causes of delay are. You would be astonished that between the moment the plane lands and the moment it takes off again, let's say approximately 80 unique events are being recorded. You should think about things like fueling started, fueling completed,

baggage unloading started, baggage unloading completed, gate open, et cetera, et cetera. So you see all of these things. I find it fascinating because, if you think about it, in one hour many things need to happen concurrently. And if one of these things fails, if the cleaning crew is not there, the whole flight gets delayed.

And on European flights, it may be that the plane is making, let's say, eight legs in one day. If you have a problem in the morning, you have a problem everywhere. I think it's a nice example of something that you cannot immediately put into money, but where it is clear that because of process mining you can really reduce the number of delays that planes have

by really analyzing what the root causes are in real time. And I think it's a very nice example next to the financial things, because when people stand in an airport, they realize that processes need to work. You typically only become aware that there is a process if it doesn't work. If it runs well, you're not.
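
The started/completed event pairs mentioned for aircraft turnaround can be illustrated with a short Python sketch. The events, the timings, and the `summarize` helper are invented for illustration and do not describe Lufthansa's actual data; the point is only that paired lifecycle events give a duration per activity and show which concurrent activity finishes last.

```python
# Hypothetical turnaround events for one aircraft:
# (activity, lifecycle, minutes after landing). Values are illustrative.
turnaround = [
    ("baggage unloading", "start", 5),
    ("fueling", "start", 10),
    ("baggage unloading", "complete", 20),
    ("cleaning", "start", 25),
    ("fueling", "complete", 30),
    ("cleaning", "complete", 50),
]


def summarize(events):
    """Pair start and complete events to get durations and finish times."""
    starts, durations, finished_at = {}, {}, {}
    for activity, lifecycle, minute in events:
        if lifecycle == "start":
            starts[activity] = minute
        elif lifecycle == "complete" and activity in starts:
            durations[activity] = minute - starts[activity]
            finished_at[activity] = minute
    return durations, finished_at


durations, finished_at = summarize(turnaround)
# The activity that completes last is the one holding up departure
# in this toy example.
last_to_finish = max(finished_at, key=finished_at.get)
print(durations, last_to_finish)
```

In this toy data the activities overlap in time, and a late start of the cleaning step is what pushes out the end of the turnaround.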

Cong: Yeah, here's one concrete example. With Johnson, we used process mining to look at their order delays, and at the end of the project we actually realized a 20 to 30 percent reduction in throughput time and a 40 percent reduction in price changes. So these are very concrete cases

in which enterprise customers have been benefiting from process mining.

Richie Cotton: I mean, those are both great examples. I think a lot of people are going to be wishing that airlines did a bit more process mining to make sure that their vacation flights are on time. And certainly that does seem like a big enough improvement in the process that it is saving real money for Johnson.

So that's a real competitive advantage there. In terms of the timescales for a payoff, I know that often when you have a new idea, management is a bit antsy about making sure there's a quick payoff. What sort of timescale are we looking at from adopting process mining to seeing those improvements?

Wil: If you want to get started with process mining, your first project will typically take a bit more time to get going. But after that, rolling out to the next processes goes much faster. So to roll out process mining, you should think in terms of weeks to get it running.

Then people need to get used to it, but basically the results are there immediately. We rarely encounter processes where everything is perfect, right? If everything is perfect, you don't have to do process mining. So you typically start seeing improvements pretty quickly.

But you need to realize that process mining is a continuous effort. If process mining is running year after year, you're saving, let's say, money year after year. Because if you stopped doing process mining, the unhealthy habits in the process would probably slip back

in again, et cetera, et cetera. So it's not like you now make one big saving. It's something that is continuous, and the longer you use it, the more you save.

Cong: We have clients that have been working with us for five-plus, six-plus years. I don't know what the longest is; Wil will probably have some idea. But I think it's very common that once clients get a taste of how process mining is helping them monitor their processes and improve the efficiency of their operations, they really want to apply it to more processes and keep using it.

Richie Cotton: Okay. So maybe a few weeks to months to get started, and then something you're going to work on over a period of several years. So it does seem like the main point of this is to keep improving processes over and over again. How do you build up that feedback loop to make sure that processes do keep improving, so that you don't just do the analysis and then it goes nowhere?

Wil: So there are the two challenges that we spoke about. One challenge, which we have talked about a lot, is, let's say, getting the data into shape to do process mining. Assume that one has done that. Then the process mining technology itself does what it should do. It always works.

It's a proven technology. But then, of course, the last part is: how are you going to implement the changes the moment that you see these inefficiencies? That's something that's very important. Some of these improvements are basically manual. You see that there are lots of inefficiencies in a certain area.

You give the feedback and people get the work instructions. And then hopefully it works by itself, but you continuously need to monitor it. So the feedback loop there is that, let's say, every week or every month you look at the actual data. And if you see that people start working in the old way again, you need to correct that.

With the more, let's say, mature customers that we have, this feedback loop is more or less automatic, in the following sense: for the known inefficiencies and the known problems that we frequently see, we set up workflows, so that process mining automatically triggers workflows

that interact with the source systems to automatically overcome these inefficiencies. That's why I said earlier that you can think of process mining as a layer on top of existing systems. For example, we have a component in Celonis that's called Action Flows. It comes from a company that we bought, Integromat, which is now called Make, and with these workflows we can interact with over 1,000

other information systems. SAP and Salesforce are, for example, two of these 1,000-plus systems that we can interact with. And of course, it is super powerful that the moment you see certain known inefficiencies, you immediately respond in the source system to make sure that it doesn't happen again.

And then it becomes something automatic. Of course, what you see is that if you think about process improvement, you always start with the low-hanging fruit. That is relatively easy. After you have that under control, you start looking at the more advanced things. Similarly, in the beginning, let's say, predictive analytics are probably not so important, right?

If you have lots of inefficiencies and obvious problems, you should first address those. But as you get more mature, you also go to more advanced types of analytics.

Richie Cotton: And it seems like there's an aspect of change management here, especially when you've got lots of different teams working on a particular process. Do you have any advice for how you can handle the people side of things and manage across different teams and roles?

Wil: So here it is super important, as I think we've mentioned before, that there is top-level support for doing process mining, right? Because if you find problems, you need to be able to address them, and there may be all kinds of, let's say, internal politics involved. So change management is super important, and buy-in from higher-level management is also important.

On the other hand, the power of process mining is that it is not a PowerPoint. If people look at a PowerPoint, they can question it. If you look at process mining and you see a certain inefficiency or certain compliance problems, people cannot deny it. If you see, for example, that a check is often bypassed, you can simply click on that line and see exactly the cases that had this undesirable behavior, et cetera, et cetera.

And I think in change management it is super important to be able to present results that are undeniable. If people can question the accuracy of the statements that one is making, nothing will change. But you can always drill down and show: okay, this is not some PowerPoint diagnostics.

You can really look at the cases that had this behavior and what they were costing. I think that really helps change management. But in the end it's important that people are willing to change in that setting.
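
The drill-down from "a check is often bypassed" to the exact cases involved can be sketched in a few lines of Python. This is a hedged illustration, not how any particular product implements it: the event log, the activity names, and the `bypassed` helper are hypothetical.

```python
# Hypothetical case-centric event log: case id -> ordered list of activities.
log = {
    "PO-1": ["create order", "approve order", "pay invoice"],
    "PO-2": ["create order", "pay invoice"],
    "PO-3": ["create order", "approve order", "pay invoice"],
    "PO-4": ["create order", "pay invoice", "approve order"],
}


def bypassed(log, check, guarded):
    """Return case ids where `guarded` happens without `check` before it."""
    offenders = []
    for case_id, trace in log.items():
        if guarded not in trace:
            continue
        first_guarded = trace.index(guarded)
        # The check must occur somewhere before the guarded activity.
        if check not in trace[:first_guarded]:
            offenders.append(case_id)
    return offenders


offending_cases = bypassed(log, "approve order", "pay invoice")
print(offending_cases)
```

Because the result is a list of concrete case identifiers rather than an aggregate number, anyone questioning the finding can inspect the individual cases.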

Cong: On a longer time scale, and it's even happening already in Europe, we believe process mining is becoming an indispensable tool, just like the quarterly business review and revenue prediction that you need to do periodically. CFOs and even CEOs will have to take a

periodic look at those process mining observability results, understand how efficient the company's operations are, and figure out ways to make them more efficient. I think Celonis has been working with a lot of our European customers, and they're already getting there. In the U.S., it's just picking up.

Richie Cotton: All right. So, before we wrap up can you tell me what are you most excited about in the world of process mining at the moment?

Cong: I have a lot of things I'm very excited about. Otherwise, I wouldn't have joined this company a year ago. One of the things that I think is super, super interesting, and that we're discussing internally, is that with generative AI and large language models, there's going to be a lot of transformation happening

in companies' processes, right? Think about all of these AP clerks entering GL codes on your invoices. All of that is going to be disrupted by large language models and modern machine learning. During this transformation, it's very important for enterprises to monitor how effective these transformations are: are they actually improving the KPIs you care about? From that perspective, process mining is your observability tool. You keep monitoring your processes, and you understand how efficient and effective they have become through this AI transformation. None of the other tools you might think of, like data mining or BI, is nearly at process mining's maturity in terms of observing your processes.

So, especially at this time, I think a lot of CEOs are thinking about how AI is going to impact their business and operations, and it's the right time to pick up process mining and really use it to monitor that. I think this is probably one of the most exciting things I'm thinking about. Of course, there are others as well.

Wil: For me, the whole area of process mining is super exciting. I started working on this in the late 90s as a research project, and now it's a complete industry. So there are many developments that have happened that I find super exciting. But on the technical side, I have talked about, let's say, the two innovations that I find most exciting at this point in time.

One was already mentioned: object-centric process mining. It requires us to basically reinvent existing process mining techniques and bring them to another level, and that is super exciting. We are working on that both on the research side and on the company side.

We are working on that with many people. All the people in the field see this as one of the breakthroughs of, let's say, the last 10 years, and it is happening now. So that's super exciting. Another thing that I find super exciting is that I think many organizations are struggling with applying machine learning in a business setting.

And I think process mining is the technology to help lower the threshold for organizations to start using machine learning in a meaningful way. I expect that many organizations will get disappointed because they have unrealistic ideas of everything that they can do. And I think process mining will prove to be something that is very down to earth,

that will help you to generate the machine learning problems where, let's say, you get really reliable results. So I think these are, for me, the two things that I will enjoy working on in the next couple of years.

Richie Cotton: Excellent. It does really feel like process mining is just about ready to hit the mainstream and get much wider adoption. All right. Brilliant. With that, thank you both for your time. It's been great having you both on the show.

Cong: Thanks, Richie. Enjoyed the conversation very much.

Wil: Thanks very much. It was great.


