Although you might not have realized it, processes play an indispensable role in our daily lives. Your actions and those of others generate an extensive amount of data. Whether you order a book, a train passes a red light, or your thermostat heats your bathroom, millions of events take place every second and are stored in data centers around the world. These enormous sets of event data can be used to gain insight into processes in a virtually unlimited range of fields. However, analyzing this data requires its own specific formats and techniques. This course will introduce you to process mining with R and demonstrate the different steps needed to analyze business processes.
The amount of event data has grown enormously over the last decades. A considerable amount of this data is recorded within the context of business processes. In this chapter, you will discover a methodology for analyzing process data, consisting of three stages: extraction, processing and analysis. You will have your first encounter with the specific elements of process data that are required for analysis, and take a first deep dive into the world of activities and traces, which will allow you to reveal a first glimpse of the process.
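To make activities and traces concrete, here is a minimal sketch in plain Python rather than R, and it does not use the bupaR API. It assumes a hypothetical event log in which each event carries a case identifier, an activity name and a timestamp; the order names and activities are invented for illustration. A trace is then the ordered sequence of activities observed for one case.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case_id, activity, ISO timestamp) rows.
events = [
    ("order-1", "Receive order", "2024-01-01T09:00"),
    ("order-1", "Ship goods", "2024-01-02T10:00"),
    ("order-1", "Send invoice", "2024-01-03T11:00"),
    ("order-2", "Receive order", "2024-01-01T09:30"),
    ("order-2", "Send invoice", "2024-01-01T12:00"),
    ("order-2", "Ship goods", "2024-01-04T08:00"),
    ("order-3", "Receive order", "2024-01-02T14:00"),
    ("order-3", "Ship goods", "2024-01-03T09:00"),
    ("order-3", "Send invoice", "2024-01-05T16:00"),
]

def traces(events):
    """Return, per case, the tuple of activities ordered by timestamp."""
    by_case = defaultdict(list)
    # ISO timestamps in one fixed format sort correctly as strings.
    for case_id, activity, _ in sorted(events, key=lambda e: (e[0], e[2])):
        by_case[case_id].append(activity)
    return {case: tuple(acts) for case, acts in by_case.items()}

case_traces = traces(events)

# How often does each distinct trace occur, and how often each activity?
trace_counts = Counter(case_traces.values())
activity_counts = Counter(activity for _, activity, _ in events)
```

Counting distinct traces gives a first glimpse of the process: in this toy log, two orders follow the same path while the third is invoiced before it ships.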
A process can be looked at from different angles: the control-flow, the performance and the organizational background. In this chapter, you will take a deep dive into each of these perspectives. The control-flow refers to the different ways in which the process can be executed, and thus to how it is structured. For performance, we are interested both in how long things take and in when they take place. Finally, the organizational perspective looks at the actors in the process.
Event data rarely comes in a form that is ready to analyze. You therefore often need a set of tools to get the data into the right shape before you can answer your research question. At the end of this chapter, you will be familiar with three common preprocessing tasks: filtering data, aggregating events and enriching data.
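The three preprocessing tasks can be sketched as follows. This is an illustration in plain Python, not the course's R code, and all function names, activity labels and the `throughput_hours` field are hypothetical. Filtering keeps only cases that contain a given activity, aggregating relabels two detailed activities as one higher-level activity (one simple form of aggregation), and enriching adds a computed attribute to each event.

```python
from datetime import datetime

# Hypothetical event log as a list of dictionaries.
events = [
    {"case": "order-1", "activity": "Receive order", "time": "2024-01-01T09:00"},
    {"case": "order-1", "activity": "Check stock", "time": "2024-01-01T09:30"},
    {"case": "order-1", "activity": "Check credit", "time": "2024-01-01T10:00"},
    {"case": "order-1", "activity": "Ship goods", "time": "2024-01-03T09:00"},
    {"case": "order-2", "activity": "Receive order", "time": "2024-01-02T08:00"},
    {"case": "order-2", "activity": "Check stock", "time": "2024-01-02T08:15"},
]

def filter_cases_with(events, activity):
    """Filtering: keep every event of the cases that contain `activity`."""
    keep = {e["case"] for e in events if e["activity"] == activity}
    return [e for e in events if e["case"] in keep]

def aggregate_activities(events, mapping):
    """Aggregating: relabel detailed activities with a higher-level name."""
    return [{**e, "activity": mapping.get(e["activity"], e["activity"])}
            for e in events]

def enrich_with_throughput_hours(events):
    """Enriching: add each case's first-to-last-event duration in hours."""
    first, last = {}, {}
    for e in events:
        t = datetime.fromisoformat(e["time"])
        case = e["case"]
        first[case] = min(first.get(case, t), t)
        last[case] = max(last.get(case, t), t)
    return [
        {**e, "throughput_hours":
            (last[e["case"]] - first[e["case"]]).total_seconds() / 3600}
        for e in events
    ]

shipped = filter_cases_with(events, "Ship goods")
aggregated = aggregate_activities(
    shipped, {"Check stock": "Check order", "Check credit": "Check order"}
)
enriched = enrich_with_throughput_hours(aggregated)
```

Each step returns a new list and leaves its input untouched, so the steps can be chained in whatever order a research question requires.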
In this final chapter, we will use everything we have learned so far to do an end-to-end analysis of an order-to-cash process. Firstly, we will transform data from various sources into an event log. Secondly, we will take a helicopter view of the process, exploring the dimensions of the data and the different activities, stages and flows in the process. Finally, we will combine preprocessing and analysis tools to formulate an answer to several research questions.
Prerequisites: Working with Data in the Tidyverse
Author of bupaR package