Industrial machine downtime prediction by modelling the most important parameters

Introduction

Industrial machine downtime is a critical issue that significantly impacts manufacturing efficiency and productivity. Downtime refers to periods when machinery is not operational, leading to production delays, increased operational costs, and reduced overall equipment effectiveness (Jasim Aftab Abbasi, 2021). The prediction and prevention of machine downtime have become essential areas of research in industrial engineering and maintenance management (Shahin et al., 2023).

1. Downtime prediction is important

The ability to predict machine downtime can provide numerous benefits, including:

Reduced Maintenance Costs: Predictive maintenance strategies can be implemented to address potential failures before they occur, thereby reducing the need for costly emergency repairs (Kadam et al., 2023).
Increased Equipment Lifespan: By maintaining machinery in optimal condition, the overall lifespan of equipment can be extended (Patil et al., 2023).
Improved Production Planning: Accurate downtime predictions allow for better scheduling and resource allocation, minimizing disruptions to the production process (Advanced Analytics for Industrial Maintenance).
Enhanced Safety: Preventing unexpected machine failures can reduce the risk of accidents and ensure a safer working environment (Jasim Aftab Abbasi, 2021).

2. Machine downtime is influenced by multiple factors

Several parameters are known to influence machine downtime, including:

Hydraulic Pressure: Variations in hydraulic pressure can indicate potential issues in the hydraulic system, which is critical for the operation of many industrial machines (Kadam et al., 2023).
Coolant Pressure and Temperature: Proper coolant pressure and temperature are essential for maintaining the thermal stability of machinery. Deviations from optimal levels can lead to overheating and subsequent downtime (Patil et al., 2023).
Air System Pressure: The air system is often used for pneumatic controls and actuators. Inconsistent air pressure can result in malfunctioning components (Advanced Analytics for Industrial Maintenance).
Spindle and Tool Vibration: Excessive vibration in the spindle or tools can be a sign of mechanical wear or imbalance, which can lead to machine failure (Jasim Aftab Abbasi, 2021).
Spindle Speed and Torque: Monitoring spindle speed and torque can provide insights into the mechanical load and performance of the machine (Kadam et al., 2023).
Voltage and Cutting Force: The supply voltage is critical for electrically driven machinery and can indicate potential electrical issues; cutting force, a mechanical rather than electrical quantity, is monitored alongside it (Patil et al., 2023).

3. Common methods to predict machine downtime

Recent studies have employed various machine learning and statistical techniques to model and predict machine downtime. These methods include:

Regression Analysis: Used to identify relationships between downtime and influencing factors (Jasim Aftab Abbasi, 2021).
Classification Algorithms: Employed to categorize different states of machine health and predict potential failures (Shahin et al., 2023).
Time Series Analysis: Utilized to analyze temporal patterns and trends in machine performance data (Advanced Analytics for Industrial Maintenance).

In this notebook, a detailed analysis will be conducted to model the most important parameters influencing industrial machine downtime. The dataset comprises various operational parameters collected from industrial machines, including hydraulic pressure, coolant pressure, air system pressure, temperatures, vibrations, spindle speed, voltage, torque, and cutting force. The goal is to develop a predictive model that can accurately forecast machine downtime, thereby enabling proactive maintenance and minimizing production disruptions.

By leveraging advanced data analytics and machine learning techniques, this study aims to contribute to the growing body of knowledge on predictive maintenance and industrial machine reliability.

4. Data description

Variable	Meaning
`Date`	The date the reading was taken on.
`Machine ID`	The unique identifier of the machine being read.
`Assembly Line No`	The unique identifier of the assembly line the machine is located on.
`Hydraulic Pressure (bar)`	Pressure measurement at the hydraulic system in bar.
`Coolant Pressure (bar)`	Pressure measurement at the coolant system in bar.
`Air System Pressure (bar)`	Pressure measurement at the air system in bar.
`Coolant Temperature (°C)`	Temperature measurement of the coolant in Celsius.
`Hydraulic Oil Temperature (°C)`	Temperature measurement of the hydraulic oil in Celsius.
`Spindle Bearing Temperature (°C)`	Temperature measurement of the spindle bearing in Celsius.
`Spindle Vibration (µm)`	Vibration measurement of the spindle in micrometers.
`Tool Vibration (µm)`	Vibration measurement of the tool in micrometers.
`Spindle Speed (RPM)`	Rotational speed of the spindle in RPM.
`Voltage (Volt)`	The voltage supplied to the machine in volts.
`Torque (Nm)`	The torque being generated by the machine in Newton meters.
`Cutting (kN)`	The cutting force of the tool in kilonewtons.
`Downtime`	An indicator of whether the machine was down or not on the given day.

Introduction

Downtime prediction is important
Machine downtime is influenced by multiple factors
Common methods to predict machine downtime
Data description

Results

1. Descriptive statistics enable an overview about the data
- 1.1 Summary statistics and normality check allow the choice of statistical tests
- 1.2 Kendall Correlation Analysis reveals no significant correlations between the variables
- 1.3 A Mann-Whitney U Test identifies several significant differences between downtime groups
- 1.4 Principal Component Analysis (PCA) to identify characteristics of downtime groups
2. Machine Failure can be predicted using Machine Learning techniques
- 2.1 Identify the most important predictors for the generalized linear model (glm)
- 2.2 Create glm with previous detected important predictors
- 2.3 Evaluation of model by analysing assumptions, performance and accuracy of the model
3. Exploring the importance of model predictors

Conclusion

Appendix

References

Code

Results

This chapter presents the results of the machine downtime analysis, using both descriptive and inferential statistical techniques. Subchapter 1 gives an overview of the data through summary statistics, correlation analysis, and principal component analysis. Subchapter 2 applies supervised machine learning: a random forest (randomForest) first identifies the most important predictors, which are then used to fit a generalised linear model. The assumptions of that model are then examined. The chapter concludes by comparing the accuracy of model variants that differ with respect to the machine ID.
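
The two-stage workflow outlined above, ranking predictors with a random forest and then fitting a generalised linear model on the top-ranked ones, could look roughly like the following. This is a minimal sketch on simulated data, not the notebook's own code: the data frame `toy`, its column names, and the choice of keeping two predictors are assumptions made for illustration.

```r
library(randomForest)

set.seed(42)

# Simulated stand-in for the machine data (hypothetical columns):
# downtime depends on torque and hydraulic pressure, not on voltage.
n <- 500
toy <- data.frame(
  torque             = rnorm(n),
  hydraulic_pressure = rnorm(n),
  voltage            = rnorm(n)
)
linear_part  <- 1.5 * toy$torque - 1.2 * toy$hydraulic_pressure
toy$downtime <- factor(rbinom(n, 1, plogis(linear_part)))

# Stage 1: rank predictors by permutation importance.
rf  <- randomForest(downtime ~ ., data = toy, importance = TRUE)
imp <- importance(rf, type = 1)  # type = 1: mean decrease in accuracy
top <- rownames(imp)[order(imp[, 1], decreasing = TRUE)][1:2]

# Stage 2: logistic regression (binomial GLM) on the top-ranked predictors.
fit <- glm(reformulate(top, response = "downtime"),
           data = toy, family = binomial)
summary(fit)
```

Selecting predictors and fitting the final model on the same rows, as this sketch does for brevity, tends to give optimistic results; a held-out test set would be needed for an honest accuracy estimate.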

# Install required packages if they are not already installed
if (!require(ggbiplot)) {
  install.packages("ggbiplot")
}
library(ggbiplot)

if (!require(fastDummies)) {
  install.packages("fastDummies")
}
library(fastDummies)

if (!require(datawizard)) {
  install.packages("datawizard")
}
library(datawizard)

if (!require(caret)) {
  install.packages("caret")
}
library(caret)

if (!require(randomForest)) {
  install.packages("randomForest")
}
library(randomForest)

if (!require(sjPlot)) {
  install.packages("sjPlot")
}
library(sjPlot)

if (!require(lme4)) {
  install.packages("lme4")
}
library(lme4)

if (!require(performance)) {
  install.packages("performance")
}
library(performance)

if (!require(hrbrthemes)) {
  install.packages("hrbrthemes")
}
library(hrbrthemes)

if (!require(see)) {
  install.packages("see")
}
library(see)

if (!require(ResourceSelection)) {
  install.packages("ResourceSelection")
}
library(ResourceSelection)

if (!require(car)) {
  install.packages("car")
}
library(car)

if (!require(DHARMa)) {
  install.packages("DHARMa")
}
library(DHARMa)

if (!require(effects)) {
  install.packages("effects")
}
library(effects)

if (!require(ggcorrplot)) {
  install.packages("ggcorrplot")
}
library(ggcorrplot)

if (!require(data.table)) {
  install.packages("data.table")
}
library(data.table)

if (!require(modelsummary)) {
  install.packages("modelsummary")
}
library(modelsummary)

if (!require(cowplot)) {
  install.packages("cowplot")
}
library(cowplot)

if (!require(psych)) {
  install.packages("psych")
}
library(psych)

if (!require(factoextra)) {
  install.packages("factoextra")
}
library(factoextra)

library(tidyverse)
library(gridExtra)
library(pROC)


# Load the dataset and clean column names
data <- read.csv('data/machine_downtime.csv') |> 
  janitor::clean_names() |> 
  # Convert date column to numeric format
  mutate(date = as.numeric(lubridate::dmy(date))) |> 
  # Convert character columns to factors
  mutate_if(is.character, factor) 

# Standardize numeric columns and recode downtime column
machine_data <- data |> 
  # Standardize numeric columns
  mutate_if(is.numeric, datawizard::standardize) |> 
  # Recode downtime column to binary factor
  mutate(downtime = factor(ifelse(downtime == "Machine_Failure", 1, 0)))
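
For reference, standardising a numeric column means subtracting its mean and dividing by its standard deviation. The sketch below reproduces that transformation with base R on a hypothetical vector `torque` (illustrative values only); it assumes `datawizard::standardize()` is used with its default, non-robust settings, as in the cell above.

```r
# Hypothetical torque readings in Nm (illustrative values only)
torque <- c(20, 25, 30, 35, 40)

# z-score: centre on the mean, scale by the sample standard deviation
z_manual <- (torque - mean(torque)) / sd(torque)

# Base R's scale() applies the same transformation to each column of a matrix
z_scale <- as.numeric(scale(torque))

all.equal(z_manual, z_scale)  # TRUE
round(mean(z_manual), 10)     # 0
sd(z_manual)                  # 1
```

Because the date column was converted to numeric before this step, `mutate_if(is.numeric, ...)` standardises it together with the sensor readings.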

1. Descriptive statistics enable an overview about the data

This chapter describes the dataset using descriptive statistics. The analysis covers summary statistics, correlation analysis, and a principal component analysis.

1.1 Summary statistics and normality check allow the choice of statistical tests

The summary statistics comprise N, mean, standard deviation, median, trimmed mean, median absolute deviation, min, max, range, skewness, kurtosis, and standard error of the unscaled values, reported separately for each downtime group (Machine Failure or No Machine Failure; see Table 1).

Some outstanding observations in this table include:

The Hydraulic pressure (bar) variable has a mean of 150.23 bar for Machine Failure and 145.67 bar for No Machine Failure, showing a noticeable difference in hydraulic pressure between the two groups.
The Coolant temperature (°C) variable exhibits a median of 75.5 °C for Machine Failure and 70.2 °C for No Machine Failure, suggesting that higher coolant temperatures are associated with machine failures.
The Spindle vibration (µm) variable has a maximum value of 0.98 µm for Machine Failure and 0.65 µm for No Machine Failure, indicating higher spindle vibrations in the failure group.
The Torque (Nm) variable ranges from 50 to 200 Nm for Machine Failure, compared to 45 to 180 Nm for No Machine Failure, highlighting a broader range of torque values in the failure group.

These observations suggest that certain variables, such as hydraulic pressure, coolant temperature, spindle vibration, and torque, may have notable differences between the Machine Failure and No Machine Failure groups.

Table 1: Summary statistics showing N, mean, standard deviation, median, trimmed mean, median absolute deviation, min, max, range, skewness, kurtosis, and standard error of the unscaled values by downtime.
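
A table of this shape can be produced with `psych::describeBy()`, which reports the statistics listed above for each group. The sketch below runs on simulated values rather than the notebook's data; the data frame `toy`, its columns, and the chosen means and standard deviations are hypothetical.

```r
library(psych)

set.seed(1)

# Simulated stand-in for two sensor columns and the downtime label
toy <- data.frame(
  hydraulic_pressure = rnorm(100, mean = 100, sd = 30),
  torque             = rnorm(100, mean = 25, sd = 6),
  downtime           = rep(c("Machine_Failure", "No_Machine_Failure"), each = 50)
)

# mat = TRUE returns one data frame (one row per variable and group)
# instead of a list with one table per group.
stats_by_group <- describeBy(
  toy[, c("hydraulic_pressure", "torque")],
  group = toy$downtime,
  mat   = TRUE
)

stats_by_group[, c("group1", "n", "mean", "sd", "median", "trimmed",
                   "mad", "min", "max", "range", "skew", "kurtosis", "se")]
```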


A Shapiro-Wilk test was performed to assess whether each variable is normally distributed. The Shapiro-Wilk test statistic $W$ is calculated as follows:

$$W = \frac{\left(\sum_{i=1}^{n} a_i x_{(i)}\right)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

where:

$x_{(i)}$ is the $i$-th order statistic (i.e., the $i$-th smallest number in the sample),
$\bar{x}$ is the sample mean,
$a_i$ are constants generated from the means, variances, and covariances of the order statistics of a sample of size $n$ from a normal distribution.

The null hypothesis for the Shapiro-Wilk test is that the data is normally distributed. A small $p$-value (typically less than 0.05) indicates that the null hypothesis can be rejected, suggesting that the data is not normally distributed.
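
In R, the test is available as `shapiro.test()` in the base `stats` package. The following sketch applies it column by column to simulated data; the data frame `toy` and its two columns are hypothetical stand-ins, one drawn from a normal and one from a skewed (exponential) distribution.

```r
set.seed(123)

# Simulated stand-ins: one roughly normal and one strongly skewed variable
toy <- data.frame(
  spindle_speed  = rnorm(200, mean = 20000, sd = 3000),
  tool_vibration = rexp(200, rate = 1 / 25)
)

# shapiro.test() accepts between 3 and 5000 non-missing values per call
shapiro_results <- sapply(toy, function(x) {
  res <- shapiro.test(x)
  c(W = unname(res$statistic), p_value = res$p.value)
})

# One row per variable, with the W statistic and its p-value
round(t(shapiro_results), 4)
```

The skewed column yields a very small p-value, so normality is rejected for it; for the normally distributed column the p-value depends on the particular sample drawn.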

While many of the p-values were above the significance level of 0.05, some were highly significant (p < 0.001), indicating that the corresponding variables are not normally distributed. Non-parametric methods were therefore used in the subsequent analyses: the Mann-Whitney U test in place of the unpaired t-test, and the Kendall correlation in place of the Pearson correlation.
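
Both non-parametric methods are part of base R. The sketch below shows the relevant calls on simulated data; the data frame `toy`, its columns, and the size of the group shift are assumptions chosen for illustration only.

```r
set.seed(7)

# Simulated stand-in: torque is shifted between the two downtime groups
toy <- data.frame(
  downtime = factor(rep(c(0, 1), each = 100)),
  torque   = c(rnorm(100, mean = 27), rnorm(100, mean = 23)),
  voltage  = rnorm(200, mean = 350, sd = 40)
)

# Mann-Whitney U test (called the Wilcoxon rank-sum test in R)
mw <- wilcox.test(torque ~ downtime, data = toy)
mw$p.value

# Kendall rank correlation between two numeric variables
kt <- cor.test(toy$torque, toy$voltage, method = "kendall")
kt$estimate  # Kendall's tau
kt$p.value
```

Neither test assumes normally distributed data: the Mann-Whitney U test compares the ranks of the two groups, and Kendall's tau measures how consistently two variables are ordered relative to each other.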
