Can we predict machine failures before they happen?
Using machine learning to predict mechanical failure
Image from H.O.Penn
Creator Notes:
- Use light-mode view for optimal reading.
- Plots are interactive, allowing you to explore the data in detail.
Executive Summary
- LightGBM with RobustScaler achieved high accuracy (98.9%) in predicting machine failures.
- Mechanical stress indicators (torque, cutting force, pressure) are the most important factors for predicting failures.
- A single model might be effective for all machines due to consistent failure patterns across them.
- Investigate systematic issues causing failures (e.g., maintenance practices, environment).
- Continuously monitor and evaluate the deployed LightGBM model's performance.
I. Background
Manufacturing downtime costs our aerospace and medical components facility both time and money. With three specialized machines producing different sized components, any unexpected failure disrupts production schedules and impacts delivery deadlines. Our current reactive maintenance approach means we fix problems after they occur — leading to longer downtimes and missed deadlines.
II. Objectives
This project aims to predict machine failures before they happen, enabling our maintenance team to plan repairs proactively. We will develop a predictive model using a year's worth of operational data to identify early warning signs of potential failures. Our key goals are identifying the most reliable failure indicators and determining whether machine-specific models perform better than a general approach.
III. The data
Our dataset contains daily operational measurements from three production machines spanning one year. Each record includes 13 sensor measurements — from hydraulic pressure to spindle vibration — along with machine identifiers and downtime status. The data comes from critical systems including cooling, hydraulics, and cutting mechanisms. All measurements follow standardized units and are recorded at consistent daily intervals.
| Column Name | Description | Unit | Significance |
|---|---|---|---|
| Date | Daily timestamp of readings | YYYY-MM-DD | Tracks temporal patterns and maintenance history |
| Machine_ID | Unique machine identifier | Text | Enables machine-specific analysis and comparisons |
| Assembly_Line_No | Production line location | Integer | Maps physical layout and workflow dependencies |
| Hydraulic_Pressure | Hydraulic system pressure | bar | Indicates fluid power system health |
| Coolant_Pressure | Cooling system pressure | bar | Monitors heat dissipation efficiency |
| Air_System_Pressure | Pneumatic system pressure | bar | Reflects compressed air system status |
| Coolant_Temperature | Cooling system temperature | Celsius | Tracks thermal management effectiveness |
| Hydraulic_Oil_Temperature | Hydraulic fluid temperature | Celsius | Indicates system stress and oil condition |
| Spindle_Bearing_Temperature | Bearing temperature | Celsius | Monitors critical component health |
| Spindle_Vibration | Spindle oscillation | micrometers | Detects mechanical imbalances |
| Tool_Vibration | Cutting tool movement | micrometers | Indicates tool wear and stability |
| Spindle_Speed | Rotational velocity | RPM | Measures cutting performance |
| Voltage | Electrical input | volts | Monitors power supply stability |
| Torque | Rotational force | Nm | Indicates mechanical load |
| Cutting | Cutting force on the tool | kN | Measures material removal effort |
| Downtime | Operational status | Boolean | Records machine availability |
The company has stored the machine operating data in a single table, available in 'data/machine_downtime.csv'.
# Data manipulation and analysis
import pandas as pd
import numpy as np
# Visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import plotly.figure_factory as ff
from plotly.subplots import make_subplots
# Scikit-learn imports
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler, OneHotEncoder
from sklearn.impute import SimpleImputer, KNNImputer
# Metrics and evaluation
from sklearn.metrics import (
accuracy_score, precision_score, recall_score, f1_score,
roc_curve, precision_recall_curve, auc, roc_auc_score,
confusion_matrix, classification_report
)
# Models
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
# Configuration settings
import warnings
warnings.filterwarnings('ignore')
# Load the dataset and display the first 10 rows
data = pd.read_csv('data/machine_downtime.csv')
display(data.head(10))
# Display dataset information, including data types and non-null counts per column
data.info()
IV. EDA Summary
For a comprehensive understanding of the data, readers are encouraged to review my previous report, *How do machines behave before downtime?*, which provides the detailed exploratory data analysis. The key findings below inform our current modeling approach:
Correlation Analysis:
- Sensor measurements are largely independent of one another (all pairwise correlation coefficients below 0.25; reproduced in the sketch after this list)
- All variables retained for modeling due to their independent predictive potential
- Temporal features include day_of_week and is_weekday indicators
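As a rough illustration (a sketch, not the original report's code), the pairwise correlation check can be reproduced along these lines once the data is loaded:
# Pairwise correlations between the numeric sensor columns
numeric_cols = data.select_dtypes(include='number').columns
corr = data[numeric_cols].corr()
# List any pair exceeding the 0.25 threshold noted above (excluding self-correlations)
high = (corr.abs() > 0.25) & (corr.abs() < 1.0)
print(corr.where(high).stack())
# Interactive heatmap, consistent with the plotly-based figures in this report
fig = px.imshow(corr.round(2), text_auto=True, color_continuous_scale='RdBu_r', zmin=-1, zmax=1)
fig.show()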
Temporal Patterns:
- Peak failure incidents observed in March-April 2022
- Weekday failures occur roughly 3x more frequently than weekend failures (see the sketch after this list)
- Strong correlation between production schedules and machine reliability
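For context, the weekday-versus-weekend comparison can be approximated as follows (a sketch; 'Date' and 'Downtime' are the raw column names from the data dictionary above, before the snake_case cleaning applied in Section V):
# Compare failure rates on weekdays versus weekends
tmp = data.copy()
tmp['Date'] = pd.to_datetime(tmp['Date'])
tmp['is_weekday'] = tmp['Date'].dt.weekday < 5  # Monday=0 ... Sunday=6
failures = tmp[tmp['Downtime'] == 'Machine_Failure']
# Share of records that are failures, per weekday/weekend group
rate = failures['is_weekday'].value_counts() / tmp['is_weekday'].value_counts()
print(rate)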
High-Predictive Variables:
- Hydraulic pressure (bimodal distribution; illustrated in the sketch after this list)
- Tool vibration measurements
- Spindle speed readings
- Cutting force metrics
- Torque values
- Coolant pressure levels
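To illustrate why hydraulic pressure stands out, a minimal plotting sketch (it assumes 'Hydraulic_Pressure(bar)' is the raw column name; adjust if the CSV differs):
# Overlay the hydraulic pressure distribution by downtime status
fig = px.histogram(
    data, x='Hydraulic_Pressure(bar)', color='Downtime',
    barmode='overlay', nbins=50, opacity=0.6,
    title='Hydraulic pressure by downtime status'
)
fig.show()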
Preprocessing Strategy:
- Missing Value Treatment (see the sketch after this list):
  - Mean imputation for normally distributed variables
  - KNN imputation for bimodal distributions
- Outlier Handling:
  - Retain outliers, as they represent valid operational states
  - Evaluate multiple scaling methods
  - RobustScaler anticipated as the optimal choice given the presence of outliers
- Class Balance:
  - No class imbalance treatment needed for the 'downtime' variable
- Feature Engineering:
  - Initial approach without feature engineering, given the low multicollinearity
  - Will reassess based on baseline model performance
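To make this plan concrete, here is a minimal sketch of the intended imputation-and-scaling pipeline (the column names are illustrative placeholders, not the dataset's actual names; in practice the pipeline is fitted on the training split created in Section V):
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
# Illustrative column groups -- replace with the real snake_case names
normal_cols = ['coolant_pressure_bar', 'torque_nm']   # roughly normal -> mean imputation
bimodal_cols = ['hydraulic_pressure_bar']             # bimodal -> KNN imputation
preprocess = ColumnTransformer([
    ('normal', Pipeline([('impute', SimpleImputer(strategy='mean')),
                         ('scale', RobustScaler())]), normal_cols),
    ('bimodal', Pipeline([('impute', KNNImputer(n_neighbors=5)),
                          ('scale', RobustScaler())]), bimodal_cols),
])
# RobustScaler centers on the median and scales by the IQR, so the retained
# outliers do not distort the scaling the way they would with StandardScaler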
Modeling Approach:
- Baseline: Logistic Regression
- Advanced models:
  - Gradient Boosting (primary candidate for capturing non-linear relationships)
  - Random Forest (alternative for handling outliers and feature interactions)
This structured approach will guide our model development and evaluation process.
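As a preview of the evaluation loop (a sketch only; it assumes X_train and y_train from the split in Section V, with features already numerically encoded and preprocessed), the candidates can be compared with cross-validation:
# Cross-validated comparison of the candidate models on the training split
candidates = {
    'Logistic Regression (baseline)': LogisticRegression(max_iter=1000),
    'Random Forest': RandomForestClassifier(random_state=1),
    'Gradient Boosting (LightGBM)': LGBMClassifier(random_state=1),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring='f1')
    print(f'{name}: F1 = {scores.mean():.3f} (+/- {scores.std():.3f})')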
V. Data Preparation
1. Initial Cleaning and Feature Extraction
Before preprocessing, the dataset was prepared by standardizing column names to snake_case for consistency, expanding the date into temporal features (e.g., day of week) that may aid the model, and mapping the target variable to binary values.
# Replace parentheses with underscores and convert to lowercase
data.columns = [col.replace('(', '_').replace(')', '').lower() for col in data.columns]
# Convert to datetime to use .dt accessor
data['date'] = pd.to_datetime(data['date'])
# Extract day of the week
data['day_of_week'] = data['date'].dt.strftime('%A')
# Create boolean column to check if is weekday
data['is_weekday'] = data['date'].dt.weekday < 5
# Define ordered categories for days
day_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
# Convert day to ordered categorical variable
data['day_of_week'] = pd.Categorical(data['day_of_week'], categories=day_order, ordered=True)
# Convert the target variable to binary
data['downtime'] = data['downtime'].map({
'Machine_Failure': 1,
'No_Machine_Failure': 0})
# Check the encoding
print("Target value counts:")
print(data['downtime'].value_counts())
print("\nTarget unique values:")
print(data['downtime'].unique())
As observed, our target class distribution is balanced.
2. Train-Test Split
With only 2,500 records, we opt for a simple train-test split: 80% (2,000 samples) for training and 20% (500 samples) for testing. Carving out a separate validation set would shrink an already small training set, so model selection will instead rely on cross-validation within the training data.
TARGET = 'downtime'
TEST_SIZE = 0.2
RANDOM_STATE = 1
# Separate features (X) and target (y)
X = data.drop(['date', TARGET], axis=1) # Drop date and target
y = data[TARGET]
# Print dataset overview
print("Dataset Overview:")
print(f"Total samples: {X.shape[0]}")
print(f"Features: {X.shape[1]}")
print("\nTarget distribution:")
print(y.value_counts(normalize=True))
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
X, y,
test_size=TEST_SIZE,
random_state=RANDOM_STATE,
stratify=y
)
print("\nClass Distribution:")
print(f" Training set:\n{pd.Series(y_train).value_counts(normalize=True).to_string()}") #to_string for better formatting
print(f" Test set:\n{pd.Series(y_test).value_counts(normalize=True).to_string()}") #to_string for better formatting